Three Essays on Learning Algorithms in Business Studies
Abstract
In this dissertation, I present new learning algorithms that are applicable to business studies. First, I develop Synthetic Control Boosting with Poisson (SCB-P) and Synthetic Control Boosting with Negative Binomial (SCB-NB) for causal inference with count outcome variables. Because information technology is considered a key driver of a firm’s innovation, it is important to investigate the impact of CIO turnover on a firm’s innovativeness and the spillover effect of CIO turnover on competitors’ innovativeness. To this end, I argue that the proposed methods are better suited to count outcomes such as patent stock than the Synthetic Control Method (SCM), a traditional method for causal inference. I mathematically prove the consistency of the proposed methods and demonstrate their performance using a Monte Carlo simulation. Applying the proposed methods, I find that firms produce more patents when their competitors experience CIO turnover than they would have if the CIOs at competing firms had not been dismissed.

Second, I introduce two novel approaches, the Zero-Inflated Probit Boost (ZIPBoost) and Zero-Inflated Logit Boost (ZILBoost) methods, for learning from imbalanced data based on a two-regime process: regime 0, which generates excess zeros (the majority class), and regime 1, which contributes to generating an outcome of one (the minority class). I demonstrate the performance of the proposed methods using a Monte Carlo simulation and an application to actual data. The results from both the simulation study and the application show that the proposed methods provide better prediction accuracy than other learning algorithms.

Lastly, I propose a novel approach for modeling in the presence of unknown uncertainty (UU), which refers to unrecognized outcomes of which an organization is unaware. I argue that learning UU requires relaxing two assumptions of the conventional statistical model: 1) a time-invariant data generating process and 2) a full conditional density of outcomes given observations. The Monte Carlo simulations and empirical results demonstrate that the proposed model performs well in terms of statistical power, in-sample error, and out-of-sample error. Unlike existing methods, the proposed method can correctly estimate parameters that have changed under UU.
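
To make the two-regime process behind ZIPBoost and ZILBoost concrete, the following is a minimal sketch of how such imbalanced binary data might be simulated. It is not the boosting procedure itself. The probit links, the covariates `z` and `x`, the coefficient vectors `gamma` and `beta`, and the function name `simulate_two_regime` are all illustrative assumptions that do not come from the dissertation.

```python
import math
import numpy as np


def probit(v):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(v / math.sqrt(2.0)))


def simulate_two_regime(n, gamma, beta, seed=0):
    """Simulate binary outcomes from an assumed two-regime process.

    Regime 0 always yields y = 0 (the excess zeros).
    Regime 1 yields y = 1 with a covariate-dependent probability.
    All names and link functions here are hypothetical choices.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)  # assumed covariate driving regime membership
    x = rng.normal(size=n)  # assumed covariate driving the outcome in regime 1

    p_regime1 = probit(gamma[0] + gamma[1] * z)
    p_one_given_regime1 = probit(beta[0] + beta[1] * x)

    in_regime1 = rng.random(n) < p_regime1
    success = rng.random(n) < p_one_given_regime1
    y = (in_regime1 & success).astype(int)

    # Marginal probability of observing a one under this assumed process
    p_one = p_regime1 * p_one_given_regime1
    return y, p_one, in_regime1


y, p_one, in_regime1 = simulate_two_regime(
    20000, gamma=(-1.0, 1.0), beta=(-0.5, 1.0)
)
print("share of ones:", y.mean())
print("share of observations in regime 1:", in_regime1.mean())
```

Under these assumptions, a zero can arise either from regime 0 or from an unsuccessful draw in regime 1, which is why the zeros are "inflated" relative to a single-regime binary model.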