
Asymmetric AdaBoost for High Dimensional Maximum Score Regression

Prof. Tae-hwy Lee, Department of Economics, UCR

Adaptive Boosting (AdaBoost), introduced by Freund and Schapire (1996), has proved effective for high-dimensional binary classification and binary prediction problems. Friedman, Hastie, and Tibshirani (2000) show that AdaBoost builds an additive logistic regression model by minimizing the ‘exponential loss’. We show that the exponential loss in AdaBoost is equivalent (up to scale) to the symmetric maximum score loss (Manski 1975, 1985) and also to the symmetric least squares loss for binary prediction. Therefore the standard AdaBoost using the exponential loss is a symmetric algorithm and solves the binary median regression. In this paper, we introduce Asymmetric AdaBoost, which produces an additive logistic regression model by minimizing the new ‘asymmetric exponential loss’ that we introduce in this paper. Asymmetric AdaBoost can handle the asymmetric maximum score problem (Granger and Pesaran 2000, Lee and Yang 2006, Lahiri and Yang 2012, Elliott and Lieli 2013) and therefore solves the binary quantile regression. We also show that our asymmetric exponential loss is equivalent (up to scale) to the asymmetric least squares loss (Newey and Powell 1987) for binary classification/prediction. Extending the result of Bartlett and Traskin (2007), we show that the Asymmetric AdaBoost algorithm is consistent in the sense that the risk of the classifier it produces approaches the Bayes risk. Monte Carlo experiments show that Asymmetric AdaBoost performs well relative to the lasso-regularized high-dimensional logistic regression in various settings, especially when p ≫ n and in the tails. We apply Asymmetric AdaBoost to predict US business cycle turning points and the direction of stock price changes.
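To make the exponential-loss mechanics concrete, the sketch below implements a toy AdaBoost with decision stumps in which an asymmetry parameter `tau` tilts the initial observation weights toward one class, in the spirit of an asymmetric exponential loss for quantile-style binary prediction. The abstract does not specify the paper's actual algorithm, so this is an illustrative assumption, not the authors' method; the function and parameter names (`asymmetric_adaboost`, `tau`) are hypothetical.

```python
import numpy as np

def stump_predict(X, feat, thresh, polarity):
    # Decision stump: classify +1/-1 by thresholding a single feature.
    return polarity * np.where(X[:, feat] > thresh, 1.0, -1.0)

def fit_stump(X, y, w):
    # Exhaustive search for the stump with the smallest weighted error.
    best, best_err = None, np.inf
    for feat in range(X.shape[1]):
        for thresh in np.unique(X[:, feat]):
            for polarity in (1.0, -1.0):
                pred = stump_predict(X, feat, thresh, polarity)
                err = w[pred != y].sum()
                if err < best_err:
                    best_err, best = err, (feat, thresh, polarity)
    return best, best_err

def asymmetric_adaboost(X, y, tau=0.5, n_rounds=20):
    """Toy AdaBoost where tau weights errors on y = +1 and (1 - tau)
    weights errors on y = -1, mimicking an asymmetric exponential loss.
    tau = 0.5 recovers the usual symmetric AdaBoost weighting."""
    n = len(y)
    w = np.where(y > 0, tau, 1.0 - tau).astype(float)
    w /= w.sum()
    ensemble = []
    for _ in range(n_rounds):
        (feat, thresh, pol), err = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1.0 - 1e-10)   # guard against log(0)
        alpha = 0.5 * np.log((1.0 - err) / err)   # stump's vote weight
        pred = stump_predict(X, feat, thresh, pol)
        w *= np.exp(-alpha * y * pred)            # exponential-loss update
        w /= w.sum()
        ensemble.append((alpha, feat, thresh, pol))
    return ensemble

def predict(ensemble, X):
    # Sign of the additive (weighted-vote) model built by boosting.
    score = sum(a * stump_predict(X, f, t, p) for a, f, t, p in ensemble)
    return np.sign(score)
```

Setting `tau` above 0.5 makes missed `y = +1` cases costlier than false alarms, which is the flavor of asymmetry needed when, say, failing to call a recession is worse than a false signal.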

