Model Ensemble for Click Prediction in Bing Search Ads

Xiaoliang Ling, Microsoft Bing, No. 5 Dan Ling Street, Beijing, China
Weiwei Deng, Microsoft Bing, No. 5 Dan Ling Street, Beijing, China
Chen Gu, Microsoft Bing, No. 5 Dan Ling Street, Beijing, China
Hucheng Zhou∗, Microsoft Research, No. 5 Dan Ling Street, Beijing, China
Cui Li, Microsoft Research, No. 5 Dan Ling Street, Beijing, China
Feng Sun, Microsoft Bing, No. 5 Dan Ling Street, Beijing, China

ABSTRACT

Accurate estimation of the click-through rate (CTR) in sponsored ads significantly impacts the user search experience and businesses’ revenue; even a 0.1% accuracy improvement would yield additional earnings in the hundreds of millions of dollars. CTR prediction is generally formulated as a supervised classification problem. In this paper, we share our experience and lessons learned on model ensemble design, along with our innovation. Specifically, we present 8 ensemble methods and evaluate them on our production data. Boosting neural networks with gradient boosting decision trees turns out to be the best.

Google [21], Facebook [14] and Yahoo! [3]. Recently, factorization machines (FMs) [24, 5, 18, 17], gradient boosting decision trees (GBDTs) [25] and deep neural networks (DNNs) [29] have also been evaluated and gradually adopted in industry.

A single model would lead to suboptimal accuracy, and the above-mentioned models all have different advantages and disadvantages. They are usually ensembled together in an industry setting (or even in machine learning competitions like Kaggle [15]) to achieve better prediction accuracy. For instance, apps recommen-