Time Bidding Auctions
Total Page:16
File Type:pdf, Size:1020Kb
Bid Shading In First-Price Real- Time Bidding Auctions Tuomo Tilli Master’s Thesis Master of Engineering - Big Data Analytics Förnamn Efternamn 2019 MASTER’S THESIS Arcada Degree Programme: Master of Engineering - Big Data Analytics Identification number: 7253 Author: Tuomo Tilli Title: Bid Shading In First-Price Real-Time Bidding Auctions Supervisor (Arcada): Leonardo Espinosa Leal Commissioned by: ReadPeak Oy Abstract: Online advertisements can be bought through a mechanism called real-time bidding (RTB). In RTB the ads are auctioned in real time on every page load. The ad auctions can be second-price or first-price auctions. In second-price auctions the one with the highest bid wins the auction, but they only pay the amount of the second highest bid. In this paper we focus on first-price auctions, where the buyer pays the amount that they bid. The buyer should bid more than others to win the impression, but only as little amount more as possible and at maximum what they consider the impression to be worth. This research will evaluate how multi-armed bandit strategies will work in optimizing the bid size in ReadPeak’s first-price real-time bidding environments. ReadPeak is a demand-side platform (DSP) which buys inventory through ad exchanges. We analyze seven multi- armed bandit algorithms on offline data from the ReadPeak platform. Three algorithms are tested in ReadPeak’s production environment. We discover that the multi-armed bandit algorithms reduce the bidding costs considerably compared to the baseline. This has potential to bring significant savings for the advertiser. More research is required to get a decisive result on which algorithm performs the best in the production environment. Keywords: bid shading, bid optimization, multi-armed bandits, ReadPeak Number of pages: 53 Language: English Date of acceptance: CONTENTS 1 Introduction ............................................................................................................ 9 1.1 Background ................................................................................................................... 9 1.2 Motivation and Aim of the Study ................................................................................. 12 1.3 Data and Methods ....................................................................................................... 12 1.4 Definitions ................................................................................................................... 13 1.5 Structure of the Thesis ................................................................................................ 14 2 Related Work ........................................................................................................ 15 3 Research Methodology ........................................................................................ 19 3.1 Introduction ................................................................................................................. 19 3.2 Methods ...................................................................................................................... 19 3.2.1 The Epsilon-Greedy Algorithm ............................................................................ 21 3.2.2 The Upper Confidence Bound Algorithm (UCB) ................................................. 21 3.2.3 The Sliding Window UCB Algorithm .................................................................... 22 3.2.4 The Exponentially Decaying UCB Algorithm ....................................................... 23 3.2.5 The Exponentially Decaying Sliding Window UCB Algorithm ............................. 24 3.2.6 The Thompson Sampling Algorithm .................................................................... 24 3.2.7 The Dynamic Thompson Sampling Algorithm ..................................................... 25 3.3 Data ............................................................................................................................ 26 3.3.1 Ethical Considerations ........................................................................................ 28 3.4 Research Design ........................................................................................................ 28 4 Results .................................................................................................................. 29 4.1 Simulated Auctions ..................................................................................................... 29 4.1.1 The Epsilon-Greedy Algorithm ............................................................................ 30 4.1.2 The Sliding Window UCB Algorithm .................................................................... 32 4.1.3 The Exponentially Decaying UCB Algorithm ....................................................... 34 4.1.4 The Exponentially Decaying Sliding Window UCB Algorithm ............................. 36 4.1.5 The Thompson Sampling Algorithm .................................................................... 38 4.1.6 The Dynamic Thompson Sampling Algorithm ..................................................... 40 4.1.7 Comparison ......................................................................................................... 42 4.2 Run Times ................................................................................................................... 45 4.3 Production Results ...................................................................................................... 45 5 Conclusion ............................................................................................................ 48 5.1 Summary ..................................................................................................................... 48 5.2 Discussion ................................................................................................................... 48 5.3 Future Work ................................................................................................................ 49 References ................................................................................................................... 50 Figures Figure 1. An illustration of the online advertising environment. .................................... 11 Figure 2. Description of how the multi-armed bandit algorithms are used in generating the bid amount. ................................................................................................................ 20 Figure 3 The winning prices and the maximum bids in dataset 1. .................................. 27 Figure 4. The winning prices and the maximum bids in dataset 2. ................................. 28 Figure 5. The cumulative average cost per impression for the E-Greedy algorithms with different values of epsilon on dataset 1. .......................................................................... 31 Figure 6. The cumulative average cost per impression for the E-Greedy algorithms with different values of epsilon on dataset 2. .......................................................................... 31 Figure 7. The cumulative average cost per impression for the SW-UCB algorithms with different window sizes on dataset 1. ............................................................................... 33 Figure 8. The cumulative average cost per impression for the SW-UCB algorithms with different window sizes on dataset 2. ............................................................................... 33 Figure 9. The cumulative average cost per impression for the ED-UCB algorithms with different decay rates on dataset 1. ................................................................................... 35 Figure 10. The cumulative average cost per impression for the ED-UCB algorithms with different decay rates on dataset 2. ................................................................................... 35 Figure 11. The cumulative average cost per impression for the EDSW-UCB algorithms with decay rate of 0.99 and different window sizes on dataset 1. ................................... 37 Figure 12. The cumulative average cost per impression for the EDSW-UCB algorithms with decay rate of 0.99 and different window sizes on dataset 2. ................................... 37 Figure 13. Thompson Sampling cumulative average cost per impression for the different target win rates on dataset 1. ........................................................................................... 39 Figure 14. Thompson Sampling cumulative average cost per impression for the different target win rates on dataset 2. ........................................................................................... 40 Figure 15. The cumulative average cost per impression for the D-TS algorithms with target win rate of 0.9 and different threshold values on dataset 1. .................................. 41 Figure 16. The cumulative average cost per impression for the D-TS algorithms with target win rate of 0.9 and different threshold values on dataset 2. .................................. 42 Figure 17. The cumulative average cost per impression for the best versions of each algorithm on dataset 1. The EDSW-UCB algorithm overlaps quite closely the ED-UCB algorithm, which makes it hard to see the ED-UCB algorithm line. .............................. 43 Figure 18. The cumulative average cost per impression for the best versions of each algorithm on dataset 2. ...................................................................................................