"A Dynamic Bidding Strategy Based on Model-Free Reinforcement Learning in Display Advertising"

Real-time bidding (RTB) is one of the most striking advances in online advertising, where the websites can sell each ad impression through a public auction, and the advertisers can participate in bidding the impression based on its estimated value. In RTB, the bidding strategy is an essential component for advertisers to maximize their revenues (e.g., clicks and conversions). However, most existing bidding strategies may not work well when the RTB environment changes dramatically between the historical and the new ad delivery periods since they regard the bidding decision as a static optimization problem and derive the bidding function only based on historical data. Thus, the latest research suggests using the reinforcement learning (RL) framework to learn the optimal bidding strategy suitable for the highly dynamic RTB environment. In this paper, we focus on using model-free reinforcement learning to optimize the bidding strategy. Specifically, we divide an ad delivery period into several time slots. The bidding agent decides each impression’s bidding price depending on its estimated value and the bidding factor of its arriving time slot. Therefore, the bidding strategy is simplified to solve each time slot’s optimal bidding factor, which can adapt dynamically to the RTB environment. We exploit the Twin Delayed Deep Deterministic policy gradient (TD3) algorithm to learn each time slot’s optimal bidding factor. Finally, the empirical study on a public dataset demonstrates the superior performance and high efficiency of the proposed bidding strategy compared with other state-of-the-art baselines.
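
The slot-based bidding rule described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact bidding function, so the multiplicative form (bid = slot factor × estimated value), the equal-width slots, and all names here (`Impression`, `slot_index`, `bid_price`, the example factor values) are assumptions made for illustration only. In the paper the per-slot factors are learned with TD3; here they are simply hard-coded.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Impression:
    arrival_second: int     # seconds since the start of the delivery period
    estimated_value: float  # e.g. predicted value of winning the impression


def slot_index(arrival_second: int, period_seconds: int, num_slots: int) -> int:
    """Map an arrival time to its time slot (equal-width slots are assumed)."""
    slot_len = period_seconds / num_slots
    # Clamp so an arrival exactly at the period end falls in the last slot.
    return min(int(arrival_second / slot_len), num_slots - 1)


def bid_price(imp: Impression, factors: Sequence[float],
              period_seconds: int = 86_400) -> float:
    """Assumed bidding rule: scale the estimated value by the slot's factor."""
    slot = slot_index(imp.arrival_second, period_seconds, len(factors))
    return factors[slot] * imp.estimated_value


# Hypothetical factors for four slots; a learned policy would supply these.
factors = [0.8, 1.0, 1.5, 1.2]
imp = Impression(arrival_second=50_000, estimated_value=2.0)
print(bid_price(imp, factors))  # the impression falls in slot 2
```

Under this simplification, adapting to a changing auction environment only requires updating one number per time slot rather than re-deriving a full bidding function.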

Keywords: bidding strategy; time; reinforcement learning; bidding

Journal Title: IEEE Access
Year Published: 2020
