Session S1 - Apprentissage et enchères

Organisation : Vianney Perchet

An introduction to data-driven auction theory (pdf)

Orateur : Noureddine EL KAROUI, UC Berkeley and Criteo Research

I will review classic auction formats and associated theory. I will focus in particular on sealed-bid first price and second price auctions, as well as the Myerson auction which is optimal from the seller standpoint. The talk will concentrate on the classic economic standpoint, though time permitting I may introduce recent results linking auction theory and learning theory.

Optimization of a SSP's Header Bidding Strategy using Thompson Sampling (pdf)

Orateur : Nicolas GRISLAIN, AlephD, Paris

Over the last decade, digital media (web or app publishers) generalized the use of real time ad auctions to sell their ad spaces. Multiple auction platforms, also called Supply-Side Platforms (SSP), were created. Because of this multiplicity, publishers started to create competition between SSPs. In this setting, there are two successive auctions: a second price auction in each SSP and a secondary, first price auction, called header bidding auction, between SSPs. In this paper, we consider an SSP competing with other SSPs for ad spaces. The SSP acts as an intermediary between an advertiser wanting to buy ad spaces and a web publisher wanting to sell its ad spaces, and needs to define a bidding strategy to be able to deliver to the advertisers as many ads as possible while spending as little as possible. The revenue optimization of this SSP can be written as a contextual bandit problem, where the context consists of the information available about the ad opportunity, such as properties of the internet user or of the ad placement. Using classical multi-armed bandit strategies (such as the original versions of UCB and EXP3) is inefficient in this setting and yields a low convergence speed, as the arms are very correlated. In this paper we design and experiment a version of the Thompson Sampling algorithm that easily takes this correlation into account. We combine this bayesian algorithm with a particle filter, which permits to handle non-stationarity by sequentially estimating the distribution of the highest bid to beat in order to win an auction. We apply this methodology on two real auction datasets, and show that it significantly outperforms more classical approaches. The strategy defined in this paper is being developed to be de- ployed on thousands of publishers worldwide. Joint work with G. Jauvion, P. Dkenge Sielenou, A. Garivier and S. Gerchinovitz

Dynamic Pricing with Finitely Many Unknown Valuations (pdf)

Orateur : Tommaso CESARI, Université degli Studi di Milano.

We consider revenue maximization in stochastic dynamic pricing where the distribution of buyers' private values is supported on an unknown set of points in $[0,1]$ of unknown cardinality $K$. This choice of setting is well-suited to model scenarios in which buyers are grouped in an unknown number of latent types, characterized by their private values for the good on sale. As a concrete example, consider the buyer's game in repeated first price-auctions. This is formally equivalent to the seller's game in dynamic pricing, where valuations correspond to the highest bids among other bidders. In online advertising, other bidders are Demand-Side Platforms that bid according to their own user segmentation. Then, their higher bids will be values corresponding to the union of these unknown segments. We show that no algorithm can achieve a regret significantly better than $Θ(√(KT))$, where $T$ is the time horizon, and we present an algorithm achieving regret $O(√(KT))$ up to a constant independent of the distribution of buyers' valuations. Our setting is strictly harder than $K$-armed stochastic bandits: we show that it is impossible to prove regret bounds of order $ T$ even when the gap $Δ$ between the revenue of the optimal valuation and that of the second-best valuation is constant. We then investigate distribution-dependent bounds that rely on additional information on the demand curve. We prove an upper bound of order $(1/Δ + ( T)/γ^2)K T$, where $γ$ is a known lower bound on the smallest probability of a valuation (i.e., the smallest drop in the demand curve). As $(K/Δ) T$ is the regret in $K$-armed stochastic bandits, this shows that the price for identifying each one of the $K$ valuations is at most $( T)( T)/γ^2$, which corresponds (up to $ $ factors) to the known upper bound on noisy binary search. This is a significant improvement on previously known regret bounds for discontinuous demand curves, that are at best of order $(K^12/γ^8) √(T)$. We also prove that when $K=2$, a regret bound of order $(T)/Δ$ (up to $$ factors) can be achieved with no additional information, which is the same bound that one would have if all the valuations were known in advance. Finally, we show a $O(√(T))$ upper bound on the regret for the setting in which the buyers' decisions are nonstochastic, and the regret is measured with respect to the best between two fixed valuations one of which is known to the seller.

Second price auctions are not truthful: an extensive study of optimal bidding strategies (pdf)

Orateur : Thomas NEDELEC, Criteo-ENS Paris Saclay

Truthful auctions play a key role in the economics of the Internet industry. While classical theory suggests "optimal" bidding strategies and associated optimal auctions for sellers, we show that these results become misleading when sellers learn from the buyers' data to optimize their auctions (e.g. computing monopoly price based on previous bids of the buyer). Using a functional analytic rather than game theoretic approach to our problems, we show this by proposing several explicit methods to improve the performance of a bidder in classical and widely used auctions such as second price auctions.