Statistics/Data Mining Applications: Value of Learning

Goal:

1. To solve the problem of efficiently allocating the sponsored links and simultaneously estimating the CTR.

2. The task is to strike a balance between exploring, i.e., showing a link to get a better estimate of its CTR, and exploiting, i.e., showing links that have the best performance, according to our current estimates of the CTRs.

3. In the mechanism, an advertiser (buyer) has an incentive to increase his bid by some amount, which we call the value of learning.

Model and Notations:

1. We consider a setting where n bidders (or candidate sponsored links) are competing to be placed in one of the m slots.

2. Assume a separable model for CTR, i.e., if the link of advertiser i is displayed in position j, it will be clicked on with probability γ_jλ_i. Here γ_jis associated with each slot j=1,…,m (called the position bias of slot j), and λ_i is determined by each advertiser (called the clickability of this advertiser).

3. Assume that the slots are numbered in decreasing order of their position bias, i.e., γ₁≥γ₂≥…γ_m. The exact clickability of advertisers are not known, and the estimate of the clickability of advertiser i is denoted by λ_i^estimate.

4. b_i: the bid of the i-th buyer

5. p_i: the price of the i-th buyer pays

6. v_i: the value of the i-th buyer per click

Randomization Step:

1. In a repeated second-price auction without exploration, when there is uncertainty about the clickability of the advertiser, it is no longer the buyer’s interest to bid his value per click. Specifically, the buyer has the incentive to increase his bid in order to induce the mechanism to explore him.

2. Randomization: assume the clickability λ_iis distributed according to a prior distribution D_λ. Each time before the ranking begins, the i-th advertiser’s clickability λ_iis picked from the prior distribution. It gives the buyers who have low clickability (which may be a result of insufficient traffic) a chance to prove themselves.

Note: Different buyers may have different prior distribution D_λ

Value of Learning:

1. How to estimate the clickability of a buyer: λ_i^estimate=c_i/ec_i, where c_iis total clicks the i-th buyer has receives (for all positions), and ec_i (cumulative expected clicks) is the number of clicks we would expect this buyer to receive, if he had a clickability of 1.

2. Define the value of learning for this buyer as θ_i^estimateb_i, where

θ_i^estimate=C*sqrt(λ_i^estimate/ec_i) for a constant C.

3. The mechanism sorts the advertisers based on their (λ_i^estimate+θ_i^estimate)b_i, allocates the j-th slot to the i-th buyer in this order (which we call buyer i), and charges him (λ_i+1^estimate+θ_i+1^estimate)b_i+1/(λ_i^estimate+θ_i^estimate) for each click.

Properties:

1. The mechanism can lead to a higher revenue if the short term.

2. The mechanism can lead to higher efficiency.

3. The main assumption is that ordering of the buyers in decreasing order of (λ_i^estimate+θ_i^estimate)v_iis the same as their ordering in decreasing order of λ_i^estimatev_i.Sinceθ_i’sare typically small and higher slots get more clicks, this assumption is often true.

Reference:

[1] Li, S., Mahdian, M., & McAfee, P. (2010). Value of learning in sponsored search auctions. In Proceedings of the 6th international workshop on internet and network economics (pp. 294–305).

Statistics/Data Mining Applications

Sunday, March 18, 2012

Value of Learning

No comments:

Post a Comment