Sunday, March 18, 2012

Value of Learning

    Goal:
1. To solve the problem of efficiently allocating the sponsored links and simultaneously estimating the CTR. 
2. The task is to strike a balance between exploring, i.e., showing a link to get a better estimate of its CTR, and exploiting, i.e., showing links that have the best performance, according to our current estimates of the CTRs. 
3. In the mechanism, an advertiser (buyer) has an incentive to increase his bid by some amount, which we call the value of learning.

Model and Notations:
1. We consider a setting where n bidders (or candidate sponsored links) are competing to be placed in one of the m slots.
2. Assume a separable model for CTR, i.e., if the link of advertiser i is displayed in position j, it will be clicked on with probability γjλi. Here γj is associated with each slot j=1,…,m (called the position bias of slot j), and λi is determined by each advertiser (called the clickability of this advertiser).
3. Assume that the slots are numbered in decreasing order of their position bias, i.e., γ1≥γ2≥…γm. The exact clickability of advertisers are not known, and the estimate of the clickability of advertiser i is denoted by λiestimate.
4. bi: the bid of the i-th buyer
5. pi: the price of the i-th buyer pays
6. vi: the value of the i-th buyer per click

    Randomization Step:
1. In a repeated second-price auction without exploration, when there is uncertainty about the clickability of the advertiser, it is no longer the buyer’s interest to bid his value per click. Specifically, the buyer has the incentive to increase his bid in order to induce the mechanism to explore him.
2. Randomization: assume the clickability λi is distributed according to a prior distribution Dλ. Each time before the ranking begins, the i-th advertiser’s clickability λi is picked from the prior distribution. It gives the buyers who have low clickability (which may be a result of insufficient traffic) a chance to prove themselves.
Note: Different buyers may have different prior distribution Dλ

  Value of Learning:
1. How to estimate the clickability of a buyer: λiestimate=ci/eci, where ci is total clicks the i-th buyer has receives (for all positions), and eci (cumulative expected clicks) is the number of clicks we would expect this buyer to receive, if he had a clickability of 1.
    2. Define the value of learning for this buyer as θiestimatebi, where
         θiestimate=C*sqrt(λiestimate/eci) for a constant C.
3. The mechanism sorts the advertisers based on their (λiestimateiestimate)bi, allocates the j-th slot to the i-th buyer in this order (which we call buyer i), and charges him (λi+1estimatei+1estimate)bi+1/(λiestimateiestimate) for each click.

   Properties:
1. The mechanism can lead to a higher revenue if the short term.
2. The mechanism can lead to higher efficiency.
3. The main assumption is that ordering of the buyers in decreasing order of (λiestimateiestimate)vi is the same as their ordering in decreasing order of λiestimatevi. Since θi’s are typically small and higher slots get more clicks, this assumption is often true.

 Reference:
[1] Li, S., Mahdian, M., & McAfee, P. (2010). Value of learning in sponsored search auctions. In Proceedings of the 6th international workshop on internet and network economics (pp. 294–305).



Wednesday, March 7, 2012

SQL Query: traffic referred from the same resource

CREATE temp TABLE VISITKEY_LOOKUP AS
SELECT VISIT_KEY
FROM TABLE_PAGE_VIEWS
WHERE SITE_KEY = 1
and day_key >= 20080501
AND REFERRER like '%google%'
AND VISITORIZED='true'
group by 1;



--PV
SELECT DAY_KEY || SUBSTRING(Event_DTM from 11 for 3) as DAY_HOUR, count(*) PAGEVIEW
FROM TABLE_PAGE_VIEWS PV
INNER JOIN VISITKEY_LOOKUP
ON PV.VISIT_KEY=VISITKEY_LOOKUP.VISIT_KEY
WHERE SITE_KEY = 1
and day_key >= 20080501
group by 1
order by 1;

Saturday, March 3, 2012

Reserve Prices in Internet Advertising Auctions

Suppose the reserve price is set for each content page (or for each seller, or for each category) at the time point (denoted by t) when a user lands on this page and the auction for showing recommended content pages begins.
1.     At this time point t, compute the average number of buyers bidding on this content page (or this seller, or this category), the average bid, and the average standard deviation of the bids, where the average was taken over all biddings. That is, every time the content page is viewed and the auction is conducted, the three statistics were computed, and then the average over all biddings for this given content page was take. The bid of the highest bidder in every auction was excluded from the statistics, because the theory does not allow us to pin it down (every bid of the highest bidder above a certain value results in the same payoffs). It is assumed that bidders’ values are drawn from a lognormal distribution, and its mean and standard deviation can be estimated.
Example: before this auction begins, we have bidding historical data by buyers.
Number of bidders
Mean(bid)
excluding the highest bid
Sd(bid)
Excluding the highest bid
3
$0.15
0.02
5
$0.17
0.04
4
$0.14
0.03
6
$0.20
0.05
Average of the above historical bidding data
4.3
$0.16
0.04

2.     These two statistics can be achieved in step 1. At time point t when the auction begins, we may observe the number of bidders as well as their bids. Denote the number of bidders for this auction as k. Simulate three moments (observed number of bidders (which is equal to k for this auction), average bid in positions 2 and below, and the standard deviation of the bids in positions 2 and below) for various true values of the mean and standard deviation of the lognormal distribution of values. To do that, 1000 draws of the vectors of bidder values (with k bidders) were drawn from the lognormal distribution estimated in step 1. Note that the number of bidders is irrelevant for setting the optimal reserve price, but it needs to be estimated in order to get an accurate estimate of the mean and the standard deviation of the distribution of values.
Example: suppose k=4, then simulate 1000 draws from the estimated lognormal distribution with mean $0.16 and standard deviation 0.04 made up from example in step 1.
Number of bidders
bids
Mean(bid)
excluding the highest bid
Sd(bid)
Excluding the highest bid
4
$0.25, $0.18, $0.16,$0.14
$0.16
0.02
4
$0.22, $0.16, $0.13,$0.1
$0.13
0.03
4
$0.18, $0.15, $0.12,$0.09
$0.12
0.03
                     Average of the above simulated bidding data
4

$0.14
0.04

3.     As long as the lognormal normal distribution is estimated for the case of k bidders, the theoretically optimal reserve price r is computed by solving the equation
                     r-[1-F(r)]/f(r)=0,                                              (1)
where F is the cumulative distribution function and f is the probability density function estimated at step 2 for the case of k bidders.
Example: as in the made-up example in step 2, for the case of k=4 bidders, the corresponding lognormal distribution has mean $0.14 and standard deviation 0.04. Then the reserve price can be calculated by solving equation (1) for the current auction.

4.     The above calculated reserve price is based on uniform CTRs for each buyer, and it needs to be converted into buyer-specific reserve prices that reflected the quality scores of the buyers: buyers with higher CTR face lower per-click reserves, and vice versa.
      In the current auction at time point t, suppose the k bidders’ CTRs are CTR_1, CTR_2,…, CTR_k, and denote their average as CTR_average. Then the i-th buyer’s reserve price is r_i=r*CTR_average/CTR_i.

Disadvantages: for some auctions, the reserve price may push out some potential qualified bidders so that the spots may not be fully filled and thus results in a revenue loss.


Reference:
[1] Ostrovsky, M., Schwarz, M., 2010. Reserve Prices in Internet Advertising Auctions: A Field Experiment.