Wednesday, June 25, 2008

Advertising auctions and modeling externalities

Googlers Gagan Aggarwal, Jon Feldman, S. Muthukrishnan, and Martin Pal have an upcoming paper, "Sponsored Search Auctions with Markovian Users", with a new model of people's behavior that captures the idea that some ads may cause people to stop looking at the other ads.

An excerpt:
[Prior] models assume that the probability of an ad getting clicked is independent of other ads that appear with it on the same page, an assumption made without much justification. It is hard to imagine that seeing an ad, perhaps followed by a click, has no effect on the subsequent behavior of the user.

We propose a model based on a user who starts to scan the list of ads from the top, and makes decisions (about whether to click, continue scanning, or give up altogether) based on what he sees.

More specifically, we model the user as the following Markov process: "Begin scanning the ads from the top down. When position j is reached, click on the ad i with probability p_i. Continue scanning with probability q_i."

It turns out that the structure of this [auction] is different than that of [generalized second price] ... The presence of the q_i's requires a delicate tradeoff between the click probability of an ad and its effect on the slots below it.
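To make the model concrete, here is a minimal sketch of that Markovian user in Python. The function names and the closed-form helper are my own illustration, not code from the paper:

```python
import random

def simulate_user(ads, rng=random.random):
    """Scan ads from the top down. ads is a list of (p, q) pairs, where
    p is the ad's click probability and q its continuation probability.
    Returns the list of positions clicked before the user gives up."""
    clicks = []
    for position, (p, q) in enumerate(ads):
        if rng() < p:       # click on this ad with probability p
            clicks.append(position)
        if rng() >= q:      # stop scanning with probability 1 - q
            break
    return clicks

def expected_clicks(ads):
    """The ad in slot j is reached with probability q_1 * ... * q_(j-1),
    so its click probability matters less the further down it sits."""
    reach, total = 1.0, 0.0
    for p, q in ads:
        total += reach * p
        reach *= q
    return total
```

The second function shows why a low-q_i ad is costly to everyone: it shrinks the reach of every slot below it, which is exactly the tradeoff the paper says the auction has to balance.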
I have been bothered for some time by the assumption that crappy ads have no impact on the ads around them. It seems likely that bad ads make things worse for everyone and should be penalized beyond the higher bids they have to pay for their low clickthrough rates.

Please see also Craswell et al., "An Experimental Comparison of Click-Position Bias Models" (PDF), a WSDM 2008 paper that proposes a similar "cascade model" not for ads, but for search results.

Please see also my earlier post, "Hal Varian on advertising auctions", which talks about an Ad Quality Score that Google uses and how it may be an attempt to patch a problem in current advertising auction models. While I doubt the Ad Quality Score is trying to exactly produce the model this paper advocates, the two efforts may be targeting the same problem.

Saturday, June 21, 2008

Getting things smart

I was not going to post about this, but I cannot seem to get Steve Yegge's post, "Done, and Gets Things Smart", out of my head. It is a clever piece on hiring that challenges the conventional wisdom.

An excerpt:
Smart and Gets Things Done is a good weeder function to filter out some of the common "epic fail" types.

But realize that this approach has a cost: it will also filter out some people who are just as good as you, if not better, or even way better, along dimensions that are entirely invisible to you.

So there's this related interviewing/hiring heuristic that I think may better approximate the kinds of people you really want to hire: Done, and Gets Things Smart.

You don't want someone who's "smart". You're not looking for "eager to learn", "picks things up quickly", "proven track record of ramping up fast".

No! Screw that. You want someone who's superhumanly godlike. Someone who can teach you a bunch of stuff. Someone you admire and wish you could emulate, not someone who you think will admire and emulate you.

You want someone who, when you give them a project to research, will come in on Monday and say: "I'm Done, and by the way I improved the existing infrastructure while I was at it."

Working with them directly ... you'll see that virtually every problem space has a ... component that you were blissfully unaware of until Done, and Gets Things Smart gal points it out to you and says, "There's an infinitely smarter approach, which by the way I implemented over the weekend."

These people aren't just pure gold; they're golden-egg-laying geese ... They're your seed engineers: the ones who will make or break your company with both their initial technical output and the engineering-culture decisions they put into place.
Smart people who can do stuff are one thing. But people who constantly push everyone to learn and improve, who help build the culture, and who make people do more than they ever thought possible: there lies the gold.

There is one spot where I might disagree with Steve, assuming I am at all qualified to do so. Steve implies that "Done, and Gets Things Smart" people are born that way. Rather, I think they learn from other "Done, and Gets Things Smart" people.

At Amazon.com, for example, the seed engineers Steve mentioned had an enormous influence on each other and those around them, pushing everyone to be better. People who already had some "Done, and Gets Things Smart" tendencies were pushed further. Others learned from and sought to emulate the masters.

"Done, and Get Things Smart" is made. We can all strive to achieve it. Even if we fail, we will all be better for the attempt.

Saturday, June 14, 2008

Hal Varian on advertising auctions

Google Chief Economist Hal Varian has a post on the Official Google Blog on "How auctions set ad prices".

What is particularly interesting about the post is where the description differs from the theory. For most Web advertising auctions, advertisers are ordered by (bid * CTR), where CTR is the clickthrough rate on the advertisement. The net effect is that advertisers effectively pay per impression for the space used on each page, which is what publishers generally want.
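As a rough illustration of that ordering (not Google's actual mechanism, and ignoring quality scores and reserve prices), ranking by bid * CTR with generalized-second-price payments can be sketched as:

```python
def run_auction(bidders):
    """bidders: list of (name, bid, ctr) tuples. Rank by bid * ctr; each
    winner pays the smallest bid that would still beat the advertiser in
    the slot below it, so payments scale with expected revenue per
    impression rather than with the raw bid."""
    ranked = sorted(bidders, key=lambda b: b[1] * b[2], reverse=True)
    results = []
    for i, (name, bid, ctr) in enumerate(ranked):
        if i + 1 < len(ranked):
            _, next_bid, next_ctr = ranked[i + 1]
            price = next_bid * next_ctr / ctr  # just enough to hold the slot
        else:
            price = 0.0  # last slot: no one to beat (a reserve would go here)
        results.append((name, price))
    return results
```

Note how an advertiser with a lower bid but a much higher clickthrough rate can win the top slot, because it delivers more expected revenue for the impression.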

However, Hal spends quite a bit of time talking about Google's use of "Ad Quality Score". He indicates that, in addition to CTR, Google uses the "quality of [the] landing page" and the "relevance of the ads and keywords in the ad group to the site" as well as "other relevance factors", but details are not clear.

On the one hand, some kind of ad quality score makes a lot of sense. The CTR for a given advertisement shown in a given context to a given user can only be estimated. So, the use of "Ad Quality Scores" could be viewed as an attempt to get a more accurate estimate of the CTR.

On the other hand, this kind of looks like something tossed on top of CTR to penalize spammers or other lower quality advertisers. If that is the case, it becomes less clear what this change could do to the efficiency of the auctions.

Either way, whether it fits into a framework of getting better CTR estimates or it is a patch to fix a market failure in current advertising auctions, Hal's post is something to think about further.

Stages of Web 2.0 startups

Stacey Higginbotham at GigaOM has an amusing post, "The 5 Stages of a Consumer Web Startup".

Some excerpts:
One day an entrepreneur ... gets an idea ... and starts coding. A few weeks or possibly days [later], a beta -- increasingly a euphemism for a not-fully-thought-out product -- emerges.

The buzz builds ... the entrepreneur rejoices. The VCs ... do a fly-by ... Eight weeks later reality sets in. The traffic stops growing or -- worse yet -- dives ... But as an ever-optimistic entrepreneur it's time to regroup, gather your programmers, toss back some Red Bull and ... launch a social network widget.

[Then] it's time for the big guns ... the open API. Now you're a platform! The startup ... founder rejoices again ... The money men get serious because ... [now] you have a Facebook strategy.

[Twelve] months later ... it's time for advertising .... But selling online advertising is hard ... It's time to consider reality. You could always try your hand as an ad network or merge with a competitor, but more than likely it's time to sell that domain name and user base on eBay or quietly shut your doors. Better luck next time.
I am embarrassed to say, Findory had a beta, widgets, an open API, and a failed attempt at online advertising. No Red Bull though.

Please see also the Underpants Gnomes' business plan.

Jeff Dean on Google infrastructure

Google Fellow Jeff Dean gave a talk at Google I/O called "Underneath the Covers at Google: Current Systems and Future Directions". Slides (PDF) also are available.

I was going to post some detailed notes on the talk, but James Hamilton's excellent post on the talk already covers most of what I was going to say.

Adding to James' thoughts, let me emphasize two parts of the slides that, even if you have seen this stuff many times before, definitely are worth a peek.

First, Jeff's descriptions of real failures they encountered on slide 12 are excellent. Note that randomly distributing replicas is not enough; you also have to make sure that the replicas of an item never all end up in the same rack.
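A tiny sketch of the idea (my own illustration, not the actual placement policy in any Google system): pick distinct racks first, then a machine within each rack.

```python
import random

def place_replicas(racks, n_replicas=3):
    """racks maps a rack name to its list of machines. Choosing the rack
    first guarantees no two replicas share a rack, so a single rack
    failure (say, a bad switch) cannot take out every copy at once."""
    if len(racks) < n_replicas:
        raise ValueError("need at least as many racks as replicas")
    chosen = random.sample(sorted(racks), n_replicas)
    return {rack: random.choice(racks[rack]) for rack in chosen}
```

Compare that with sampling machines uniformly at random, which with small clusters or unlucky draws can silently put all the replicas behind one switch.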

Second, slide 37 is on "Future Infrastructure Directions" for Google. Jeff emphasizes the fascinating problem of automated movement and replication of data and code in response to load across clusters and data centers. Very hard but very fun optimization problem there.

All the other Google I/O talks are also online if you are interested.

[Thanks, Dragos, for the pointer to the Google I/O talks.]

Tuesday, June 10, 2008

Sample programs in DryadLINQ

A new technical report out of Microsoft Research, "Some sample programs written in DryadLINQ" (PDF), shows off some examples of large scale distributed computations possible with Dryad.

The paper provides code for the EM and PCA algorithms, computing PageRank, and mining astronomical data, among many other things. There are also fairly detailed descriptions of how the computations are executed across the cluster.
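The programs in the report are written in C# with LINQ, but the PageRank computation they distribute is, at its core, the familiar power iteration. A rough single-machine analogue in Python (the function and parameter names are mine, not from the paper):

```python
def pagerank(links, damping=0.85, iterations=20):
    """links maps each page to its list of outgoing links.
    Repeatedly redistribute rank along the links (power iteration)."""
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / len(pages) for p in pages}
        for p, outs in links.items():
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:
                # dangling page: spread its rank uniformly over all pages
                for q in pages:
                    new[q] += damping * rank[p] / len(pages)
        rank = new
    return rank
```

The distributed version partitions the link table across machines and exchanges rank contributions between partitions each iteration, which is the part Dryad's execution layer handles.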

Dryad is a programming infrastructure designed to support large-scale computations over clusters. It currently is available only inside of Microsoft, but it "is now widely used internally by Microsoft product groups" [1].

Please see also my earlier post, "Yahoo, Hadoop, and Pig Latin".

The value of fanatical customer service

Mike Masnick at TechDirt has an insightful post about how businesses should treat customer service as the face of the company, not as a cost center, using Zappos as an example.

An excerpt:
E-commerce for shoes [seems] exceptionally difficult ... Zappos overcame all of the concerns ... [using] an almost maniacal focus on customer service.

[They treat] customer service not as a "cost center," like almost all companies these days, but as an integral part of making happy, committed customers who also act as evangelists.

In order to do that, you need to have a loyal, committed customer service staff as well -- and Zappos has done some unique things there that are worth understanding. It doesn't do many of the typical call center things: no scripts, no time limits on calls and no limits on what the customer service reps can do to make customers happy.

Zappos also offers to pay each new employee $1,000 to quit, one month after they've joined .... Apparently about 10% of folks take the money and scram... [but] the long term benefits of having a more strongly committed staff cannot be overstated.
Especially for companies with no storefront, customer service is the face of the company. Shouldn't that face be as lovely as possible?

Please see also some of the past posts ([1] [2] [3] [4]) on TechDirt about the value of customer service.