Saturday, February 19, 2005

TiVo Suggestions paper

I love TiVo. I can't imagine TV without it.

TiVo has a feature, TiVo Suggestions, that tries to recommend shows you might like. It's a great idea, helping viewers discover new shows they might not know about. Unfortunately, the quality is mediocre.

I've been curious why the recommendations are poor. And now I can find out. Kamal Ali and Wijnand van Stam from TiVo published a paper at KDD 2004, "TiVo: Making Show Recommendations Using a Distributed Collaborative Filtering Architecture" (MS Word doc file).

The paper says:
    TiVo uses an item-item (show to show) form of collaborative filtering which obviates the need to keep any persistent memory of each user's viewing preferences at the TiVo server .... It uses k-nearest neighbor with Pearson correlation to make show recommendations ... The collaborative filtering system is augmented on the client by a content-based Bayesian recommendation system to address the cold start problem for new users and shows.
TiVo uses a combination of a form of collaborative filtering and a Bayesian content analysis to make recommendations. The paper only describes the collaborative filtering-like analysis in any detail.
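
To make "item-item with k-nearest neighbor and Pearson correlation" concrete, here is a minimal sketch of the general technique. It is not TiVo's code. The show names, the ratings, the numeric rating scale, and the min_common cutoff are all made up for illustration.

```python
from math import sqrt

def pearson(ratings_a, ratings_b, min_common=2):
    """Pearson correlation between two shows over the users who rated both.

    ratings_a, ratings_b: dict of user id -> numeric rating.
    Returns None when fewer than min_common users rated both shows, or
    when either show has no variance among those users.
    min_common is a hypothetical cutoff, not a value from the paper.
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    n = len(common)
    if n < min_common:
        return None
    xs = [ratings_a[u] for u in common]
    ys = [ratings_b[u] for u in common]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)

def nearest_shows(target, ratings_by_show, k=2):
    """Return the k shows most positively correlated with target."""
    scored = []
    for show, ratings in ratings_by_show.items():
        if show == target:
            continue
        r = pearson(ratings_by_show[target], ratings)
        if r is not None:
            scored.append((r, show))
    scored.sort(reverse=True)
    return [(show, r) for r, show in scored[:k]]

# Invented ratings: show -> {user -> rating}
ratings_by_show = {
    "A": {"u1": 3, "u2": 1, "u3": -2},
    "B": {"u1": 2, "u2": 0, "u3": -3},
    "C": {"u1": -3, "u2": 0, "u3": 3},
}
neighbors = nearest_shows("A", ratings_by_show, k=1)
```

Note that the only per-user data needed is the ratings themselves at correlation time. Once the show-to-show correlations exist, the server can forget who rated what, which is the property the paper highlights.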

An early clue on the quality problems is here:
    We have not performed formal empirical evaluations of its accuracy .... The number one item on our agenda for future work is a thorough empirical evaluation of the quality of suggestions.
They haven't tried to measure the quality of the recommendations. This is not good. Personalization is hard. It is not a solved problem. You can't just throw a particular algorithm at the problem and assume it will work well. Experimentation is crucial. Without a quality measure, they have no way of finding improvements to their recommendations or even knowing if a change is an improvement.
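
Even a crude offline measure would be a start. As a sketch of what I mean (my assumption of a reasonable setup, not anything from the paper): hold out some known ratings, ask each candidate predictor to guess them, and compare the mean absolute error and the fraction of cases where a prediction could be made at all. The predictors and data below are invented placeholders.

```python
def evaluate(predict, held_out):
    """Score a predictor against held-out ratings.

    predict: function (user, show) -> predicted rating, or None if it
             cannot make a prediction.
    held_out: list of (user, show, actual_rating) tuples.
    Returns (mean_absolute_error, coverage).
    """
    errors = []
    for user, show, actual in held_out:
        predicted = predict(user, show)
        if predicted is not None:
            errors.append(abs(predicted - actual))
    coverage = len(errors) / len(held_out) if held_out else 0.0
    mae = sum(errors) / len(errors) if errors else None
    return mae, coverage

# Invented held-out ratings.
held_out = [("u1", "A", 3), ("u2", "A", 1), ("u3", "B", -2)]

# Baseline 1: always guess a neutral rating.
def always_neutral(user, show):
    return 0

# Baseline 2: guess the show's average rating from (hypothetical) training data.
show_means = {"A": 2.0}

def show_mean(user, show):
    return show_means.get(show)

neutral_score = evaluate(always_neutral, held_out)
mean_score = evaluate(show_mean, held_out)
```

With numbers like these in hand, any change to the algorithm can be judged against the baselines instead of by feel.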

To see what kind of improvements could be made, let's start by summarizing their current system. Each TiVo is essentially a low-end PC connected to TiVo's server farm with a slow and only intermittently available pipe (modem over phone line).

At a high level, TiVo Suggestions currently works by sending all the ratings from each TiVo to their server cluster, generating correlations between the shows, and sending the list of correlations back to each TiVo. Because of the slow pipe and the amount of data to be processed, only 1/16 of the correlations are computed each day, so the correlations take up to 16 days to update. Each client TiVo computes new recommendations "at least once a day" using the correlation data.
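
To see why 1/16 per day means up to 16 days of staleness, here is one way such a rotation could work. The paper does not say how the work is actually divided, so the hash-based slicing and the function names below are purely my assumption.

```python
import zlib

NUM_SLICES = 16  # from the post: 1/16 of the correlations per day

def slice_of(show_id):
    """Assign a show to one of NUM_SLICES slices with a stable hash."""
    return zlib.crc32(show_id.encode("utf-8")) % NUM_SLICES

def shows_to_refresh(show_ids, day_number):
    """Shows whose correlations would be recomputed on the given day."""
    todays_slice = day_number % NUM_SLICES
    return [s for s in show_ids if slice_of(s) == todays_slice]

# Hypothetical show ids.
shows = ["show-%d" % i for i in range(200)]
refreshed = [s for day in range(100, 116) for s in shows_to_refresh(shows, day)]
```

Over any 16 consecutive days every show is refreshed exactly once, so a show that just missed its slot waits the full cycle before its correlations change.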

The algorithm they are currently using, "item-item collaborative filtering", is designed for real-time, high performance applications. Since the recommendations don't need to be computed in real-time, just "at least once a day", it's worth considering whether the recommendations should be computed entirely on the server cluster. If they aren't updating the recommendations in real-time, a real-time algorithm probably is not the best choice.

They compute only 1/16 of the correlations each day, apparently because of computational resource constraints on their cluster, but this creates a bad user experience. Instead, if they really are hardware constrained, sampling or other data reduction techniques could improve performance without reducing quality.
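
As a sketch of the simplest form of sampling (my illustration, with invented household ids and an arbitrary 10% rate), the server could compute correlations from a fixed random subset of households rather than from all of them:

```python
import random

def sample_households(household_ids, fraction, seed=0):
    """Pick a reproducible random subset of households.

    fraction: share of households to keep, between 0 and 1.
    seed: fixed so repeated runs pick the same subset.
    """
    ids = sorted(household_ids)
    if not ids:
        return set()
    n = max(1, round(len(ids) * fraction))
    return set(random.Random(seed).sample(ids, n))

sampled = sample_households(range(1000), 0.1)
```

Correlations for popular shows should barely move on a sample like this. Rarely rated shows would get noisier estimates, so whether quality really holds up is exactly the kind of thing that needs to be measured.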

They say they haven't experimented yet with biasing toward more recent data, another way of reducing data and improving the apparent quality of the recommendations.
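
One common way to bias toward recent data is exponential decay, where a rating counts half as much for every half-life of age. This is a sketch of that general idea, not something the paper proposes, and the 30-day half-life is an arbitrary assumption.

```python
def recency_weight(age_days, half_life_days=30.0):
    """Weight that halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)

def recency_weighted_mean(rated, half_life_days=30.0):
    """Average rating where newer ratings count more.

    rated: list of (rating, age_days) pairs.
    Returns None for an empty list.
    """
    total_weight = sum(recency_weight(age, half_life_days) for _, age in rated)
    if total_weight == 0:
        return None
    weighted_sum = sum(
        rating * recency_weight(age, half_life_days) for rating, age in rated
    )
    return weighted_sum / total_weight

# A thumbs-up today outweighs a thumbs-down from a month ago.
blended = recency_weighted_mean([(3, 0), (-3, 30)])
```

The same weights could be applied inside the correlation computation, and ratings whose weight falls below some cutoff could be dropped entirely, which is where the data reduction comes from.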

They are using Pearson correlation as their similarity metric. The choice of similarity metric can have a broad impact on recommendations, biasing the recommendations toward more surprising or unusual recommendations or pushing them into the mainstream.
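
To show how much the metric matters, here is a small comparison of Pearson correlation against plain cosine similarity, a common alternative that I picked for illustration. Pearson is the same as cosine similarity computed after subtracting each show's mean rating. The ratings are invented.

```python
from math import sqrt

def cosine(xs, ys):
    """Cosine similarity of two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(xs, ys))
    norm = sqrt(sum(x * x for x in xs)) * sqrt(sum(y * y for y in ys))
    return dot / norm if norm else None

def centered(xs):
    """Subtract the mean so ratings are relative to the show's average."""
    mean = sum(xs) / len(xs)
    return [x - mean for x in xs]

def pearson(xs, ys):
    """Pearson correlation, written as cosine of mean-centered vectors."""
    return cosine(centered(xs), centered(ys))

# Three users rated both shows, all positively.
show_a = [3, 2, 3]
show_b = [2, 3, 2]

raw_similarity = cosine(show_a, show_b)    # roughly 0.93
centered_similarity = pearson(show_a, show_b)  # -1, up to rounding
```

Everyone liked both shows, so cosine calls them nearly identical. Pearson only looks at who liked which one more, and calls them opposites. Neither answer is obviously right, which is why the choice deserves testing.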

They made a variety of decisions elsewhere that impact recommendation quality. They combine predictions using a weighted linear average. They have a variety of thresholds in their correlation computation (min-pair, min-single, etc.). They arbitrarily set the confidence ranges for the content and collaborative filtering-based predictions. Experimenting with other choices here could find improvements.
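
For reference, a weighted linear average in item-item filtering usually looks something like the sketch below. This is the textbook form, not necessarily the exact formula in the paper. I am also assuming that min-pair means the minimum number of viewers who rated both shows, and the cutoff of 3 is invented.

```python
def predict_rating(user_ratings, correlations, pair_counts, min_pair=3):
    """Predict one user's rating for a target show.

    user_ratings: dict of show -> this user's rating.
    correlations: dict of show -> correlation with the target show.
    pair_counts: dict of show -> number of viewers who rated both that
                 show and the target show.
    min_pair: assumed meaning of the paper's min-pair threshold; pairs
              with fewer co-raters are ignored.
    Returns None when no rated show passes the threshold.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for show, rating in user_ratings.items():
        r = correlations.get(show)
        if r is None or pair_counts.get(show, 0) < min_pair:
            continue
        weighted_sum += r * rating
        total_weight += abs(r)
    if total_weight == 0:
        return None
    return weighted_sum / total_weight

# Invented example: show "C" is dropped because only 1 viewer rated both.
user_ratings = {"A": 3, "B": -2, "C": 1}
correlations = {"A": 0.8, "B": -0.5, "C": 0.9}
pair_counts = {"A": 10, "B": 5, "C": 1}
prediction = predict_rating(user_ratings, correlations, pair_counts)
```

Change min_pair from 3 to 1 and show "C" suddenly contributes its 0.9 correlation, shifting the prediction. Every one of these knobs moves the output, and none of them can be tuned without a quality measure.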

It's a shame. TiVo is great, but more and more PVRs are coming on the market. Recommendations could be a key differentiator for TiVo.

If my TiVo knows me, knows what I like, and helps me find it, the experience on any other PVR would seem hollow, like I just lost a best friend. With the current quality of TiVo Suggestions, my TiVo feels more like a clueless stranger than a close friend.

[paper via Matt Haughey]
