Monday, March 9, 2015

Recommendation System : A success story from Netflix

Netflix CineMatch
Netflix is the world's largest online DVD rental service. It offers flat-rate DVD-by-mail rentals to customers throughout the United States. A major feature of the site is the CineMatch system, which makes recommendations to subscribers based on their viewing habits and helps determine which movies they are likely to enjoy. According to Netflix, 60% of all subscribers add these suggested movies to their queues. The recommendations draw on the entire Netflix movie library, not just new releases or mainstream films.

CineMatch is actually an Oracle database that organizes the Netflix library into clusters of similar movies and then analyses how customers have rated them. Those who have given similar ratings to the same movies in a cluster are then matched as like-minded viewers. CineMatch looks at the clusters you've rented from in the past, determines which titles you have yet to rent, and recommends only those films that have been highly rated by matched viewers.
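
The matching of like-minded viewers described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not CineMatch's actual method: the toy `ratings` data, the `similarity` measure (an inverse of the average rating gap), and the `min_similarity` and `min_stars` thresholds are all hypothetical, and a rated title stands in for a rented one.

```python
from math import sqrt

# Hypothetical toy data: ratings[user][title] = stars (1-5).
ratings = {
    "ann":  {"Alien": 5, "Aliens": 5, "Blade Runner": 4},
    "bob":  {"Alien": 5, "Aliens": 4, "Blade Runner": 5, "Gattaca": 5},
    "cara": {"Alien": 1, "Aliens": 2, "Gattaca": 2, "Solaris": 5},
}

def similarity(a, b):
    """Inverse of the RMS rating gap over titles both viewers rated (0 if none shared)."""
    shared = ratings[a].keys() & ratings[b].keys()
    if not shared:
        return 0.0
    gap = sqrt(sum((ratings[a][t] - ratings[b][t]) ** 2 for t in shared) / len(shared))
    return 1.0 / (1.0 + gap)

def recommend(user, min_similarity=0.5, min_stars=4):
    """Titles the user has not rated that like-minded viewers rated highly."""
    picks = set()
    for other in ratings:
        # Skip the user themselves and anyone whose tastes differ too much.
        if other == user or similarity(user, other) < min_similarity:
            continue
        for title, stars in ratings[other].items():
            if title not in ratings[user] and stars >= min_stars:
                picks.add(title)
    return sorted(picks)

print(recommend("ann"))
```

Here "ann" and "bob" rate their shared titles almost identically, so bob's highly rated but unseen title is suggested to ann, while "cara", whose ratings diverge sharply, contributes nothing.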

The CineMatch database draws on the following sources to determine recommendations:
  1. The films themselves, which are arranged into groups of similar movies.
  2. Each customer's ratings, rented movies and current queue.
  3. Combined ratings of all Netflix users.
With a system like this, Netflix can recommend lesser-known movies that customers may never have heard of but would enjoy very much.

Netflix Prize
Recognizing that there is always a better way to do something, Netflix launched a contest in 2006 to find an algorithm that could beat CineMatch. The contest, called the Netflix Prize, promised $1 million to the first person or team to meet its accuracy goal for recommending movies based on users' personal preferences. Netflix also released a data set for algorithm developers to work with: 100 million movie ratings, ranging from 1 to 5 stars, from anonymous users. Three years later, the $1 million prize was awarded to BellKor's Pragmatic Chaos, a seven-member team that included two AT&T researchers. BellKor's Pragmatic Chaos submitted its winning algorithm only 24 minutes before another team, The Ensemble. Both submissions demonstrated a 10% improvement over CineMatch.
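
To make "10% improvement" concrete: the contest scored entries by root-mean-square error (RMSE) on held-out ratings, a detail the paragraph above does not spell out. The sketch below shows how such a percentage could be computed. The `rmse` and `improvement` helpers and the sample numbers are hypothetical and are not taken from the contest data.

```python
from math import sqrt

def rmse(predicted, actual):
    """Root-mean-square error between predicted and actual star ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two equal-length, non-empty sequences")
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def improvement(baseline_rmse, candidate_rmse):
    """Percentage by which the candidate's error is lower than the baseline's."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse

# Hypothetical held-out ratings and two sets of predictions for them.
actual    = [5, 3, 4, 1, 2]
baseline  = [4.0, 3.5, 3.0, 2.0, 3.0]   # stands in for the system to beat
candidate = [4.5, 3.0, 3.5, 1.5, 2.5]   # stands in for a challenger

gain = improvement(rmse(baseline, actual), rmse(candidate, actual))
print(f"improvement over baseline: {gain:.1f}%")
```

A lower RMSE means predicted stars sit closer to the stars users actually gave, so a challenger meets the goal when its error is at least 10% below the baseline's.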

The recommendation system updates itself constantly, making thousands of recommendations every second based on more than 5 billion movie ratings. Netflix reports that the average user has rated about 200 movies, and new ratings come in at about 4 million per day. About 60 percent of Netflix subscribers select movies based on these recommendations. You can find them in the "Suggestions for You" section on the site, and you can refresh the suggestions as you rate more movies.

Making good movie recommendations may seem like something that would require instinct or emotion. For example, if you recommend a movie you've seen to a friend, you take into account how the movie made you feel, your tastes and your friend's tastes. Netflix recommendations, on the other hand, are all math. Netflix matches your viewing and rating history with people who have similar histories. It uses those similar profiles to predict which movies you are likely to enjoy. That's what these recommendations really are -- predictions of which movies you will like.

These predictions rely on algorithms and statistics. The system starts by matching movies to each other rather than matching people to movies, since there are far fewer titles in the library than there are Netflix subscribers. To make matches, a computer:
  1. Searches the CineMatch database for people who have rated the same movie - for example, "Return of the Jedi"
  2. Determines which of those people have also rated a second movie, such as "The Matrix"
  3. Calculates the statistical likelihood that people who liked "Return of the Jedi" will also like "The Matrix"
  4. Continues this process to establish a pattern of correlations between subscribers' ratings of many different films
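
The four steps above can be sketched roughly as follows. This is a minimal illustration, not Netflix's implementation: the toy `ratings` data, the `LIKED` threshold of four stars, and the function names are assumptions made for the example, and "statistical likelihood" is reduced here to a simple conditional share.

```python
# Hypothetical toy data: ratings[user][title] = stars (1-5).
ratings = {
    "u1": {"Return of the Jedi": 5, "The Matrix": 5},
    "u2": {"Return of the Jedi": 4, "The Matrix": 4},
    "u3": {"Return of the Jedi": 5, "The Matrix": 2},
    "u4": {"Return of the Jedi": 2, "The Matrix": 5},
    "u5": {"Return of the Jedi": 4},
}

LIKED = 4  # assumption: four stars or more counts as "liked"

def likelihood_also_liked(first, second):
    """Estimate P(liked second | liked first) among people who rated both."""
    # Steps 1-2: keep only people who rated both titles.
    both = [r for r in ratings.values() if first in r and second in r]
    # Step 3: of those who liked the first title, what share liked the second?
    liked_first = [r for r in both if r[first] >= LIKED]
    if not liked_first:
        return None  # no evidence either way
    return sum(r[second] >= LIKED for r in liked_first) / len(liked_first)

def correlation_table(titles):
    """Step 4: repeat the estimate for every ordered pair of distinct titles."""
    return {(a, b): likelihood_also_liked(a, b)
            for a in titles for b in titles if a != b}

table = correlation_table(["Return of the Jedi", "The Matrix"])
print(table[("Return of the Jedi", "The Matrix")])
```

Because the table is keyed by pairs of titles rather than pairs of subscribers, its size grows with the catalog, which is the reason the text gives for matching movies to each other first.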
Often, these predictions make logical sense. A Netflix customer who gives two movies in the "Lord of the Rings" trilogy five stars is likely to enjoy the third film as well. However, Netflix users who spend a lot of time rating their movies and looking at their recommendations may find some surprising correlations. This is because the algorithms that keep the recommendation system running don't necessarily have anything to do with the plot or cast. Instead, they have to do with other subscribers' rental and ratings histories.

References