From the course: Building Recommender Systems with Machine Learning and AI

Case study: Netflix, part 1

- [Narrator] Netflix has also published how they're producing recommendations, so let's dive into that next. They're not quite as open as Google is, so the details of how their algorithms work aren't available unless you work there, but they have shared a lot of higher-level lessons they've learned and the general approach they take. Our main source for this section is the book Recommender Systems Handbook, specifically Chapter 11, which is called Recommender Systems in Industry: A Netflix Case Study. This chapter dates from 2015, so I'm sure their approaches have evolved since then, perhaps to include deep learning, but we don't know for sure. More recently, they have also presented at industry conferences, such as ACM's RecSys Conference, but the actual papers published are very high-level in nature.
Unlike YouTube, Netflix has no mandate, at least at the time of their writing, to use deep learning for every machine learning problem they encounter. Instead, they bank on a hybrid approach that combines the results of many different algorithms. The ones they've said they used in the past include RBM and SVD++, which they learned were worthwhile from the Netflix Prize. They've also said that they combine nearest neighbor approaches with matrix factorization approaches to try and get the best of both worlds. That's all they've said definitively, although they sort of give us a wink and a nod when they present an incomplete list of methods that are useful to know in their paper. Their list includes: linear regression, logistic regression, elastic nets, SVD, matrix factorization, RBMs, Markov chains, latent Dirichlet allocation, association rules, factorization machines, random forests, gradient-boosted decision trees, k-means, affinity propagation, and Dirichlet processes.
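To make the idea of a hybrid concrete, here is a minimal sketch of blending the output of two recommenders, such as a nearest neighbor model and a matrix factorization model. Netflix has not published how it actually combines its algorithms, so everything here is an assumption for illustration: the function names `normalize` and `blend`, the min-max scaling, the fixed weights, and the made-up scores.

```python
from typing import Dict, List


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Min-max scale one recommender's scores into [0, 1] so they are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All items scored the same; treat them as equally good.
        return {item: 1.0 for item in scores}
    return {item: (s - lo) / (hi - lo) for item, s in scores.items()}


def blend(score_sets: Dict[str, Dict[str, float]],
          weights: Dict[str, float],
          n: int) -> List[str]:
    """Return the top-N items by weighted sum of normalized scores.

    An item that a recommender did not score contributes 0 from that recommender.
    """
    combined: Dict[str, float] = {}
    for name, scores in score_sets.items():
        w = weights.get(name, 0.0)
        for item, s in normalize(scores).items():
            combined[item] = combined.get(item, 0.0) + w * s
    # Sort by descending blended score, breaking ties by item name.
    ranked = sorted(combined, key=lambda item: (-combined[item], item))
    return ranked[:n]


# Hypothetical scores for one user: predicted ratings from a nearest neighbor
# model and affinity scores from a matrix factorization model.
knn_scores = {"A": 4.5, "B": 3.0, "C": 4.0}
mf_scores = {"B": 0.9, "C": 0.8, "D": 0.1}

top = blend({"knn": knn_scores, "mf": mf_scores},
            weights={"knn": 0.4, "mf": 0.6},
            n=3)
print(top)
```

In this toy example, item C comes out on top because both models rate it fairly well, even though neither ranks it first. In a real system the weights would more likely be learned from data than set by hand.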
I think that long list of methods is Netflix's way of saying they basically try every algorithm out there and let them all fight it out amongst each other when faced with the problem of generating recommendations for a given user. It's probably safe to say they're dabbling in deep learning by now as well. So while Netflix seems intent on keeping the details of its candidate generation as a trade secret, they do share some higher-level learnings and details on their ranking approach.
One phrase that Netflix uses repeatedly in their publications is "everything is a recommendation." The Netflix homepage is organized into many rows, each row containing its own top-N list of recommendations for a given type of recommendation. For example, my homepage includes Top Picks, which presumably are the actual top-N results for me, personally. Continue Watching is recommendations based on videos I've already watched, or at least that the people in my family who share my account have watched. I also see a list of recommendations restricted to the genre I seem to be most interested in, which is TV Sci-Fi and Fantasy. If I scroll down, there are countless more rows associated with recommendations for other categories I may be interested in, recommendations for new releases, for popular items, and for Netflix original series. Basically, everything is a recommendation. Their homepage is just a series of recommender engines, each tuned for some specific purpose. Netflix has gone all in on relying on recommender engines to introduce their members to new content.
Having all these different recommenders on one page introduces its own set of challenges. How do you make sure you don't repeat the same recommendation on the same page? How do you choose the best order to present these different recommendations in? This means that Netflix not only has to personalize movie and TV recommendations to you, they also have to personalize the order in which these recommendations are presented to you.
This is called whole-page optimization, and using machine learning to optimize the selection of individual widgets for the slots on a page is a technique that can be powerful for any website. If you look at Amazon.com's homepage, it has a similar structure, and they face the same problem of page optimization: putting the right features in the right slots on that page for you.
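The two page-level questions raised above, avoiding repeats and choosing the order of rows, can be illustrated with a small sketch. This is a simple greedy heuristic under stated assumptions, not Netflix's or Amazon's actual method, which isn't public. The function name `assemble_page`, the per-row relevance scores in `row_scores`, and the sample titles are all hypothetical; in practice the row scores would come from a learned model rather than being hard-coded.

```python
from typing import Dict, List, Tuple


def assemble_page(rows: Dict[str, List[str]],
                  row_scores: Dict[str, float],
                  items_per_row: int) -> List[Tuple[str, List[str]]]:
    """Order rows by score and fill each row, skipping items already shown.

    rows maps a row title to that row's own ranked top-N list.
    row_scores maps a row title to an assumed per-user relevance score.
    """
    seen = set()
    page: List[Tuple[str, List[str]]] = []
    # Highest-scoring rows go first; ties are broken by row title.
    for title in sorted(rows, key=lambda t: (-row_scores.get(t, 0.0), t)):
        picks: List[str] = []
        for item in rows[title]:
            if item in seen:
                continue  # already shown in a higher row on this page
            picks.append(item)
            seen.add(item)
            if len(picks) == items_per_row:
                break
        if picks:  # drop rows that end up with nothing new to show
            page.append((title, picks))
    return page


# Hypothetical ranked lists produced by three separate recommenders.
rows = {
    "Top Picks": ["Show A", "Show B", "Show C"],
    "New Releases": ["Show B", "Show D", "Show E"],
    "TV Sci-Fi & Fantasy": ["Show A", "Show C", "Show F"],
}
row_scores = {"Top Picks": 0.9, "New Releases": 0.4, "TV Sci-Fi & Fantasy": 0.7}

page = assemble_page(rows, row_scores, items_per_row=2)
for title, picks in page:
    print(title, picks)
```

Because rows are filled from the top down, a title that appears in several lists is shown only in the highest-ranked row that contains it, and lower rows fall back to the next items in their own lists.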
