From the course: Building Recommender Systems with Machine Learning and AI

Install Anaconda, review course materials, and create movie recommendations - Python Tutorial

(upbeat music) - Welcome to Building Recommender Systems with Machine Learning and AI. You're going to learn the latest research from me, Frank Kane. I spent my career at amazon.com, where I led the development of many of Amazon's recommendation features. In this course, I'll share the real-world problems you'll encounter when developing recommender systems, and the real-world solutions I've learned over the years. Recommender systems are one of the most valuable applications of machine learning today. Amazon attributes over 20% of their revenue to recommendations, and companies like YouTube and Netflix rely on them entirely to allow their users to discover new content. At the end of this course, you'll have hands-on experience in applying a wide variety of recommender system techniques to real-world user behavior data. You'll be confident when interviewing for positions at the hottest tech employers that depend on these systems, or you'll know how to apply these lucrative techniques for the employer you already have. We'll start off with an overview of how recommender systems are used, and then we'll create our own framework for testing different algorithms on real-world movie ratings data. Together, we'll develop examples of neighborhood-based collaborative filtering, content-based filtering, model-based methods, and deep learning with artificial neural networks. You'll learn how to generate recommendations at massive scale on the cloud using TensorFlow, Amazon's DSSTNE, AWS SageMaker, and Apache Spark. We'll also look at in-depth case studies from YouTube and Netflix and learn from those giants as well. The ideal student for this course has some experience in programming, preferably in the Python language. We're going to cover a lot of machine learning algorithms, so some background in computer science will be helpful in understanding how those algorithms work. As long as you've done some programming before, you should be able to pick things up.
I know you're itching to go hands-on and produce some recommendations on your own, so let's dive right in and get all the software and data you need installed. In the next few minutes, you're going to install a Python development environment on your PC if you don't have one already, then you'll install a package for Python called Surprise that makes developing recommender systems easy. Finally, we'll download the course materials, including some real movie rating data, and we'll make movie recommendations for a real person right here in lecture one. So let's do this. The first thing you need is some sort of scientific environment for Python that supports Python 3. That means a Python environment that's made for data scientists, like Anaconda or Enthought Canopy. If you already have one, then great, you can skip that step, but if not, let's get Anaconda installed on your system, and we'll also get the course materials you need while we're at it. Now, if you're the sort of person who prefers to just follow written instructions for things like this, you can head over to my website at the URL shown here. Pull it up anyhow, as we're going to refer to this page as we set things up in this video. Remember to pay attention to capitalization: the R and the S in RecSys need to be capitalized. You'll also have a chance to join the Facebook group for this course, where you can collaborate with fellow students, and you'll be offered a chance to stay in touch with me as well. In this course, we're going to use the Python programming language, as it's pretty easy to pick up. So if you don't already have a development environment for Python 3 installed, you'll need to get one. I recommend Anaconda; it's free and widely used. Let's head over to www.anaconda.com/download and select the installer for whatever operating system you're using. For me, that's 64-bit Windows, and be sure to select the Python 3 version, not Python 2.
Once it downloads, we'll go through the installer, making sure to install it on a drive that has plenty of space available, at least three gigabytes. Now that Anaconda is installed, we can launch it and select the Environments tab here. To keep things clean, let's set up an environment just for this course. Click on Create, and let's call it RecSys; that's shorthand for recommender systems, by the way. We want a Python environment, and we want whatever current version of Python 3 is offered to you. It will take a few moments for that environment to be created. Next, we need to install a Python package called Surprise that makes developing recommender systems easier. To do that, click on the arrow next to the RecSys environment that you just made and open up a terminal from it. Now, in the terminal, run conda install -c conda-forge scikit-surprise. If prompted, hit Y to continue and let it do its thing. When it's done, we can close this terminal window. Next, we need to download the scripts and data used in this course. From our course setup page at sundog-education.com/RecSys, you'll find a link to the materials. Let's go ahead and download that. When it's done, we'll unzip it and put it somewhere appropriate, like your documents folder. Throughout this course, we're going to build up a large project that recommends movies in many different ways, so you're going to need data to work with. Back at our course setup page, you'll find a link to the MovieLens dataset. It's a subset of a hundred thousand real movie ratings from real people, along with some information about the movies themselves. Download that and unzip it. When it's unzipped, move the resulting ml-latest-small folder inside the course materials folder that you made earlier. Now we have everything we need, so let's make some movie recommendations. Back at Anaconda Navigator, make sure the RecSys environment we created is still selected, and now click on the Home icon.
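If you'd like to double-check the setup steps above before moving on, here's a minimal sketch in Python. It is not part of the course materials. The function names are made up for illustration, and the file names ratings.csv and movies.csv are assumed from the standard MovieLens ml-latest-small archive. Note that the scikit-surprise package is imported under the name surprise.

```python
import importlib.util
from pathlib import Path


def surprise_installed():
    """Return True if the 'surprise' module (from scikit-surprise) can be found."""
    return importlib.util.find_spec("surprise") is not None


def missing_data_files(materials_dir):
    """List the expected MovieLens files that are absent from the materials folder.

    Assumes the unzipped 'ml-latest-small' folder was moved inside materials_dir,
    and that it should contain ratings.csv and movies.csv.
    """
    data_dir = Path(materials_dir) / "ml-latest-small"
    expected = ["ratings.csv", "movies.csv"]
    return [name for name in expected if not (data_dir / name).is_file()]
```

Calling missing_data_files with the path to your course materials folder should give back an empty list once the dataset is in place, and surprise_installed() should return True when run from inside the RecSys environment.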
The code editor we're going to use is called Spyder. So under Spyder, hit Install, and let that finish. Once Spyder is done installing, hit Launch. Now open up the Getting Started folder inside your course materials, and open the gettingstarted.py script file. Take a quick look at the code. We'll come back to it later and walk through it all, but the interesting thing right now is that we're going to be applying a fairly advanced recommender algorithm called singular value decomposition, or SVD for short, on 100,000 movie ratings with just about 60 lines of code. How cool is that? This isn't necessarily hard from a coding standpoint. Hit the green play button, and that will kick off the script. It starts by choosing an arbitrary user, user number 85 in our case, who we're going to get to know very well, and summarizing the movies that he loved and the movies he hated, so you can get a sense of his tastes. He seems pretty typical: someone who likes good action and sci-fi movies, and hates movies that really missed the mark, like Super Mario Brothers. If you haven't seen that movie, it's pretty painful, especially if you're a Nintendo fan. Next, we run the SVD algorithm on our movie ratings and use the ratings from everyone else to try to find other movies user 85 might like that he hasn't already seen. It came up with some interesting stuff. For movies you've heard of, like Star Wars Episode IV, that recommendation makes intuitive sense based on what we know about this user. But it's hard to tell if a movie you've never heard of is a good recommendation or not. Indeed, you'll learn in this course that just defining what makes a good recommendation is a huge problem that's really central to the field of recommender systems. And in many ways, building recommender systems is more of an art than a science. You're trying to get inside people's heads and build models of their preferences. It's a very hard problem, but also a very fun one to solve.
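To give a rough feel for what a script like this does, here's a minimal sketch using Surprise. It is not the course's gettingstarted.py. The function names are hypothetical, and it assumes ratings.csv has a header row followed by userId,movieId,rating,timestamp lines, with ratings between 0.5 and 5. The idea is the one described above: train SVD on everyone's ratings, then rank the movies the chosen user hasn't rated by their predicted rating.

```python
def top_n_unseen(estimate, all_items, seen_items, n=10):
    """Rank items the user has not rated by estimated rating, highest first."""
    candidates = [item for item in all_items if item not in seen_items]
    scored = [(item, estimate(item)) for item in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


def recommend_with_svd(ratings_csv, user_id, n=10):
    """Train Surprise's SVD on a ratings file and rank the user's unseen movies."""
    # Imported here so top_n_unseen works even without scikit-surprise installed.
    from surprise import Dataset, Reader, SVD

    # Assumed layout: one header line, then userId,movieId,rating,timestamp.
    reader = Reader(line_format="user item rating timestamp", sep=",",
                    rating_scale=(0.5, 5), skip_lines=1)
    data = Dataset.load_from_file(str(ratings_csv), reader=reader)
    trainset = data.build_full_trainset()

    algo = SVD()
    algo.fit(trainset)

    # Raw IDs are strings when ratings are loaded from a file.
    raw_uid = str(user_id)
    inner_uid = trainset.to_inner_uid(raw_uid)
    seen = {trainset.to_raw_iid(iid) for iid, _ in trainset.ur[inner_uid]}
    all_items = [trainset.to_raw_iid(iid) for iid in trainset.all_items()]

    # Score every unseen movie with the trained model's estimated rating.
    return top_n_unseen(lambda iid: algo.predict(raw_uid, iid).est,
                        all_items, seen, n)
```

Something like recommend_with_svd("ml-latest-small/ratings.csv", 85) would return a list of (movie ID, estimated rating) pairs, assuming that relative path matches where you put the data. You'd still need movies.csv to turn those IDs into titles, and because SVD starts from random values, the list can vary from run to run.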
Anyway, congratulations, you just ran your first movie recommendation system using some real movie rating data from real people. There's so much more for us to dive into here, so keep going. This is a really interesting field, and even after spending over 10 years on it myself, it just doesn't get old.
