From the course: Spark for Machine Learning & AI

Collaborative filtering

- [Instructor] Collaborative filtering follows the same pattern we've used repeatedly in this course. First, we start with preprocessing. We're going to use the alternating least squares method that's provided by Spark MLlib, and, to use that, we just import the ALS class from the pyspark.ml.recommendation package. Then we build a DataFrame of user-item ratings. When it comes to modeling, we create an ALS object and, when we do that, we have to specify the user, item, and rating columns in our DataFrame. Then we train the model using fit, which is a method of the ALS object. When it's time to evaluate, we create predictions using the transform method of the ALS model, and we apply that to our test data. We create a RegressionEvaluator object and use its evaluate function to calculate the root mean squared error between the predicted and actual ratings, and that gives us a measure of how well our collaborative filtering model is making recommendations. The lower the error, the better.
