From the course: Spark for Machine Learning & AI

Decision tree regression

- [Instructor] Now we're going to consider decision tree regression. Regression is a lot like classification, in the sense that we have a number of different algorithms we can use to perform regression, and sometimes it helps to experiment with different algorithms to see which works best with your data set. So in this video, we're going to work with decision tree regression. The first thing I'm going to do is start PySpark, and I'll import the code we'll need for working with decision trees and evaluating decision trees. We'll also import an evaluator that's specifically designed to work with regression data. And then we'll also import the vector assembler. Now the next thing we'll do is read in our data file. I'm going to put this into a power plant data frame, or pp underscore df for short. I'll reference the Spark context and I'll read a CSV. The file I'm going to read is in my home directory. It's called power plant dot csv. It contains a header row, and I want it to…
