Join Dan Sullivan for an in-depth discussion in this video, Basic machine learning with DataFrames, part 1, part of Introduction to Spark SQL and DataFrames.
- [Instructor] A commonly used technique in exploratory data analysis is called clustering. Here the idea is that we want to see if there are natural groupings among the data. So for example, let's take a look at the utilization data. Let's see if we can divide that data set into three groups that logically come together. To do that, we're of course going to use our utilization data, and we'll be using DataFrames. We're also going to use some code from the machine learning package. So the first thing I did before loading the data was import, of course, PySpark SQL so we can have our Spark session. I also imported three items from the ml package: Vectors, VectorAssembler, and KMeans. I'll explain each of those as we go through. And then I went through our usual steps to load our utilization data from a JSON file into a DataFrame called df_util. So let's take a quick look at df_util, just so we're familiar with what the data looks like. …
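To make the idea of "dividing a data set into three groups" concrete, here is a minimal plain-Python sketch of k-means clustering. It is not Spark's KMeans implementation and does not use the course's utilization data: the sample points, the function names, and the farthest-first choice of starting centers are all assumptions made for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first_centers(points, k):
    """Pick k starting centers deterministically (an illustrative choice).

    Start with the first point, then repeatedly add the point that is
    farthest from its nearest already-chosen center.
    """
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, max_iter=100):
    """Group points into k clusters; returns (centers, labels)."""
    centers = farthest_first_centers(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: give each point the index of its nearest center.
        new_labels = [
            min(range(k), key=lambda i: squared_distance(p, centers[i]))
            for p in points
        ]
        if new_labels == labels:
            break  # No point changed cluster, so we have converged.
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # Leave the center of an empty cluster unchanged.
                centers[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels


# Hypothetical two-feature measurements (NOT the course's utilization data).
points = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
    (5.0, 5.0), (5.1, 4.9), (4.8, 5.2),
    (9.0, 1.0), (9.2, 0.9), (8.9, 1.2),
]

centers, labels = kmeans(points, k=3)
for point, label in zip(points, labels):
    print(point, "-> cluster", label)
```

Each point receives a cluster index from 0 to 2, and points that sit close together share an index. The algorithm alternates between assigning points to their nearest center and recomputing each center as the mean of its members, stopping once the assignments no longer change.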
- Installing Spark and PySpark
- Setting up a Jupyter notebook
- Loading data into DataFrames
- Filtering, aggregating, and saving data
- Querying and modifying DataFrames with SQL
- Exploratory data analysis
- Basic machine learning
Skill Level: Intermediate
1. Introduction to Spark DataFrames
2. Installing Spark
3. Getting Started with Spark DataFrames
4. SQL for DataFrames
5. Data Analysis with Spark