Join Dan Sullivan for an in-depth discussion in this video, Aggregating Data with SQL, part of Introduction to Spark SQL and DataFrames.
- [Instructor] When we work with SQL in databases, we often use SQL to perform aggregations, and the same holds true when working with SQL in Spark. So once again, I've started a new Jupyter Notebook, and I've loaded data from our utilization file. That utilization data includes CPU utilization, free memory, and session count (those are the measures), and we organize those by time and by server ID. Because I want to work with SQL, the first thing I'm going to do is specify the name of the DataFrame that has our data and then apply createOrReplaceTempView, and we'll call the view utilization. Let's do a very simple aggregation: let's get a count of the number of rows in the utilization table. We'll put that into a DataFrame called df_count, and we'll execute a Spark SQL statement. That statement is simply going to be SELECT COUNT(*) FROM utilization, and let's show the results. OK, so we have 500,000 rows, now let's make it …
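The steps described in the transcript (register the DataFrame as a temporary view, run a COUNT(*) query, show the result) could be sketched roughly as follows. This is a minimal illustration, not the instructor's actual notebook: the file path, the JSON reader, the app name, the `row_count` alias, and the helper names `count_rows` and `main` are all assumptions made for the example.

```python
# Sketch only: path, file format, and helper names are assumptions,
# not taken from the course notebook.
VIEW_NAME = "utilization"
COUNT_QUERY = f"SELECT COUNT(*) AS row_count FROM {VIEW_NAME}"


def count_rows(spark, df):
    """Register df as a temporary view and count its rows with Spark SQL."""
    # Expose the DataFrame to SQL under the table name used in the query.
    df.createOrReplaceTempView(VIEW_NAME)
    # spark.sql returns a new DataFrame that holds the query result.
    return spark.sql(COUNT_QUERY)


def main():
    # Imported here so the helper above can be read without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("utilization-count").getOrCreate()
    # Hypothetical location and format; the transcript names neither.
    df = spark.read.json("data/utilization.json")
    df_count = count_rows(spark, df)
    df_count.show()
    spark.stop()
```

With PySpark installed, calling `main()` from a notebook cell or script would print a one-row table containing the row count. Keeping the view registration and the query in one small helper is only a convenience for the sketch; in a notebook the same two calls are usually written inline.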
- Installing Spark and PySpark
- Setting up a Jupyter notebook
- Loading data into DataFrames
- Filtering, aggregating, and saving data
- Querying and modifying DataFrames with SQL
- Exploratory data analysis
- Basic machine learning
Skill Level: Intermediate
1. Introduction to Spark DataFrames
2. Installing Spark
3. Getting Started with Spark DataFrames
4. SQL for DataFrames
5. Data Analysis with Spark