From the course: Introduction to Spark SQL and DataFrames


Exploratory data analysis with DataFrames


- [Instructor] Now that we've seen how to work with Spark DataFrames using the DataFrame API and Spark SQL, we can start to look at how to use those tools for some higher-level tasks, like exploratory data analysis and machine learning. So in this video, we'll look at how to use the DataFrame API for some basic exploratory data analysis with the utilization data we've looked at previously. At this point, I have opened a new Jupyter Notebook and done the preliminary specifications and data loading, so the data has been loaded. The first thing I'd like to do is create a table that is accessible from SQL. To do that, I'll take our DataFrame, called df util, and call the createOrReplaceTempView method to create a table, which I'll call utilization. Let's also verify the count on df util. It should be 500,000, and it is, so we're all set. Now the first thing I'd like to show you is an API command called…
