From the course: Introduction to Spark SQL and DataFrames
Working with NA values in DataFrames
- [Instructor] It's not uncommon to have data missing from datasets. When we work with SQL, we're used to working with nulls and working around nulls. When we work with DataFrames, the absence of data is indicated by an NA. So in this lesson, we're going to look at how we can work with NAs and nulls using DataFrames and Spark SQL. So as I've done before, I've started with some data already loaded, so let's review what I have done already. In this Jupyter Notebook, I have a number of import statements. I'm importing the Row function, which we used in the previous video to create a local DataFrame, and then I also have SparkSession, which allows us to work with Spark SQL. Now I'm also importing a couple of things we haven't seen before. I'm importing a function called lit, which allows us to create a literal column for a DataFrame, and I'm also importing a data type called StringType, which we'll use a…