Traditional datasets are organized into rows and columns. In this video, learn how to work with the data in individual rows, including finding a specific record.
- [Instructor] We can filter rows based on certain conditions. In PySpark, we call the DataFrame's filter method and pass it the condition that we're looking to filter by. In pandas it's very similar: you specify the condition on a DataFrame column within the square brackets of the DataFrame. The other very interesting use case is unique rows, which is when we want to determine the unique rows for a column. So in PySpark we select from the DataFrame and then we tag the distinct function on the end of that: it's df.select with the column names that you're looking to determine the unique rows for, then distinct, and then you can display them using the show command. In pandas you would've used the unique function.
Now sorting is a very important function, and in PySpark we use orderBy. In pandas you would've used the sort_values function and provided the column name. Now since Spark DataFrames are immutable, you can't just add to the DataFrame. Instead, what you have to do is union the original DataFrame with a new one. This concatenates two DataFrames…
- Benefits of the Apache Spark ecosystem
- Working with the DataFrame API
- Working with columns and rows
- Leveraging built-in Spark functions
- Creating your own functions in Spark
- Working with Resilient Distributed Datasets (RDDs)
Skill Level Intermediate
1. Introduction to Apache Spark
2. Technical Setup
3. Working with the DataFrame API
5. Resilient Distributed Datasets (RDDs)