Join Jonathan Fernandes for an in-depth discussion in this video, Solution, part of Apache PySpark by Example.
- [Instructor] Now, just to speed things up a little bit, I'm going to put the DataFrame rc into cache. And, since the cache function is lazily evaluated, I'm going to use an action, such as count, to get that DataFrame into cache as soon as possible. So, rc.cache() and rc.count(). Now, remember that running the count function has got nothing to do with answering this question. So let's take a look at our DataFrame, so rc.show(5), the first five rows.
Now, I have no idea what format the non-criminal flag takes, but the best place to look is probably the Primary Type column. Now, the first five rows don't provide any clues, so let's get all of the unique values for that column. What I'm going to do first is get a count of the unique values, and then I'll know how many rows I should be showing. So, rc.select(col('Primary Type')).distinct(), and do a count.
So I know that there are going to be 35 unique values. Now, I also don't know whether non-criminal is going to have multiple spellings.
- Benefits of the Apache Spark ecosystem
- Working with the DataFrame API
- Working with columns and rows
- Leveraging built-in Spark functions
- Creating your own functions in Spark
- Working with Resilient Distributed Datasets (RDDs)
Skill Level: Intermediate