Learn how to build a micro Spark application using the Python API that highlights some of the key features of working with text files.
- [Instructor] Now let's take a look at actually working with some data. In this case, we're going to look at text files, so the different types of text data. First we should consider totally unstructured data. And when we think about text-based data, we should consider a schema, which is simply a definition of how to work with the data. For totally unstructured data, what we're talking about are text files, emails, some types of log files, and often just plain text without any schema definition. We can call these schema-never files: there never really is a schema applied.
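To make the schema-never idea concrete, here is a minimal sketch in plain Python (not Spark, so it runs without a cluster). The sample text and the variable names are made up for illustration. With no schema, every line is just a string, so only generic text operations such as splitting and counting apply.

```python
import io

# Hypothetical sample standing in for an unstructured text file on disk.
raw_text = io.StringIO(
    "Server started at dawn\n"
    "user alice logged in\n"
    "\n"
    "disk almost full, please check\n"
)

# No schema: each line is an opaque string with the trailing newline removed.
lines = [line.rstrip("\n") for line in raw_text]

# Generic text operations are all we can rely on: drop blank lines, split on whitespace.
non_empty = [line for line in lines if line.strip()]
word_count = sum(len(line.split()) for line in non_empty)

print(len(lines), len(non_empty), word_count)
```

Any structure beyond "a sequence of lines" has to be imposed by the code that reads the file, which is why no schema ever gets attached to data like this.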
Moving to the right a little bit on our schema spectrum here, we have this semi-structured data. This is data which is text-based, and tagged or structured in a loose way. On the web today, this type of data is often represented by JSON files. These files are just text, but do have a shape to them; however, this shape can be dynamic and change without notice. We can think of this data as having a schema after we read it from disk, so we would call these schema-later files.
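The schema-later idea can be sketched in a few lines of plain Python (again not Spark itself). The records and field names below are hypothetical. The shape is discovered only after the text is parsed, and it changes when a later record brings a new field.

```python
import json

# Hypothetical JSON Lines records; the second adds a field the first lacks.
raw_records = [
    '{"user": "alice", "action": "login"}',
    '{"user": "bob", "action": "upload", "size_kb": 120}',
]

# On disk these are only text; parsing turns each line into a dict.
records = [json.loads(line) for line in raw_records]

# Infer a schema after reading: each field name maps to the type names observed.
schema = {}
for record in records:
    for field, value in record.items():
        schema.setdefault(field, set()).add(type(value).__name__)

# Apply the inferred schema: fields missing from a record become None.
rows = [{field: record.get(field) for field in schema} for record in records]

print(schema)
print(rows)
```

Because the schema is derived from whatever was read, adding a third record with yet another field would widen the schema without any change to the code, which is the "dynamic shape" the transcript describes.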
- Understanding Spark
- Reviewing Spark components
- Where Spark shines
- Understanding data interfaces
- Working with text files
- Loading CSV data into DataFrames
- Using Spark SQL to analyze data
- Running machine learning algorithms using MLlib
- Querying streaming data
- Connecting BI tools to Spark
Skill Level Intermediate
1. Introducing Apache Spark
2. Analyzing Data in Spark
3. Using Spark SQL to Analyze Data
4. Running Machine Learning Algorithms Using MLlib
5. Real-Time Data Analysis with Spark Streaming
6. Connecting BI Tools to Spark