From the course: Cloud Hadoop: Scaling Apache Spark

Import data

Import data - Apache Spark Tutorial

- [Instructor] All right, I'm going to show you a tip from the real world. Even though you can change and run your code in Databricks notebooks, and you'll see the errors if your code is incorrect, debugging and troubleshooting there is really difficult. So, as a practical matter, I generally author or update my notebooks in a full-on editor, and I'm going to show you that process. I'm using Visual Studio Code. We're going to take that same use case, which is WordCount, and look at it in Python. As you may remember from a previous movie, the metadata is designated by Python comments, on lines one and two, for example. Specific to Databricks is the metadata around line four, which marks the command. Inside of here, you can see the Python method starts on line six. You've got the print function, which reports that the word occurs x number of times, basically. And then we're going to input a file and I just…
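
To make the file layout described above easier to picture, here is a minimal sketch of what a notebook exported as a Python source file can look like. It is not the instructor's actual file. The `# Databricks notebook source` header and the `# COMMAND ----------` cell separators are the comment markers Databricks uses in exported `.py` notebooks. The function names `count_words` and `print_word_counts` and the sample text are hypothetical. The counting is done in plain Python so the sketch runs in any editor without a cluster, because the transcript cuts off before showing how the input file is read.

```python
# Databricks notebook source
# Hypothetical sketch of a WordCount notebook exported as a .py file.

# COMMAND ----------

from collections import Counter


def print_word_counts(counts):
    """Print one line per word, most frequent first."""
    for word, n in counts.most_common():
        print(f"{word} occurs {n} times")


# COMMAND ----------

def count_words(lines):
    """Count whitespace-separated words, ignoring case."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


# COMMAND ----------

# Stand-in for the input file, which the transcript does not name.
sample_lines = ["to be or not to be", "that is the question"]
word_counts = count_words(sample_lines)
print_word_counts(word_counts)
```

Because the cell markers are ordinary Python comments, the file runs unchanged as a regular script, which is what makes it practical to edit and debug in an editor such as Visual Studio Code.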
