From the course: Data Science Tools of the Trade: First Steps

Spark: pyspark

- [Instructor] Spark offers alternatives to its default shell, and PySpark is one of them. Before we try PySpark, let's first make sure that Python is installed. Type python in the terminal window and press enter. If you get a message like what you see here, you need to install Python. Type apt, hyphen, get, space, install, space, python. Press enter. Type Y and press enter. Now try Python again. Type python and press enter. Looks like Python has been successfully installed. Type exit, E-X-I-T, open parentheses, close parentheses, and press enter.

To start PySpark, type pyspark and press enter. If you see a screen like this, you're good to go.

Now, let's try to create an RDD by typing lowercase text, T-E-X-T, uppercase F, lowercase I-L-E, space, equals sign, space, lowercase S-C, dot, lowercase T-E-X-T, uppercase F, lowercase I-L-E, open parentheses, double quotation mark, README, R-E-A-D-M-E all capitals, dot, lowercase M-D, double quotation mark, close parentheses. Here, S-C stands for Spark Context. Press…
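
The RDD creation dictated above can be written out as a short Python sketch. The call `sc.textFile("README.md")` comes from the transcript. The helper names `create_text_rdd` and `main` are hypothetical, added only for illustration, and the sketch assumes that README.md sits in the directory the code is run from. Inside the pyspark shell `sc` is already defined. A standalone script has to obtain its own context, which is shown here with `SparkContext.getOrCreate()`.

```python
def create_text_rdd(sc, path="README.md"):
    """Return an RDD with one element per line of the file at `path`.

    `sc` is a SparkContext; the pyspark shell provides one under that name.
    """
    # textFile is lazy: the file is not read until an action such as count() runs.
    return sc.textFile(path)


def main():
    # Imported here so the helper above can be defined without PySpark present.
    from pyspark import SparkContext

    # Assumption: this runs as a script, so no `sc` exists yet and we create one.
    sc = SparkContext.getOrCreate()
    textFile = create_text_rdd(sc)  # same as: textFile = sc.textFile("README.md")
    print(textFile.count())         # number of lines in README.md
    sc.stop()
```

In the pyspark shell, `textFile = create_text_rdd(sc)` is equivalent to the line typed in the video. Calling `main()` from a plain Python session would need PySpark and a Java runtime installed.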
