From the course: Cloud Hadoop: Scaling Apache Spark
SparkR - Apache Spark Tutorial
SparkR
- [Instructor] In this section, we're going to be working with SparkR, and I recommend that you get the example notebook from my GitHub. So, where you want to go for this is into my GitHub, learning-hadoop-and-spark, and then you want to go into Use-Spark, and Jupyter-Notebooks, and aws_databricks_notebooks and SparkR. There have been some changes to the default release in Spark in terms of working with R, and I found that the documentation that was online wasn't updated. So I updated the documentation and tested this notebook with Spark 2.2. In the next notebook, we're going to look at yet another library. So let's go to Workspace and import, and make this so we can see the background, and let's work with the R language. And click import, and then I'll maximize this. So, the R language is more specialized than some of the other libraries, for example the SQL that we just looked at. It's used very heavily in the…
Contents
- Spark SQL (8m 34s)
- SparkR (6m 54s)
- Spark ML: Preparing data (4m 21s)
- Spark ML: Building the model (3m 50s)
- Spark ML: Evaluating the model (3m 41s)
- Advanced machine learning on Spark (1m 35s)
- MXNet (25s)
- Spark with ADAM for genomics (2m 5s)
- Spark architecture for genomics (2m 1s)