From the course: Big Data Analytics with Hadoop and Apache Spark
Total score analytics
- In this video, we will compute the total score for each student by subject and print the total scores for Physics for all students. To compute the total score, we will use a map transform on the data frame: the withColumn function computes a new column from existing columns and adds the total score to the data frame. We then print the results. Let's execute the code now. We can see the total scores computed correctly. We can also look at the execution plan to understand how the read was executed, and at the Spark Job UI to see how this map operation was executed. We can see that this is a simple map operation.

Next, we print the total score for Physics for all students. This is a simple filter that we execute on the subject column. Let's execute this code. We see the scores for Physics printed for all the students. If you look at the execution plan, it shows that…
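The two steps described above (a row-wise map that derives a total score, then a filter on the subject column) can be modeled in plain Python without a Spark installation. This is only a sketch of the logic, not the course's code: the column names `student`, `subject`, `class_score`, and `test_score`, and the sample rows, are hypothetical. In Spark, the corresponding calls would be along the lines of `df.withColumn(...)` and `df.filter(...)`.

```python
# Plain-Python model of the map and filter steps; no Spark required.
# Column names and sample rows are hypothetical, not the course's data.
rows = [
    {"student": "Ana", "subject": "Physics", "class_score": 40, "test_score": 45},
    {"student": "Ana", "subject": "Math", "class_score": 35, "test_score": 50},
    {"student": "Ben", "subject": "Physics", "class_score": 30, "test_score": 38},
    {"student": "Ben", "subject": "Math", "class_score": 42, "test_score": 41},
]


def with_total_score(row):
    """Return a copy of the row with a derived total_score column.

    Models what a withColumn expression such as
    col("class_score") + col("test_score") computes for each row.
    """
    return {**row, "total_score": row["class_score"] + row["test_score"]}


def add_totals(data):
    # Map step: each output row depends only on its own input row,
    # so no data has to move between partitions.
    return [with_total_score(r) for r in data]


def filter_subject(data, subject):
    # Filter step: keep only rows whose subject column matches.
    return [r for r in data if r["subject"] == subject]


totals = add_totals(rows)
physics = filter_subject(totals, "Physics")

for r in physics:
    print(r["student"], r["subject"], r["total_score"])
```

Because both steps look at one row at a time, neither needs a shuffle, which is consistent with the transcript's description of a simple map operation. On a real Spark data frame, `df.explain()` prints the execution plan mentioned in the video.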