Analyze high-volume data using R, the programming language optimized for big data. Learn how to produce visualizations, implement parallel processing, and integrate with SQL and Apache Spark.
- [Mark] Your research uses big data: data so big it can't be loaded onto a computer, data so big it takes hours for your code to analyze it, data so big that plots are overwhelmed with huge clouds of gray spots. Data this big is called high-volume data, and its sheer size will block your research. Hi, I'm Mark Niemann-Ross, and in this course, R Programming in Data Science: High-Volume Data, we're going to explore big data.
We'll talk about what makes high-volume data difficult to work with and explore strategies and code to solve those problems. We'll look at charts and graphs and how to make visual sense of overwhelming data. We'll learn to write fast code with the R programming language, paying special attention to careful use of memory with large datasets. We'll look at parallel processing and using R with more than one CPU. Finally, we'll look at how to integrate R with other high-volume solutions, such as cloud computing, databases, and Apache Spark.
High-volume data is a valuable part of your research, but only if you can use the tools to manipulate and analyze the large datasets. So let's get started with R Programming in Data Science: High-Volume Data.
- Accessing memory and processing power
- Visualizing high-volume data
- Profiling and optimizing R code
- Compiling R functions
- Parallel processing with R
- Using R with other big data solutions