From the course: Spark for Machine Learning & AI
Components of Spark MLlib - Apache Spark Tutorial
Components of Spark MLlib
- [Instructor] The MLlib package has three types of functions. The first is machine learning algorithms. The set currently includes algorithms for classification, which is for categorizing something, such as whether a customer is likely to leave for a competitor. Regression is used for predicting a numeric value, like a home price. Clustering is used to group similar items together. Unlike classification, there are no predefined groups, so this is really useful when exploring data. And finally, there's topic modeling, which is a way to identify themes in text. The second group is workflows. Workflow components help organize commonly used steps, like pre-processing operations and tuning. This makes it easy to run a sequence of steps repeatedly while varying some parameters of the process. The third group is utilities. Utilities are lower-level functions that give you access to distributed linear algebra and statistics functions. In this essentials course, we'll concentrate our efforts on working with…
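To make the workflow idea concrete, here is a minimal sketch in plain Python of running the same sequence of steps repeatedly while varying some parameters. It deliberately does not use Spark: every function name, the toy data, and the parameter values are hypothetical and chosen only for illustration. In MLlib itself, this role is played by components such as `Pipeline` and `ParamGridBuilder` in the `pyspark.ml` package.

```python
# Plain-Python illustration of the "workflow" idea. None of these names are
# MLlib APIs; they are hypothetical stand-ins for pipeline stages.
from itertools import product


def scale(rows, factor):
    """Pre-processing step: multiply every feature value by `factor`."""
    return [([x * factor for x in features], label) for features, label in rows]


def fit_threshold(threshold):
    """Stand-in for training: build a classifier that predicts 1 when the
    sum of the features exceeds `threshold`, and 0 otherwise."""
    return lambda features: 1 if sum(features) > threshold else 0


def accuracy(model, rows):
    """Evaluation step: fraction of rows whose label the model predicts."""
    correct = sum(1 for features, label in rows if model(features) == label)
    return correct / len(rows)


def run_workflow(rows, factor, threshold):
    """Run the fixed sequence of steps once for one parameter setting."""
    prepared = scale(rows, factor)
    model = fit_threshold(threshold)
    return accuracy(model, prepared)


# Toy data (made up): (features, label), where label 1 could mean
# "customer likely to leave".
data = [([1.0, 2.0], 0), ([2.0, 1.0], 0), ([4.0, 5.0], 1), ([6.0, 3.0], 1)]

# Parameter grid: the steps stay the same, only these values vary.
grid = {"factor": [0.5, 1.0], "threshold": [2.0, 5.0]}

results = {}
for factor, threshold in product(grid["factor"], grid["threshold"]):
    results[(factor, threshold)] = run_workflow(data, factor, threshold)

for params, score in results.items():
    print(params, score)
```

The point of the design is that the steps are defined once and the loop over the grid supplies the varying parameters, which is the convenience the workflow components are described as providing.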