From the course: Data Science Tools of the Trade: First Steps


Weka and Spark


- Although running WEKA in stand-alone mode is perfectly fine for many of our daily data science tasks, we do need to tap into the power of distributed processing from time to time, especially when our dataset falls in the realm of big data. Since we already have well-known tools like Spark available for distributed processing, it would be ideal if WEKA could leverage that technology, and WEKA does provide a way to harness the power of Spark.

Let's see how we can go about configuring WEKA to take advantage of Spark. From the GUI Chooser, go to Tools and select Package manager. In the package search window, type Spark and press Enter. Choose Distributed WEKA Spark. Click Install and click Yes. Click OK. Click Yes. Once the Spark package is installed, we need to restart WEKA. Close the package manager, then close the WEKA GUI Chooser. Let's restart WEKA. Click OK.

To allow you to run a distributed WEKA job in a user-friendly way, WEKA provides an option called KnowledgeFlow. Click on…
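As an aside to the GUI steps above, the same package install can usually be scripted through WEKA's command-line package manager. The following is a minimal sketch, not something shown in the video: it only builds the command, and it assumes that `java` is on the PATH, that `weka.jar` is the path to your WEKA jar, and that the package is registered under the name `distributedWekaSpark`. The helper name `build_install_command` is hypothetical.

```python
import shlex


def build_install_command(weka_jar, package_name="distributedWekaSpark"):
    """Build (but do not run) a WEKA package-manager install command.

    Assumptions: `weka_jar` points at a local weka.jar, and `package_name`
    matches the name shown in the GUI package manager.
    """
    return [
        "java",
        "-cp", weka_jar,
        "weka.core.WekaPackageManager",
        "-install-package", package_name,
    ]


if __name__ == "__main__":
    cmd = build_install_command("weka.jar")
    # Print the command so it can be reviewed before running it by hand
    # or passing the list to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

As with the GUI route, WEKA would need to be restarted after the install before the new package is available.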
