Multilayer perceptrons are a form of neural network. In this video, learn how to implement a multilayer perceptron for classification.
- [Instructor] Now we're going to work with a multilayer perceptron, which is a type of neural network. We're going to start where we left off in the previous video. If you recall, we have a DataFrame with our iris flower dataset. I called it iviris_df, and that name refers to the indexed and vectorized version of the iris data. So we have that DataFrame; let's take a look at the first row, just to refamiliarize ourselves with the data structure.
Okay, so our rows consist of four measures: sepal length, sepal width, petal length, and petal width. There's an indication of the species, which is a string, and then we also have our features vector, which takes those four measurements and puts them into a single vector, and a label that maps the string species name to a number: zero, one, or two. The other thing we had created was training data and test data. So we took our iviris_df DataFrame and split it into a training DataFrame, which had 92 rows in it, and a test DataFrame, which had the remainder of the iris data in it.
The combination of the train and the test totals 150 rows, which is what we would expect, because the iviris_df DataFrame also has 150 rows. Okay, so that's where we're starting. I'm just going to clear the screen, because now it's time to import some code that we'll need to support multilayer perceptrons. Now, the way a multilayer perceptron classifier works is that we have, as the name implies, multiple layers of neurons.
In all cases, the first layer has the same number of nodes as there are inputs. For us, there are four measurements, so our first layer will have four nodes. So I'm going to create a list of layers and set the first element to four. The last element should have the same number of neurons as there are types of outputs. In our case there are three iris species, so our last element will be three.
Now we want to have layers in between, and those hidden layers will help the multilayer perceptron learn how to classify correctly. So I'm going to insert two layers of five neurons each. That gives us a four-layer multilayer perceptron: the first layer will have four neurons, the middle two layers will have five neurons each, and the output layer will have three neurons, one for each kind of iris species.
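The layer sizes described above can be written as a plain Python list. Only the two hidden-layer sizes are a design choice; the input and output sizes are fixed by the data:

```python
# Input layer: one node per feature (the four iris measurements).
# Hidden layers: two layers of five neurons each (a tunable choice).
# Output layer: one node per class (the three iris species).
layers = [4, 5, 5, 3]
```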
Now I'm going to create a multilayer perceptron, so I'll just clear the screen for that. I'll create an object called mlp, which will be our instance of the multilayer perceptron classifier. When we create it, we want to indicate the layers, and we'll just use the layers list we just created. The multilayer perceptron also uses a random number generator, so I'm going to set the seed for that, and I'll set it to one.
And now I have this multilayer perceptron, and I want to build, or fit, a model around it. So I'll create a model object called mlp_model by taking the multilayer perceptron object I just created and asking it to fit my training data. So what I've done is created a multilayer perceptron and built a model using my training data. Now I want to make some predictions using the test data.
So I'll create some multilayer perceptron predictions by calling our mlp_model and applying transform to my test data, test_df. Right, so now I've created the model and made some predictions. Let's go to the next step, which is to evaluate those predictions. The first thing I want to do is create an evaluator.
I'll call it mlp_evaluator. This is also a multiclass classification evaluator, just like we used with Naive Bayes. The metric we're going to measure is accuracy. Now let's calculate the accuracy, and we'll create an object to hold the answer.
So we'll use the mlp_evaluator we just created, invoke the evaluate function, and evaluate the predictions we made. And if we take a look at mlp_accuracy, we'll see that our accuracy is quite a bit higher than Naive Bayes's. Here we're at about 93%, so that's a significant increase in quality over what we had with Naive Bayes.