In this video, learn what a multilayer perceptron is at a conceptual level and explore what is going on under the hood.
- [Instructor] In this first lesson in the multilayer perceptron chapter, we're going to learn a little bit about what a multilayer perceptron is. The definitions in this section are going to be a little bit vague, but we're going to jump into a visual representation, and hopefully as we walk through that, it'll become a bit more clear. So a multilayer perceptron is a classic feed-forward artificial neural network, and it's at the core of some deep learning algorithms. Not all algorithms in deep learning use a feed-forward artificial neural network, but many do. Another way to look at this is that a multilayer perceptron is a connected series of nodes, where each node represents a function. Those connections form a directed acyclic graph, meaning that there is directionality between the nodes and no node will ever be revisited. Okay, enough terminology, let's look at an actual visual representation. So this all starts with an input layer. Let's use our Titanic data set example. There are four nodes here, and each node would represent a feature that we want to use to predict whether somebody survives or not. So these could be age, ticket class, cabin, and sex. Next there's a hidden layer that we're representing with five nodes here, but this hidden layer could contain 10 nodes, a hundred nodes, or even a thousand nodes. Now each of these nodes represents some function. So what happens is, we take each of our input nodes, representing features, and we pipe those into each node in the hidden layer. So what's happening here is each node, or function, in the hidden layer is getting each input feature. So age, ticket class, cabin, and sex are passed into each node in the hidden layer. So you could view each of these nodes as something like logistic regression. In other words, more or less a logistic regression model would be fit for each node, each with slightly different weights. Then we have an output layer.
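To make the "each hidden node is something like logistic regression" idea concrete, here's a minimal sketch of one hidden node: it takes all four input features, computes a weighted sum plus a bias, and squashes the result with a sigmoid. The feature encodings, weights, and bias below are made-up illustrative values, not anything learned from the actual Titanic data.

```python
import numpy as np

def sigmoid(z):
    # Squashes any weighted sum into the (0, 1) range
    return 1.0 / (1.0 + np.exp(-z))

# One passenger's features: age, ticket class, cabin, sex
# (encoded as numbers; these values are hypothetical)
x = np.array([29.0, 1.0, 0.0, 1.0])

# One hidden node's weights and bias -- effectively a small
# logistic regression model (weights are made up for illustration)
w = np.array([-0.01, 0.3, 0.1, 0.8])
b = 0.05

# The node's output: sigmoid of the weighted sum of all inputs
activation = sigmoid(w @ x + b)
print(activation)
```

Every other node in the hidden layer does exactly the same computation on the same four inputs, just with its own set of weights, which is why each node can learn a slightly different aspect of the data.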
And here we're going to have two nodes: either a person survived, or they did not survive, using the example of our Titanic data set. So now we have each node in the hidden layer outputting to each node in the output layer. So let's take a step back and just isolate one node in the hidden layer to understand what's happening here. So you can see it's receiving the four inputs. Then this one node in the hidden layer will basically fit some model, or learn some aspect of the data. Then based on that model that it learns, it'll make some prediction about whether somebody survived or not. So maybe it's 70% likelihood that they survived, 30% likelihood that they did not. So that's one individual node in the hidden layer. But we could repeat this same process for each of the five nodes in the hidden layer, and the output layer then combines those five outputs into a final prediction of the likelihood of a given person surviving. As you can imagine, these can be incredibly powerful. As you see in this example, you're essentially aggregating the predictions of five different models all together. This allows the overall model to learn some really powerful relationships in the data.
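The full picture described above can be sketched as one forward pass through the 4-5-2 network: four features flow into five hidden nodes, and the output layer aggregates those five activations into survival probabilities. The weights here are random placeholders (a real network would learn them from data), so the resulting probabilities are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducible placeholder weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Converts the two output scores into probabilities that sum to 1
    e = np.exp(z - z.max())
    return e / e.sum()

# Four input features for one passenger (hypothetical encoded values)
x = np.array([29.0, 1.0, 0.0, 1.0])

# Hidden layer: 5 nodes, each receiving all 4 input features
W1 = rng.normal(size=(5, 4))
b1 = rng.normal(size=5)
hidden = sigmoid(W1 @ x + b1)  # 5 activations, one per hidden node

# Output layer: 2 nodes (survived / did not survive), each
# aggregating all 5 hidden activations
W2 = rng.normal(size=(2, 5))
b2 = rng.normal(size=2)
probs = softmax(W2 @ hidden + b2)

print(probs)  # e.g. [P(survived), P(did not survive)]
```

Notice that each output probability depends on all five hidden activations, which is the aggregation-of-five-models idea from the transcript in code form.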