Learn how to group machine learning algorithms. This video covers regression algorithms, instance-based algorithms, and many more types.
- [Instructor] Let's look at how to group machine learning algorithms. Grouping algorithms is really helpful for putting some extra context around what you're learning, so that it's easier to remember and reference later. The reason we need to understand the basic groupings of machine learning algorithms is that, honestly, there are over 100 algorithms and over 100 machine learning use cases. I mean, I think there are probably thousands of machine learning use cases. So, it's important to understand ways to group them in order to decrease the time-to-value by hastening your model selection process. See, when you understand your data and the outcome you need to get from it, you can quickly identify potential machine learning algorithms that you can use and the data pre-processing steps that will be required. This all helps to sharply decrease time-to-value. Now, there are a few ways to group machine learning algorithms, and it's worth the time to understand these ways, because they each have their own benefits. So, you could group machine learning algorithms according to learning style. Those learning styles are supervised, unsupervised, and semi-supervised. You could also group them according to function. The modules of this course are broken up according to machine learning function. Those functions would be regression, clustering, dimension reduction, association rules, deep learning, and instance-based methods, as well as decision trees, Bayesian methods, ensemble methods, and regularization models. Lastly, you could group machine learning algorithms according to use case. You can use machine learning algorithms for, like I said, over 100 use cases. Some of those include fraud detection, recommendation engines, price forecasting, inventory demand forecasting, water consumption forecasting, infrastructure demand forecasting, and so much more. But if you just think about it, imagine that you have a particular type of problem you're trying to solve.
Say you're working on a fraud detection problem. It's very beneficial for you to know what kinds of machine learning methods have been used in the past to solve this type of problem, so that you don't have to go in and reinvent the wheel. Right? It's good to start assigning machine learning methods to use cases in your mind, but also to functions and then to learning styles. Grouping by learning style is actually the most rudimentary way of grouping them. With grouping by learning style you've got supervised learning, unsupervised learning, and semi-supervised learning. In supervised learning you are going to be making predictions straight from the labeled data. In unsupervised learning you are making predictions from unlabeled data. And then, in semi-supervised learning you use labeled data and unlabeled data to make a set of predictions. We're going to be talking about all of this stuff later in the course. Let's look at how you can use grouping by learning style to actually add value when you're carrying out an analysis. So, imagine that you look at your data and you notice your data has a continuous distribution and it's labeled. A general rule of thumb is that you could use regression methods on this type of data. And then, if it was continuous but unlabeled, you may want to explore dimension reduction as a way to generate predictions from the data. On the flip side, if your variables are categorical and your data is labeled, then you may want to look at classification methods. But if your categorical data is unlabeled, then you would pull in some clustering methods. Now, please note that these are fuzzy groupings. So, it's really easy to come up with exceptions to these rules. But just understanding how to group machine learning methods by learning style is helpful in cutting down the amount of time it takes for you to pick algorithms to test and see how they perform. Let's look at supervised learning. We'll look first at regression methods.
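The rule of thumb just described (labeled or unlabeled, continuous or categorical) can be written down as a tiny lookup. This is only a minimal sketch in Python: the function name `suggest_method_family` and its two boolean parameters are assumptions made for illustration, not something from the course, and the groupings are as fuzzy in code as they are in the narration.

```python
def suggest_method_family(is_labeled: bool, is_continuous: bool) -> str:
    """Return the method family suggested by the rule of thumb.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    if is_labeled:
        # Labeled data points toward supervised learning.
        return "regression" if is_continuous else "classification"
    # Unlabeled data points toward unsupervised learning.
    return "dimension reduction" if is_continuous else "clustering"


# Print the four combinations covered by the rule of thumb.
for labeled in (True, False):
    for continuous in (True, False):
        label_text = "labeled" if labeled else "unlabeled"
        type_text = "continuous" if continuous else "categorical"
        family = suggest_method_family(labeled, continuous)
        print(f"{label_text}, {type_text}: {family}")
```

A lookup like this is a starting point for picking algorithms to test, not a substitute for checking how they perform on the data.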
Regression is a form of supervised learning that enables us to study labeled data and quantify linear relationships between at least one predictor variable X and one response Y. In contrast, you've got instance-based classification. This is where you use a classifier to learn from labeled data in order to predict categories for new data points. The class labels are discrete, unordered values. In terms of unsupervised learning we've got clustering and dimension reduction. We're going to be talking about both of these in detail later. But clustering takes unlabeled, categorical data and organizes it into clusters based on similarity. Dimension reduction takes unlabeled, continuous, high-dimensional data and compresses it into a smaller synthetic representation. So, those are unsupervised methods. Now, let's look at an example of grouping machine learning models by use case. So, let's look at k-means clustering. I'm just going to list out a number of potential use cases to which you could assign k-means clustering as a potential predictive solution. Those would be location-based crime prediction and prevention, customer segmentation for marketing, document classification, insurance fraud detection and prevention, IT operational efficiencies, oil and gas site evaluations, video analytics, and patient segmentation for predictive medicine. Now, if you were looking at a particular use case and trying to figure out, of all the machine learning methods, which ones lend themselves well to generating predictive solutions for that specific use case, you could do something like this. Say you're looking to make predictions of fraud, cases of fraud in the insurance industry, for example. You've got insurance fraud prediction as your use case. And then, of the different methods that actually apply to that, as we discussed above, k-means clustering would be one.
However, you could also use time series analysis, multi-criteria decision-making, logistic regression, support vector machines, and decision trees. So, there are many other types of machine learning models that can also be used to satisfy that use case. However, because we understand k-means clustering, we can now assign it to use cases and get a kind of quick win about methods we could potentially use for this type of requirement.
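One way to keep track of the use-case groupings described above is a small two-way index: use case to candidate methods, and method to use cases. The sketch below is a minimal Python illustration; the names `USE_CASE_TO_METHODS`, `invert`, and `METHOD_TO_USE_CASES` are hypothetical, and the entries are limited to a few of the pairings mentioned in the narration rather than a complete catalog.

```python
from collections import defaultdict

# Use cases mapped to candidate methods, taken from the examples in the narration.
USE_CASE_TO_METHODS = {
    "insurance fraud prediction": [
        "k-means clustering",
        "time series analysis",
        "multi-criteria decision-making",
        "logistic regression",
        "support vector machines",
        "decision trees",
    ],
    "customer segmentation for marketing": ["k-means clustering"],
    "document classification": ["k-means clustering"],
}


def invert(index):
    """Build the reverse lookup: method -> list of use cases."""
    inverted = defaultdict(list)
    for use_case, methods in index.items():
        for method in methods:
            inverted[method].append(use_case)
    return dict(inverted)


METHOD_TO_USE_CASES = invert(USE_CASE_TO_METHODS)

# Starting from a use case, list the candidate methods to try.
print(USE_CASE_TO_METHODS["insurance fraud prediction"])

# Starting from a method you already understand, list where it might apply.
print(METHOD_TO_USE_CASES["k-means clustering"])
```

Keeping both directions makes the quick win concrete: once you understand a method, you can look up the kinds of problems it has been assigned to, and the other way around.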
- Why use Python for data science
- Machine learning 101
- Linear regression
- Logistic regression
- Clustering models: K-means and hierarchical models
- Dimension reduction methods
- Association rules
- Ensemble methods
- Introduction to neural networks
- Decision tree models