Learn about the difference between traditional programming and machine learning.
- [Narrator] As a computer programmer, your job is to write the rules that tell a computer exactly how to solve a specific problem. Machine learning is a different approach. Machine learning is where the computer itself learns the rules to solve a problem without being explicitly programmed. Let's start with an example we are all familiar with, junk email. Imagine you are writing a program to filter out junk email from your inbox using traditional programming. First, you'd have to write a complicated program that contains all the rules to decide if a particular email message is junk or a real message. For example, the program might look for certain keywords that you think would only appear in junk email, or you might have the program check if the sender of the email was someone you have emailed before.
Next, you try out the program by feeding in some test emails. Finally, you check the results of the program to see if it correctly separated real emails from junk emails. In this process, the hardest part is figuring out which rules help identify an email as junk or real. It will take a lot of trial and error to come up with the right rules that accurately identify junk email without any false positives. Even worse, when the spammers change their tactics and start sending junk emails that are designed to get around your rules, you have to go in and update your program again to catch them. It's going to take a lot of ongoing maintenance to make this work.
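To make the traditional approach concrete, here is a minimal sketch of such a hand-written filter. The keyword list, the sender addresses, and the `is_junk` function are all hypothetical and chosen only for illustration. The last test email shows how a spammer who changes the spelling slips past the rules.

```python
# Hypothetical hand-written rules; keywords and addresses are made up for illustration.
SPAM_KEYWORDS = {"winner", "free", "prize"}
KNOWN_SENDERS = {"alice@example.com", "bob@example.com"}

def is_junk(sender: str, body: str) -> bool:
    """Return True if the hand-written rules flag the email as junk."""
    if sender.lower() in KNOWN_SENDERS:
        return False  # Rule 1: trust senders we have emailed before.
    words = set(body.lower().split())
    return bool(words & SPAM_KEYWORDS)  # Rule 2: any junk keyword flags the email.

# Test emails as (sender, body, expected_is_junk).
test_emails = [
    ("alice@example.com", "Lunch on Friday?", False),
    ("promo@example.net", "You are a winner claim your free prize", True),
    ("promo@example.net", "You are a w1nner claim your fr3e pr1ze", True),
]

results = [is_junk(sender, body) for sender, body, _ in test_emails]

# Junk emails that the rules failed to catch.
missed = [
    body
    for (_, body, expected), got in zip(test_emails, results)
    if expected and not got
]
print(results)
print(missed)
```

Catching the misspelled email means editing `SPAM_KEYWORDS` by hand, which is the ongoing maintenance described above.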
It would be much nicer if the computer could come up with its own logic for filtering emails, and that's what we can do with machine learning. Here's how the machine learning solution would work. First, we'd gather thousands of emails and sort them into two groups. One group is the emails that we know are real. The other group is known spam emails. Next, we feed these emails into a machine learning algorithm. The machine learning algorithm is an off-the-shelf system. We don't have to write any custom code to make it work. The machine learning algorithm will look at the two groups of emails and create its own rules for how to tell them apart. This process is called training.
We are giving the machine learning algorithm the input data (the original emails) and the expected output (whether each email should be classified as real or spam), and it creates its own rules for how to re-create the output from the input data. The more data it sees during training, the better chance it has of learning how to do this accurately. Once the model is trained, we can use it to sort emails that it has never seen before. When we show it an unknown email, it will apply the rules it learned during training to classify the email as real or spam. With machine learning, we didn't have to do the hard part ourselves. We didn't have to write any email filtering rules.
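The training and prediction steps can be sketched in a few lines of plain Python. The narration does not name a specific algorithm, so this sketch assumes a small Naive Bayes word-count classifier as a stand-in. The `train` and `predict` functions and the four example emails are hypothetical, and a real system would use thousands of emails and an existing library rather than this toy version.

```python
import math
from collections import Counter

def train(emails, labels):
    """Learn per-label word counts from labeled example emails."""
    word_counts = {label: Counter() for label in set(labels)}
    label_counts = Counter(labels)
    for text, label in zip(emails, labels):
        word_counts[label].update(text.lower().split())
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary

def predict(model, text):
    """Return the label with the highest log-probability score for the text."""
    word_counts, label_counts, vocabulary = model
    total = sum(label_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        score = math.log(label_counts[label] / total)
        denominator = sum(counts.values()) + len(vocabulary)
        for word in text.lower().split():
            if word in vocabulary:  # Words never seen in training are skipped.
                # Add-one smoothing avoids log(0) for words unseen under this label.
                score += math.log((counts[word] + 1) / denominator)
        scores[label] = score
    return max(scores, key=scores.get)

# Input data (emails) and expected output (labels); made up for illustration.
emails = [
    "win a free prize now",
    "free money claim your prize",
    "meeting moved to friday",
    "lunch on friday with the team",
]
labels = ["spam", "spam", "real", "real"]

model = train(emails, labels)
first = predict(model, "claim your free prize today")
second = predict(model, "team meeting on friday")
print(first, second)
```

No filtering rule appears anywhere in the code. The only things that distinguish spam from real email are the word counts the model collected from the labeled examples.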
The computer came up with those on its own based on the training data it saw. The really cool part about machine learning is that the same algorithm that we used to classify email can be used to solve lots of other kinds of problems just by changing the data we feed into it. We don't have to change a single line of code. For example, instead of feeding in emails and marking them as spam or not spam, we could just as easily feed it pictures of handwritten numbers. The algorithm could decide which number each picture represents, whether it's a zero, a one, or, in this case, an eight. The same algorithm that does email filtering can be used to do handwriting recognition.
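As a rough illustration of reusing one algorithm on different data, the sketch below runs the same two functions on email features and then on tiny pictures of digits. It assumes a nearest-centroid classifier as a simple stand-in for the off-the-shelf algorithm, and it assumes the inputs have already been turned into lists of numbers. The email features (junk keyword count, link count) and the 3x3 pixel grids are made up for illustration.

```python
from collections import defaultdict

def train(examples, labels):
    """Average the feature vectors of each label into one centroid per label."""
    grouped = defaultdict(list)
    for features, label in zip(examples, labels):
        grouped[label].append(features)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }

def predict(model, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(model, key=lambda label: squared_distance(model[label]))

# Data set 1: emails described by [junk keyword count, link count].
email_model = train(
    [[3, 4], [4, 5], [0, 1], [0, 0]],
    ["spam", "spam", "real", "real"],
)
email_guess = predict(email_model, [2, 3])

# Data set 2: 3x3 pictures of handwritten digits, flattened row by row (1 = ink).
digit_model = train(
    [
        [1, 1, 1, 1, 0, 1, 1, 1, 1],  # a zero
        [0, 1, 1, 1, 0, 1, 1, 1, 1],  # a zero with a missing corner
        [0, 1, 0, 0, 1, 0, 0, 1, 0],  # a one
        [0, 1, 0, 0, 1, 0, 1, 1, 1],  # a one with a base stroke
    ],
    [0, 0, 1, 1],
)
digit_guess = predict(digit_model, [1, 1, 1, 1, 0, 1, 1, 1, 0])
print(email_guess, digit_guess)
```

The `train` and `predict` functions are identical for both problems. Only the examples and labels passed in are different.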
With traditional programming, you give the computer exact instructions on how to solve a problem. The computer can only do exactly what it has been previously programmed to do. With machine learning, it's different. The computer learns how to do new things without you having to explicitly program it. Instead, you show the computer data, and the computer learns from the data how to approximate functions that you would have had to program in by hand. Machine learning is a great solution for many complex real-world problems that are hard to solve with traditional programming.
- Setting up the development environment
- Building a simple home value estimator
- Finding the best weights automatically
- Working with large data sets efficiently
- Training a supervised machine learning model
- Exploring a home value data set
- Deciding how much data is needed
- Preparing the features
- Training the value estimator
- Measuring accuracy with mean absolute error
- Improving a system
- Using the machine learning model to make predictions