A common mistake in machine learning is choosing features that will not be available in the future. In this video, learn how to avoid this mistake by choosing features that will be available during testing and when your model is run on unseen data.
- [Tutor] A common mistake in machine learning is choosing features that will not be available in the future. It is important to choose features that will be available during testing and when your model is run on unseen data. For example, let's say that my goal is to predict whether a student has instructor one or instructor two. Say that I have access to a dataset that contains, for each student in a set of students, the SID (which stands for Student ID), math grade, science grade, history grade, and instructor, which is either one or two. This will serve as the training data. Now say I pick math grade, science grade, and history grade to be the features, and I build a model accordingly. The model will use a student's math, science, and history grades to predict whether they have instructor one or instructor two. Next, say I encounter this testing data. This data does not include students' history grades. It could either be that some students …
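One way to catch this mistake early is to compare the candidate features against the columns that the testing data actually provides before building the model. The following is a minimal sketch of that check, not the instructor's own code. The column names, the helper `split_features_by_availability`, and the assumption that the testing data still contains the SID, math grade, and science grade are all hypothetical and chosen only to mirror the example above.

```python
# Hypothetical column names mirroring the example in the transcript.
# Assumption: the testing data has SID, math grade, and science grade,
# but no history grade.
TRAIN_COLUMNS = ["sid", "math_grade", "science_grade", "history_grade", "instructor"]
TEST_COLUMNS = ["sid", "math_grade", "science_grade"]

# Features we would like to use to predict the instructor.
CANDIDATE_FEATURES = ["math_grade", "science_grade", "history_grade"]


def split_features_by_availability(candidates, test_columns):
    """Return (usable, missing).

    usable:  candidate features that also appear in the testing data
    missing: candidate features that the testing data does not provide
    """
    available = set(test_columns)
    usable = [name for name in candidates if name in available]
    missing = [name for name in candidates if name not in available]
    return usable, missing


usable, missing = split_features_by_availability(CANDIDATE_FEATURES, TEST_COLUMNS)
print("Features safe to train on:", usable)
print("Features unavailable at test time:", missing)
```

Under these assumptions, the check flags history grade as unavailable, so the model would be trained on math grade and science grade only.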
Skill Level Intermediate
1. Avoid Mistakes in Coding Practices
2. Avoid Mistakes in Structuring Code
3. Avoid Mistakes in Handling Data
4. Avoid Mistakes in Machine Learning