From the course: Python Data Science Mistakes to Avoid
Using redundant features
- [Instructor] Another mistake to avoid in machine learning is using redundant features, which can hurt your model's performance. Redundant features can lead to overfitting, make feature importance harder to interpret, and increase computation time. For example, let's say that my goal is to predict whether a student has instructor one or instructor two. Say that I have access to a dataset that contains, for each student in a set of students, the SID (which stands for student ID), calculus 1A grade, calculus 1B grade, trigonometry grade, algebra grade, geometry grade, and instructor, which is either one or two. This will serve as the training data. Then, say I pick calculus 1A grade, calculus 1B grade, trigonometry grade, algebra grade, and geometry grade to be the features, and I build a model accordingly. The model will use a student's Calc1A, Calc1B, trig, algebra, and geometry grades to predict whether they have instructor one or…
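One common way to look for redundant features before building a model like the one described above is to check how strongly the candidate features correlate with each other. The following is a minimal sketch, not the instructor's method. The grades are synthetic, the column names (`calc_1a`, `calc_1b`, and so on), the `redundant_pairs` helper, and the 0.9 threshold are all hypothetical, and the data is deliberately generated so that Calc 1B closely tracks Calc 1A. The transcript does not say which features turn out to be redundant.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def redundant_pairs(features, threshold=0.9):
    """Return pairs of feature names whose |correlation| exceeds threshold.

    The 0.9 default is an arbitrary illustrative cutoff, not a rule.
    """
    return [
        (a, b)
        for a, b in combinations(features, 2)
        if abs(pearson(features[a], features[b])) > threshold
    ]


# Synthetic, made-up grades for 200 students (not the course's dataset).
rng = random.Random(0)
n_students = 200
calc_1a = [rng.uniform(50, 100) for _ in range(n_students)]

features = {
    "calc_1a": calc_1a,
    # Assumption for illustration only: Calc 1B is Calc 1A plus small noise.
    "calc_1b": [grade + rng.gauss(0, 2) for grade in calc_1a],
    # The remaining grades are generated independently of each other.
    "trig": [rng.uniform(50, 100) for _ in range(n_students)],
    "algebra": [rng.uniform(50, 100) for _ in range(n_students)],
    "geometry": [rng.uniform(50, 100) for _ in range(n_students)],
}

# Print every feature pair that looks redundant under this threshold.
for name_a, name_b in redundant_pairs(features):
    r = pearson(features[name_a], features[name_b])
    print(f"{name_a} and {name_b} are highly correlated (r = {r:.2f})")
```

With this synthetic data, only the `calc_1a` and `calc_1b` pair should be flagged, which suggests that one of the two could be dropped from the feature set. On real data, a high pairwise correlation is only a hint. Whether a feature is truly redundant depends on the model and the prediction task.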