From the course: Python Data Science Mistakes to Avoid

Using redundant features

- [Instructor] Another mistake to avoid in machine learning is using redundant features, which can hurt your model's performance. Redundant features can lead to overfitting, unclear feature importance, and increased computation time. For example, let's say that my goal is to predict whether a student has instructor one or instructor two. Say that I have access to a dataset that contains, for each student in a set of students, the SID (which stands for student ID), calculus 1A grade, calculus 1B grade, trigonometry grade, algebra grade, geometry grade, and instructor, which is either one or two. This will serve as the training data. Then, say I pick calculus 1A grade, calculus 1B grade, trigonometry grade, algebra grade, and geometry grade to be the features, and I build a model accordingly. The model will use a student's Calc1A, Calc1B, trig, algebra, and geometry grades to predict whether they have instructor one or…
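
One common way to look for redundant features in a setup like this is to check how strongly the candidate features correlate with each other. The sketch below is only an illustration: the data is synthetic, the column names mirror the grades mentioned above, and the close link between the Calc1A and Calc1B columns is built in by hand as an assumption (the transcript does not say which features turn out to be redundant). The helper `redundant_pairs` and its 0.9 threshold are hypothetical choices, not something taken from the course.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200  # number of synthetic students

# Synthetic grades (hypothetical data, not from the course).
calc1a = rng.normal(75, 10, n)
grades = pd.DataFrame({
    "Calc1A": calc1a,
    # Assumption for illustration: Calc1B tracks Calc1A closely,
    # so this pair carries almost the same information.
    "Calc1B": calc1a + rng.normal(0, 2, n),
    "Trig": rng.normal(70, 12, n),
    "Algebra": rng.normal(80, 8, n),
    "Geometry": rng.normal(72, 9, n),
})


def redundant_pairs(df, threshold=0.9):
    """Return (feature_a, feature_b, |r|) for every pair of columns whose
    absolute Pearson correlation exceeds the threshold."""
    corr = df.corr().abs()
    cols = corr.columns
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            if corr.iloc[i, j] > threshold:
                pairs.append((cols[i], cols[j], float(corr.iloc[i, j])))
    return pairs


pairs = redundant_pairs(grades)
for a, b, r in pairs:
    print(f"{a} and {b} are highly correlated (|r| = {r:.2f})")
```

If a pair is flagged, one option is to keep only one of the two features and retrain. A pairwise correlation check only catches linear, two-feature redundancy, so it is a starting point rather than a complete test.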
