From the course: Machine Learning and AI Foundations: Classification Modeling

Training and test partitions

- [Narrator] Okay, now we're gonna talk about a critical concept, and you're gonna want to master how to do this in your software of choice. But at a high level, the basics are very straightforward. What you're gonna do is take the historical data on which you're gonna build the model, and you must divide that data, or partition it, as we say, into training data and testing data. What we all learn to do when we first get accustomed to this is a 50% train and a 50% test. These partitions are chosen at random. What you'll notice, however, when you're watching others perform this task, is that sometimes the numbers won't be 50/50. The general rule is straightforward, and it is this: when your sample size starts to get small, you increase the percentage on the train and reduce the percentage on the test. It still has to add up to 100%, so you'll often see 70/30 or 80/20, and these are not unusual choices for training and testing partitions. Okay, now, some…
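
The random partition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool used in the course: the function name `partition`, its `train_fraction` and `seed` parameters, and the 1,000 placeholder records are all hypothetical, and only the standard library is used.

```python
import random


def partition(rows, train_fraction=0.7, seed=0):
    """Randomly split rows into (train, test) lists.

    Hypothetical helper for illustration; train_fraction is the share
    of rows assigned to training, and the remainder goes to testing.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                    # copy so the input is untouched
    random.Random(seed).shuffle(shuffled)    # seeded shuffle, repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder "historical data": 1,000 record IDs (assumed for the example).
records = list(range(1000))

# A 70/30 partition; 0.5 would give 50/50 and 0.8 would give 80/20.
train, test = partition(records, train_fraction=0.7, seed=42)
print(len(train), len(test))  # 700 300
```

Every record lands in exactly one partition, so the two percentages always add up to 100%. Fixing the seed makes the random assignment reproducible from run to run.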
