From the course: Python: Working with Predictive Analytics

Road map - Python Tutorial

Road map

- [Instructor] Throughout this course, I use the term prediction. It usually refers to an informed guess or opinion. To make good predictions, we will follow a roadmap. This roadmap is based on the Cross-Industry Standard Process for Data Mining, or CRISP-DM for short, which is a widely used method for planning data science projects. In our roadmap there are six stages, and I'll briefly describe what happens in each stage here. Then, as you work through the course, you can see the stages in greater detail.

The first stage is business understanding. It's really important to understand the problem you are trying to solve and the advantages that solving it will bring to the table. A car without a destination cannot do much, even if it's a Mercedes, BMW, or Tesla. So first things first: we need a destination, a goal. For example, what will be the number of sales of a product in the next quarter? Or what's the average number of days a used car stays on the market in Oregon? Before working on a project, define your goal.

Next, data understanding. You must be familiar with your data before you can work with it. If you do not have enough data for this problem, making the right plan to collect the right data is the key here. After that, understanding the types of data is key, so you know how to process it for your predictions.

Then data preparation. In this stage, it's important to make sure the data can be processed by the prediction models. You might prepare data by handling missing values, processing the outliers, and applying normalization or standardization techniques. Finally, you need to turn categorical data into numerical data, since most models only accept numerical data.

Next is the modeling stage. This is when all your hard work can be put to work by making the predictions. You divide the data into a training set and a test set. Use the training data to train the model, and then use the test data to evaluate the model's success score.
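The data preparation and train/test split steps described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the course's own code: the toy used-car records, the column names (`mileage`, `fuel`, `days_on_market`), and the helper functions are all hypothetical. Libraries such as pandas and scikit-learn offer ready-made versions of these steps.

```python
import random
import statistics

# Hypothetical toy records for a used-car example (all values are made up).
rows = [
    {"mileage": 42000.0, "fuel": "gas", "days_on_market": 30},
    {"mileage": None, "fuel": "diesel", "days_on_market": 45},
    {"mileage": 88000.0, "fuel": "gas", "days_on_market": 60},
    {"mileage": 15000.0, "fuel": "electric", "days_on_market": 12},
    {"mileage": 61000.0, "fuel": "diesel", "days_on_market": 38},
]


def fill_missing(rows, key):
    """Handle missing values: replace None with the mean of the observed values."""
    observed = [r[key] for r in rows if r[key] is not None]
    mean = statistics.mean(observed)
    return [{**r, key: mean if r[key] is None else r[key]} for r in rows]


def standardize(rows, key):
    """Standardization: rescale a numeric column to mean 0 and unit variance."""
    values = [r[key] for r in rows]
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [{**r, key: (r[key] - mean) / std} for r in rows]


def one_hot(rows, key):
    """Turn a categorical column into 0/1 indicator columns."""
    categories = sorted({r[key] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != key}
        for category in categories:
            new_row[f"{key}_{category}"] = 1 if r[key] == category else 0
        encoded.append(new_row)
    return encoded


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and hold out a fraction as the test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


prepared = one_hot(standardize(fill_missing(rows, "mileage"), "mileage"), "fuel")
train, test = train_test_split(prepared, test_fraction=0.2, seed=0)
print(len(train), "training rows,", len(test), "test rows")
```

To keep the sketch short, the standardization statistics are computed on all rows before the split. In a real project it is generally safer to compute them from the training rows only, so that no information from the test set leaks into training.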
Evaluation is the next step. Here you use the test data to check whether the model does a good job. There are different evaluation methods you can use to measure the success of the model. Sometimes evaluation shows you that your model isn't successful. In this case, you would move back to the modeling stage. Many times in this process you may find yourself moving back and forth between the modeling and evaluation stages.

Finally, the last stage is deployment. All your hard work can be deployed because you found a successful model to solve your problem.

Often this map is not a straight, one-line trail. We will have a lot of loops back and forth, especially in the evaluation and modeling parts, where we will have to go back and improve the prediction in order to achieve a higher evaluation score. Keep in mind that evaluation metrics are helpful, but at the end of the day there is also a human aspect, the opinion of a subject matter expert, which needs to be taken into account before making any major final assessment. So if you are ready, let's start our journey on this prediction trail.
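The loop between modeling and evaluation can be illustrated with a small sketch. Everything here is an assumption made for illustration: the toy data (car age in years against days on the market), the two candidate models, and the choice of mean absolute error and root mean squared error as evaluation metrics. Each candidate is trained on the training data and scored on the test data, and the one with the lowest error is kept.

```python
import math
import statistics

# Hypothetical data: (car age in years, days on the market). Values are made up.
train = [(1, 12), (2, 18), (3, 25), (5, 41), (6, 44), (8, 62)]
test = [(4, 33), (7, 52)]


def fit_mean(train):
    """Baseline model: always predict the mean target seen in training."""
    mean_y = statistics.mean(y for _, y in train)
    return lambda x: mean_y


def fit_line(train):
    """One-feature least-squares line: y = intercept + slope * x."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in train) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def mae(model, data):
    """Mean absolute error of the model's predictions on the given data."""
    return statistics.mean(abs(y - model(x)) for x, y in data)


def rmse(model, data):
    """Root mean squared error of the model's predictions on the given data."""
    return math.sqrt(statistics.mean((y - model(x)) ** 2 for x, y in data))


# Modeling: train every candidate on the training data only.
candidates = {"mean baseline": fit_mean, "least-squares line": fit_line}
models = {name: fit(train) for name, fit in candidates.items()}

# Evaluation: score every trained model on the held-out test data.
scores = {name: mae(model, test) for name, model in models.items()}
best = min(scores, key=scores.get)

for name, model in models.items():
    print(f"{name}: MAE={scores[name]:.2f}, RMSE={rmse(model, test):.2f}")
print("Lowest test error:", best)
```

If none of the candidates scored well enough, the next step would be to return to modeling with a different model or different features and evaluate again. As the transcript notes, the final decision should also weigh the opinion of a subject matter expert, not the score alone.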
