Join Keith McCormick for an in-depth discussion in this video Master documentation, part of The Essential Elements of Predictive Analytics and Data Mining.
- [Instructor] It's terribly important that you produce good documentation. If the data miner is the only one who understands it, it's not data mining. A great best practice to establish is to write a report at the end of each phase of the Cross-Industry Standard Process for Data Mining, or CRISP-DM. We'll be talking about CRISP-DM in the next chapter. It's a six-phase process for completing these kinds of projects. No one will admit to enjoying this part of the process, but it can save you a lot of grief if agreements are put in writing and not simply left in verbal form.
One could even argue, without being too philosophical, that agreements don't really exist until they're put in writing. In the later stages, the importance isn't restricted to ensuring that everyone is on the same page. The modeling report and the deployment report are absolutely critical forms of knowledge transfer from the data miner to the organization as a whole. What kinds of things do you have to document? Well, which variables were used in the final model, as opposed to which ones were considered but not used? Where exactly is the data coming from? Where are the predictive scores being sent? Also, what kind of training will be necessary for end users? It could be sales reps learning how to interpret a new value that's automatically populating their sales management software.
Or it could be helping nurses interpret a risk score as they go through a checklist when patients are about to leave the hospital. These documents should be understandable even if their author moves to a different department, or even a different organization. I've actually recorded videos for clients during the final phases of a project to supplement the report. Holding little workshops is common. It's actually not a bad idea. It's terribly important to do this documentation.
Otherwise, deployment is going to fail.
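The documentation questions raised above (variables used versus considered, data sources, score destinations, end-user training) could be captured in a small structured record that lives alongside the written report. The following is only a minimal sketch under stated assumptions: the `DeploymentRecord` class, its field names, and all the sample values are hypothetical illustrations, not part of CRISP-DM or of any particular tool.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class DeploymentRecord:
    """Hypothetical record of the items a deployment report should cover."""
    variables_used: list[str]
    variables_considered_not_used: list[str]
    data_sources: list[str]
    score_destinations: list[str]
    end_user_training: list[str] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        """Return the names of sections that have not been filled in yet."""
        return [name for name, value in asdict(self).items() if not value]


# Illustrative values only; a real project would supply its own.
record = DeploymentRecord(
    variables_used=["tenure_months", "recent_purchases"],
    variables_considered_not_used=["zip_code"],
    data_sources=["crm_export"],
    score_destinations=["sales management software"],
)

# Serialize the record so it can be stored next to the written report.
print(json.dumps(asdict(record), indent=2))
print("Still to document:", record.missing_sections())
```

A check like `missing_sections()` is one possible way to notice an empty section, such as end-user training, before the report is handed over. It supplements the written report and does not replace it.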