Join Robin Hunt for an in-depth discussion in this video, Common troubleshooting methods, part of Learning Data Analytics.
- No one likes to start blind, with no way to know if their data is right, and if you do, you actually scare me a little bit. I wanted to put together some troubleshooting techniques that we use. Always verify your work against the source data, and create a method to do that. Those methods may vary depending on what work you do or what data you have available to you, but there's always something that you can do to verify your work. Data people also find themselves writing queries on the data to make sure that it's clean and correct.
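One way to picture verifying against the source data is a simple reconciliation: total a column in the source rows, total the same column in your summarized result, and confirm the two agree. The sketch below is only an illustration in Python; the `region` and `amount` fields, the sample rows, and both function names are made up for the example and are not from the course.

```python
from collections import defaultdict


def summarize_by_region(rows):
    """Add up the amount for each region (the 'work' being verified)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


def reconcile(source_rows, summary, tolerance=1e-9):
    """Return True when the summary total matches the source total."""
    source_total = sum(row["amount"] for row in source_rows)
    summary_total = sum(summary.values())
    return abs(source_total - summary_total) <= tolerance


# Hypothetical source data, invented for this illustration.
source_rows = [
    {"region": "East", "amount": 120.0},
    {"region": "West", "amount": 80.0},
    {"region": "East", "amount": 50.0},
]

summary = summarize_by_region(source_rows)
totals_match = reconcile(source_rows, summary)
```

If `totals_match` comes back false, something was dropped or double counted between the source and the summary, and that is the place to start looking.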
I can't tell you how many queries you'll need to write, but your data will tell you. Before you start, know what conditions can affect your mathematical calculations, and be sure you check that each condition delivered the expected outcome. Don't trust that everything is right just because numbers showed up. Look at last month's reports, or the existing reports that were used before you came into the role as the analyst. They can be a wealth of information.
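As a rough sketch of checking each condition, suppose the calculation is a percent change. Two conditions that can affect it are a zero baseline and a missing value, so each one gets its own check alongside the normal cases. The `percent_change` function and the choice to return `None` for those cases are assumptions made for this example, not a rule from the course.

```python
def percent_change(previous, current):
    """Return the percent change, or None when it cannot be computed."""
    if previous is None or current is None:
        return None  # missing value: nothing to compare
    if previous == 0:
        return None  # zero baseline: division is undefined
    return (current - previous) / previous * 100


# One named check per condition that can affect the calculation.
condition_checks = {
    "normal increase": percent_change(100, 125) == 25.0,
    "normal decrease": percent_change(200, 150) == -25.0,
    "zero baseline": percent_change(0, 50) is None,
    "missing value": percent_change(None, 50) is None,
}

# Any name that lands in this list is a condition that did not
# deliver the expected outcome.
failed = [name for name, passed in condition_checks.items() if not passed]
```

Naming each check makes it obvious which condition broke, instead of just seeing that a number looks off.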
Use smaller subsets of data to test your calculations. I know that if I'm pulling test scores for 500,000 children in various grades and applying a level of proficiency to them, I want to make sure it works on 1,000 kids first. If you have the great fortune of having access to the DBAs who built the system, or the experts in the system you work with, you can ask them for independent verification and show them how you tied the data together, to spot-check and make sure you have no issues.
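Here is a minimal sketch of the subset idea in Python. The cut scores, the level names, and the generated scores are all hypothetical, and the full dataset is scaled down to 50,000 records so the example runs quickly. The point is only the shape of the approach: pull a sample of 1,000, apply the calculation, and check the results before running it on everything.

```python
import random


def proficiency_level(score):
    """Map a score to a level using hypothetical cut scores."""
    if score >= 80:
        return "advanced"
    if score >= 60:
        return "proficient"
    return "below"


# Stand-in for the full dataset; a fixed seed keeps the run repeatable.
rng = random.Random(42)
all_scores = [
    {"student_id": i, "score": rng.randint(0, 100)} for i in range(50_000)
]

# Test the calculation on a small random subset first.
sample = rng.sample(all_scores, 1_000)
sample_levels = [
    {**record, "level": proficiency_level(record["score"])}
    for record in sample
]

# Spot checks: every record received a known level, and the levels
# agree with the cut scores.
valid_levels = {"advanced", "proficient", "below"}
all_valid = all(r["level"] in valid_levels for r in sample_levels)
advanced_ok = all(
    r["score"] >= 80 for r in sample_levels if r["level"] == "advanced"
)
```

Checking the boundary scores directly (79, 80, 59, 60) is also worth doing, because a random sample may not happen to include them.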
As this course is tool-agnostic, trust me when I tell you there are plenty of troubleshooting methods that you can use with each tool that you work with. This just gives us a common place to start.