From the course: Learning Data Science: Understanding the Basics

Uncover insights and create knowledge

- Over the last 20 years, most organizations have focused on increasing their operational efficiency. They've worked hard to streamline their business processes. They wanted to be leaner and more flexible. They asked operational questions like, how can we work faster and smarter? Data science is different. It doesn't focus on achievable objectives. It's exploratory and uses the scientific method. It's about gaining useful business knowledge. With data science, you ask different types of questions. What do we know about our customer? How can we deliver a better product? What are we doing better than our competitors? What would happen if we left the market? These are all questions that require a higher level of organizational thinking.
Most organizations aren't ready to ask these types of questions. They're filled with employees who are focused on giving answers and setting milestones. They haven't been rewarded for being skeptical or exploratory. Think about the last time you were in a business meeting. Now imagine someone in the room asks a few questions. Why are we doing it this way? What makes you think that this will work? Why is this a good idea? Chances are the person asking would be seen as annoying. Usually someone will bark back, didn't you read the memo? Yet these are the skills you need to build organizational knowledge. These are the questions you want from your data science team. Still, most people in organizations are focused on getting things done. It will be a change to explore new possibilities. Questions are often seen as a barrier to moving forward. As an organization, you'll only gain knowledge by asking interesting questions.
I once worked for a website that connected potential car buyers to dealers. On the site, they created hundreds of information tags. These tags showed whether the customer was hovering or clicking on their links. All this data flowed into a Hadoop cluster. They gathered terabytes of data every week.
The company had historical data going back years. They spent large sums of money and even set up departments to focus on collecting and maintaining the data. Collecting the data was easy. The software they used was simple and easy to create. The hard part was figuring out what to do with the data. They were just starting to think about the types of questions they could ask. This seems like a common challenge for many organizations starting out in data science. These organizations see it mostly as an operational challenge. In the beginning, it's about collecting the data. They spend their effort gathering as much data as they can and then putting it into the Hadoop cluster. These projects inevitably focus on the technical side of data. They don't really think about data science as science. This makes sense because collecting data is relatively cheap and pretty straightforward to understand. Everyone can get behind the effort. They'll even create multiple clusters to pool their data from all over the organization. Many more organizations struggle with the science. They're not used to asking interesting questions. Imagine the questions this website could ask. What if they ran an experiment that changed the color of the cars? They could see if their customers were more likely to click on an image if it were red, blue, or yellow. What if the report showed that customers are two percent more likely to click on a car if it's red? Then they could share that with the car dealerships and generate new revenue. What if they could run an experiment to see if the site is putting too many cars on the page? They could try showing fewer cars, then run a report to see if it increased the likelihood that a customer would click on the link. This type of empirical research is what a data scientist should be thinking about all the time. They should twist the dials of the data, then ask interesting questions, run experiments, and produce well-designed reports.
Data science can start with a question, but then you should run experiments. Once you've run your experiments, you can use spreadsheets and software to create reports. You can then look at the report and see if you're any closer to gaining real insights.
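As a rough illustration of the car color experiment, here is a minimal Python sketch of how such a report might compare click rates. It is not the website's actual method. The view and click counts are invented for the example, the function name is hypothetical, and the two-proportion z-test is just one common way to check whether a difference in click rates is larger than chance would explain.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Compare two click rates; return (z statistic, two-sided p-value)."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled click rate under the assumption that color makes no difference.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, made up for illustration only.
red_views, red_clicks = 10_000, 1_200
blue_views, blue_clicks = 10_000, 1_000

z, p_value = two_proportion_z_test(red_clicks, red_views, blue_clicks, blue_views)
print(f"red click rate:  {red_clicks / red_views:.1%}")
print(f"blue click rate: {blue_clicks / blue_views:.1%}")
print(f"z = {z:.2f}, p-value = {p_value:.2g}")
```

A small p-value would suggest the difference between red and blue is unlikely to be noise, which is the kind of evidence the team would want before sharing the result with dealerships.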
