Join Alan Simon for an in-depth discussion in this video, Looking at traditional data warehousing, part of Foundations of Business Analytics: Prescriptive Analytics.
- Over the years, our applications have moved from centralized mainframes to distributed and decentralized computing. We've introduced minicomputers, personal computers, and mobile devices into the mix for our enterprise applications, and as a result, we've wound up with an "islands of data" problem. What that means is that the critical data we need for reports and analysis, instead of being consolidated, is scattered among numerous applications and systems that reside on many different platforms, in many different formats, and in many different organizations and geographies.
The consequence of our islands of data problem is that when we need to build reports that require data from more than one source, it has typically been a very difficult and time-consuming effort. The early 1990s introduced us to the concept of data warehousing, in which we attempt to break through these islands of data and bring selected data from each of our different applications together into a single store of information, and that's where we run our reports and do our analysis.
This picture shows a very simple data warehousing architecture where we have three different applications: one for taking orders over the telephone, another for taking orders online, and a master list of all of our customers. Selected data from each of those systems is copied into the data warehouse. When building a data warehouse, organizations typically draw on their internal applications: customer relationship management, supply chain management, ERP (enterprise resource planning), or whatever their applications happen to be.
And increasingly over the years, they've also brought in external data sources to help create these data warehouses and make them very powerful for reporting and analysis. As data is brought into the data warehouse, one of the things that typically has to happen is the transformation of that data, or in other words, the unification of that data. Different applications may use different encodings for the same concepts, and therefore, to bring the data into a data warehouse, we need to come up with some sort of a unified coding.
So for example, if one application uses m and f for male and female, another spells out male and female, and a third uses 1 for male and 2 for female, one of those encodings needs to be selected for how all the data in the data warehouse will be stored. And as data from applications that use a different scheme is brought in, a transformation converts it so that it all looks the same.
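The gender example above can be sketched in a few lines of Python. This is only an illustration: the choice of "M" and "F" as the warehouse standard, the source names, and the function name `unify_gender` are assumptions, not something the video specifies.

```python
# Hypothetical lookup from every source encoding to one warehouse code.
# "M"/"F" is an assumed standard; any one of the encodings could be chosen.
CANONICAL_GENDER = {
    "m": "M", "f": "F",            # application 1: single letters
    "male": "M", "female": "F",    # application 2: spelled out
    "1": "M", "2": "F",            # application 3: numeric codes
}

def unify_gender(raw):
    """Return the warehouse code for a source value, or None if unrecognized."""
    key = str(raw).strip().lower()
    return CANONICAL_GENDER.get(key)

# Illustrative rows as they might arrive from three different sources.
source_rows = [
    {"source": "phone_orders", "customer": "A", "gender": "m"},
    {"source": "online_orders", "customer": "B", "gender": "Female"},
    {"source": "customer_master", "customer": "C", "gender": 1},
]

# Transform on the way in, so every stored row uses the same encoding.
unified_rows = [{**row, "gender": unify_gender(row["gender"])} for row in source_rows]
```

Returning None for an unrecognized value, rather than guessing, leaves the decision about bad source data to whoever reviews the load.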
One of the key things about a data warehouse, and this will be very important as we look at analytics, is that not all data is brought in from all the different applications into the data warehouse. Typically, a very rigorous requirements analysis process has to occur at the beginning, in which we determine the types of reports and the types of analytics we're going to do out of that data warehouse, and then we figure out, based on those specific requirements, which data we will go after from our various sources.
We also load the data in multiple phases. At the very beginning, when a data warehouse is first being created, an initial set of data will be imported into the data warehouse. But as we go on and use the data warehouse on a regular basis, we continually update the data warehouse to keep the data relevant. We'll bring in new customers, the latest orders, the latest returns. Whatever the data happens to be, it continually is updated and refreshed in the data warehouse so our reports and analytics contain the most current information.
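The two loading phases described above, an initial load followed by regular refreshes, might look something like this in Python. It is a minimal sketch that assumes each record carries a unique `customer_id`; the in-memory dictionary standing in for the warehouse table and the function name `load_batch` are hypothetical.

```python
def load_batch(warehouse, batch):
    """Insert new rows and overwrite changed ones, keyed by customer_id."""
    for row in batch:
        warehouse[row["customer_id"]] = row
    return warehouse

# A plain dict stands in for the warehouse's customer table (an assumption).
warehouse = {}

# Phase 1: the initial load when the warehouse is first created.
initial_batch = [
    {"customer_id": 1, "name": "Ada", "city": "Boston"},
    {"customer_id": 2, "name": "Ben", "city": "Denver"},
]
load_batch(warehouse, initial_batch)

# Phase 2: a later refresh brings in a new customer and an updated one.
refresh_batch = [
    {"customer_id": 2, "name": "Ben", "city": "Austin"},
    {"customer_id": 3, "name": "Cy", "city": "Miami"},
]
load_batch(warehouse, refresh_batch)
```

After the refresh, the table holds three customers, and customer 2 reflects the most recent city.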
When we load data into the data warehouse, we organize it by subjects. So we'll keep all of our products together, all of our customers together, keep our orders together, our returns, even if we have, for example, our orders and our returns scattered among different applications that all happen to be feeding their data into the data warehouse. We structure the data in such a way that we can slice and dice our reports. We can look at orders by customer, by month.
We can look at returns by geography, by month. All kinds of different dimensional analysis will be able to take place out of the data warehouse. And once the data is brought into the data warehouse and it's organized and structured the way we want it, it's used to produce all kinds of different reports and dashboards, which are maintained and updated on a regular basis. The reason we've been building data warehouses since the early 1990s is to provide one-stop shopping for our data within the enterprise, regardless of where that data happens to originate and no matter how many different applications the data happens to reside in.
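The slice-and-dice reporting described above, such as orders by customer and month, comes down to totaling a measure over a chosen set of dimensions. Here is a small Python sketch of that idea; the sample rows, the field names, and the helper `total_by` are all made up for illustration.

```python
from collections import defaultdict

# Illustrative order rows, already organized by subject in the warehouse.
orders = [
    {"customer": "Acme", "region": "East", "month": "2024-01", "amount": 100.0},
    {"customer": "Acme", "region": "East", "month": "2024-02", "amount": 50.0},
    {"customer": "Birch", "region": "West", "month": "2024-01", "amount": 75.0},
    {"customer": "Acme", "region": "East", "month": "2024-01", "amount": 25.0},
]

def total_by(rows, *dimensions):
    """Sum the amount field for each combination of the given dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row["amount"]
    return dict(totals)

# The same data, sliced two different ways.
orders_by_customer_month = total_by(orders, "customer", "month")
orders_by_region = total_by(orders, "region")
```

The same helper would work for returns by geography and month, given return rows with the same kinds of fields.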
- Exploring the analytics taxonomy
- Understanding prescriptive analytics fundamentals and workflow
- Looking at data warehousing and business intelligence
- Exploring big data
- Collecting and processing data
- Exploring triggering events
- Formulating business hypotheses
- Refining and enriching business hypotheses
- Reaching definitive conclusions
- Putting the finishing touches on prescriptive analytics