Join Alan Simon for an in-depth discussion in this video, Reviewing BI/DW history, part of Transitioning from Data Warehousing to Big Data.
- Data warehousing is an incredibly popular and successful discipline that has been providing significant business value for more than 20 years, since the early 1990s. The primary purpose of data warehousing is to integrate data from many different applications and systems in a formal, well-architected manner, rather than pulling our data together haphazardly, as was the case before we started building data warehouses. Business intelligence is a companion discipline that we use to pull data from our data warehouses after it's been loaded.
Business intelligence came on the scene at almost exactly the same time as data warehousing, in the early 1990s. Once we build our data warehouses, pulling data from whichever source systems will be providing it, we make that data available for reports, dashboards, and other uses. That way we can understand, based on our data, what's going on in our companies and organizations, and then, ideally, make better decisions and act on that information. It's important to get an idea of where data warehousing and business intelligence came from.
That history matters because, as we transition into this entirely new era of big data and analytics, many of the lessons and historical aspects of BI and data warehousing remain very important as we build out our new generation of systems. If we go back to the 1970s and 1980s, we tried a number of different approaches to getting insights from our data, given the technology of the day. We had management information systems, decision support systems, and executive information systems.
All of these were different software packages that would draw data and help produce some level of reports and analytics, to the best extent possible. Statistical software packages also became popular, and we made extensive use of what were called extract files on a need-by-need basis. Here's what we mean by extract files. Suppose we have a relatively simple enterprise with eight different applications and four different families of reports that we need to produce: telephone orders, what's happening in our retail stores, our distribution centers, and what our sales vice president might need to see.
With extract files, we would pull data from the different systems, but we would wind up duplicating those efforts many times over. For example, our East Region Telephone Order System would feed data into multiple databases, and so would almost all the others. The end result was a hodgepodge of data feeds, most of which were not coordinated with one another, and we would usually wind up with non-standardized definitions and conflicting information across the different reports.
Another key problem with extract files was that we were copying data that already existed. As technology was really taking off in the 1980s and into the early 1990s, one of the things we tried to do was avoid copying data if we didn't have to. Most vendors began building what were called distributed database management systems, in which, rather than copying data to the point where reports and analytics would be produced, the software would reach out over computer networks into the source systems themselves and try to pull out the data at the point it was needed.
For a number of reasons, the technology of the day just wasn't mature enough, and distributed database management systems quickly fell out of favor. However, some aspects of distributed databases live on today in big data environments, and we'll look later at how those concepts are still used. What we did next was look at this idea called data warehousing, in which we tried to bring together ideas and concepts from both distributed database management systems and the earlier extract files.
Instead of reaching out into the systems themselves, we copied data as we did during the extract file era. But instead of replicating those data feeds all over the place, with conflicting information, we would try to build one, or maybe a small number, of data warehouses around our company. This way we would have standardized definitions and a smaller number of data feeds, which were easier to build and maintain. We built data warehouses for one very important reason: to provide business intelligence capabilities.
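As a rough illustration of why a smaller number of shared feeds is easier to build and maintain, the Python sketch below counts data feeds for the eight-application, four-report-family enterprise described earlier. The application names, the `extract_file_feeds` and `warehouse_feeds` helpers, and the worst-case assumption that every report family pulls its own extract from every application are all hypothetical, chosen only for illustration; a real environment would have its own mix of feeds.

```python
from itertools import product

# Hypothetical names: the narration only specifies eight applications
# and four families of reports.
SOURCES = [f"app_{i}" for i in range(1, 9)]
REPORT_FAMILIES = [
    "telephone_orders",
    "retail_stores",
    "distribution_centers",
    "sales_vp",
]


def extract_file_feeds(sources, report_families):
    """Worst case for extract files: each report family builds its own
    feed from each source application (an assumed upper bound)."""
    return list(product(sources, report_families))


def warehouse_feeds(sources, report_families, warehouse="warehouse"):
    """Single warehouse: each source loads the warehouse once, and each
    report family reads from the warehouse once."""
    loads = [(src, warehouse) for src in sources]
    reads = [(warehouse, family) for family in report_families]
    return loads + reads


if __name__ == "__main__":
    print("extract-file feeds:", len(extract_file_feeds(SOURCES, REPORT_FAMILIES)))
    print("warehouse feeds:   ", len(warehouse_feeds(SOURCES, REPORT_FAMILIES)))
```

Under these assumptions the extract-file approach needs up to 32 separate feeds, while a single warehouse needs 12 (eight loads plus four reads), and every report family reads the same standardized definitions.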
Business intelligence means the reports, the dashboards, the online analytical processing, the slicing and dicing of our data to look at it in a number of different ways, as well as visualization, for example, looking at data on top of a map rather than in a tabular report. Today, businesses are in the early stages of an entirely new era of data-driven insights, decisions, and actions, driven by big data and modern analytics. But the lessons of how business intelligence and data warehousing evolved are critically important to understanding how best to make use of these new technologies and approaches.
- Exploring big data, Hadoop, and analytics
- Examining the shortcomings of traditional data warehousing
- Comparing big data architectures for next-generation data warehousing
- Understanding alternatives
- Building a roadmap
- Managing big data-driven projects
- Monitoring and measuring success