Join Alan Simon for an in-depth discussion in this video Understanding architectural alternatives, part of Transitioning from Data Warehousing to Big Data.
- As you start to determine your organization's future state of analytics and data management, one overarching question needs to be answered. Should big data and Hadoop have a role in your organization's future state architecture? Despite everything we've discussed about Hadoop as a game changer for analytics and data management, there actually are some valid reasons not to invest in Hadoop, at least now. If your analytical needs are overwhelmingly descriptive, in other words, traditional Business Intelligence, and there's little need at present for predictive and discovery analytics, then the power of big data and Hadoop may not be of much value for now.
If your data usage is highly compartmentalized and fragmented, with reports done along departmental lines rather than cross-functionally, the power of bringing data together into a big data environment may not necessarily be of interest or of value. If within your organization reports are very seldom used, especially by executives, and it turns out that decisions are often made more by force of personality than any other factor, then the investments in Hadoop or any other data integration technologies will probably not be of much interest.
And then, finally, some organizations have a culture that's apathetic or even adversarial to doing what the data says. And again, there's not much reason at this point to invest in big data or Hadoop. You might be asking yourself, could big data and analytics change that? If people see how it works, might that change their mind? The answer is possibly, but the best way to do that is to start with analytics, not necessarily big data. For example, if you have an Enterprise Data Warehouse, but you're mostly doing traditional Business Intelligence right now, pull data into a statistical and analytical data mart, built on SAS or SPSS or another package, and start experimenting with predictive and discovery analytics.
Even if you wind up pulling data directly into that data mart and bypassing the Enterprise Data Warehouse, you might have to go back and change that at some point, but at least for now you can start to build some enthusiasm for what analytics can do. In fact, you don't even need an Enterprise Data Warehouse. You can implement a statistical and analytical data mart very quickly and bring the data in as it's needed. Eventually you'll have to rework this architecture, but along the way you can start to build some enthusiasm and interest in advanced analytics.
So big data, for all of its power, may not necessarily be the thing that your organization needs right now. Suppose though, that Hadoop does make sense for you. What do you do then? Remember that we have two primary architectural alternatives for bringing Hadoop into the Enterprise Data Warehouse space. We can use Hadoop as a super-sized data staging area in front of a relational Enterprise Data Warehouse, and that staging area also does double duty as an analytics sandbox where we can start to experiment with analytics.
Or we can take the plunge and build our next generation Data Warehouse on top of Hadoop, rather than relational technology. How do you decide which of these is better for your organization? What you want to do is take a number of factors into consideration, including the results of your assessment that we discussed earlier, any organizational initiatives and mandates related to data usage and analytics, and the skills that you have within your organization right now, as well as those you can obtain out in the marketplace.
Let's look at some examples of each of these and what those might mean to making your architectural decisions. If, upon reviewing the results of your current state assessment, you discover that your current business intelligence capabilities are actually rather mature and effective, but you don't do much today with predictive and discovery analytics, it could be that the best place to get started is to leave your Enterprise Data Warehouse in place, but start implementing Hadoop in front as a staging area and an analytics sandbox.
If, however, your analytics are immature and ineffective across the board, either of the alternatives would be a good one to at least get you moving in the right direction. It may be that currently, your analytics across the board aren't too bad, but they're highly fragmented, they're primarily departmental or functional, and not cross-organizational. In this case, building a new Hadoop-based Enterprise Data Warehouse may be the best approach for you to gain the synergies across your different families of analytics, and to start promoting cross-functional analytics.
You need to make sure that your technology architecture is aligned with all of the initiatives and mandates around your enterprise. If it's been decided that going forward, all decisions and actions will be driven by data, you don't necessarily have enough information at this point to choose an alternative, but you do know that one way or another, Hadoop and big data and analytics will be very important to you. If your organization has plenty of Business Intelligence for after-the-fact reporting, but it's now been mandated that each of those reports will be accompanied by recommended actions, then Hadoop as the new Enterprise Data Warehouse would probably be a good choice, because you can then align your descriptive analytics with your predictive and discovery analytics more closely.
Or there could be a mandate that every last little bit of value will be squeezed out of your current systems, even while new technology is being invested in. In this case, applying Hadoop as a new staging area in concert with your existing Enterprise Data Warehouse would make sense. Another factor would be the skills available to help build out your systems. If you already have a world-class BI and Data Warehousing team with deep expertise in your current platforms, it could be that you want to retain your existing Enterprise Data Warehouse architecture for a while, even while you're implementing Hadoop.
So that gives you some guidance in that direction. At the same time, if you have limited internal skills currently, but there is a commitment to train and hire as necessary, you might want to consider going down the path of building a brand new Enterprise Data Warehouse based on Hadoop. The important thing to remember though, is that even if Hadoop makes sense for your organization, there's no 100% definite answer as to exactly how you should do that. And you need to take a look at a number of different factors to help guide you in your architectural decisions.
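The factors discussed above can be summarized as a rough decision sketch. This is only an illustration, not a method from the course: the field names, the two gating questions, and the simple vote counting are all assumptions made for the example, and a real decision would weigh these factors with far more nuance. A tie deliberately returns "either alternative", reflecting the point that there is no 100% definite answer.

```python
from dataclasses import dataclass
from enum import Enum


class Architecture(Enum):
    NO_HADOOP_YET = "start with a statistical and analytical data mart, not Hadoop"
    STAGING_AND_SANDBOX = "Hadoop as staging area and sandbox in front of the existing EDW"
    HADOOP_EDW = "new Enterprise Data Warehouse built on Hadoop"
    EITHER = "either alternative is a reasonable start"


@dataclass
class Assessment:
    # All field names are hypothetical; they paraphrase factors from the discussion.
    needs_predictive_analytics: bool       # need beyond descriptive BI
    data_driven_culture: bool              # reports are used and the data is acted on
    bi_mature: bool                        # current BI is mature and effective
    analytics_fragmented: bool             # departmental rather than cross-functional
    reports_must_recommend_actions: bool   # mandate: reports come with recommended actions
    must_maximize_existing_systems: bool   # mandate: squeeze value out of current systems
    strong_bi_dw_team: bool                # deep expertise in current platforms
    commitment_to_train_and_hire: bool     # willingness to build new skills


def recommend(a: Assessment) -> Architecture:
    # Gate (a simplification): without a need for advanced analytics or a culture
    # that acts on data, investing in Hadoop now is hard to justify.
    if not (a.needs_predictive_analytics and a.data_driven_culture):
        return Architecture.NO_HADOOP_YET

    # Factors that favor keeping the relational EDW with Hadoop in front of it.
    staging_votes = sum([
        a.bi_mature,
        a.must_maximize_existing_systems,
        a.strong_bi_dw_team,
    ])
    # Factors that favor building the next EDW on Hadoop.
    hadoop_edw_votes = sum([
        a.analytics_fragmented,
        a.reports_must_recommend_actions,
        a.commitment_to_train_and_hire and not a.strong_bi_dw_team,
    ])

    if staging_votes > hadoop_edw_votes:
        return Architecture.STAGING_AND_SANDBOX
    if hadoop_edw_votes > staging_votes:
        return Architecture.HADOOP_EDW
    return Architecture.EITHER


# Example: mature BI, a strong existing team, and a mandate to maximize current systems.
example = Assessment(
    needs_predictive_analytics=True,
    data_driven_culture=True,
    bi_mature=True,
    analytics_fragmented=False,
    reports_must_recommend_actions=False,
    must_maximize_existing_systems=True,
    strong_bi_dw_team=True,
    commitment_to_train_and_hire=False,
)
print(recommend(example).value)
```

Equal-weight voting is the simplest possible choice here; in practice an organization might weight a firm mandate more heavily than a skills gap, or treat the output only as a starting point for discussion.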
- Exploring big data, Hadoop, and analytics
- Examining the shortcomings of traditional data warehousing
- Comparing big data architectures for next-generation data warehousing
- Understanding alternatives
- Building a roadmap
- Managing big data-driven projects
- Monitoring and measuring success