Understand when to really consider migrating to a big data platform. It's the buzz about town right now, but big data comes with a lot more than you may need or want, so tread carefully and do some research before deciding to go this route.
- With big data, it's easy to get caught up in the hype. It's easy to read a bunch of blogs or articles about the amazing things that companies have done with these systems, but I believe this has led to a myth that big data is necessary for your business. Now, true big data, I would say, is a problem, and it's one that you'd be lucky to not have to worry about. You see, there are pros and cons here. When you're working with true big data, and by that I mean huge volumes of data, petabytes upon petabytes upon petabytes, a lot of the things that you're used to getting done in hours or minutes with your smaller data sets can now take days or even weeks, and that can create some really challenging problems for your architecture.
Let's take a look at the pros and cons, though. First, with these platforms we can handle real-time data, so we can process that data, analyze it, and even build machine learning models to respond to events as they occur. Think fraud protection for your credit card. This is a great pro. It's something that you previously couldn't do, but it's not something that everybody needs. Another pro of big data systems is that we can handle unstructured data, whereas most of the data you've probably been dealing with up to now is structured.
It lives in a database, and most traditional database systems have a defined structure. You can't just dump data in there like you can with a file system. With big data systems, you can. You define the structure after the fact, which is actually a great benefit because it allows you to capture all the data before you have to think about which data is particularly useful at that moment in time. I can tell you from experience that it is much better to have the data than to not have it. You don't want to be way down the road, trying to answer a question or figure out what happened, and realize that you chose not to capture that data because it didn't make sense at the time.
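To make "define the structure after the fact" a little more concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular big data platform does it: the events, the field names, and the `read_with_schema` helper are all made up for this example. The idea is that the raw records are kept exactly as they arrived, and each question picks out its own fields at read time.

```python
import json

# Hypothetical raw events, stored exactly as they arrived
# (one JSON document per line, no fixed set of columns).
RAW_EVENTS = """\
{"user": "ana", "action": "login", "ts": 1}
{"user": "ben", "action": "purchase", "amount": 42.5, "ts": 2}
{"user": "ana", "action": "purchase", "amount": 10.0, "coupon": "SPRING", "ts": 3}
"""


def read_with_schema(raw, fields):
    """Apply a structure at read time: keep only the requested fields,
    using None where an event never recorded that field."""
    rows = []
    for line in raw.splitlines():
        if not line.strip():
            continue  # skip blank lines
        event = json.loads(line)
        rows.append({name: event.get(name) for name in fields})
    return rows


# Today's question: who spent what?
purchases = read_with_schema(RAW_EVENTS, ["user", "amount"])

# A later question nobody planned for: which coupons were used?
coupon_rows = read_with_schema(RAW_EVENTS, ["coupon"])
coupons = [row["coupon"] for row in coupon_rows if row["coupon"] is not None]

print(purchases)
print(coupons)
```

Because the `coupon` field was captured even though nobody had a use for it at the time, the second question can still be answered later without changing how the data was stored.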
You're better off capturing all the data, and this is a great reason to use a big data system. Another pro is new product offerings. Because big data platforms give us new capabilities like the ones I just mentioned, you can often develop and build entirely new features for your product, or entirely new products, using the complexity, structure, and volume of your big data platform. But there are cons too. One of them is that there's a lot of setup involved, and there can be a lot of maintenance.
I would say you're going to have at least one to two people working full time just making sure your system continues to function, continues to give you the response times you need, handles the volume, and does everything that you expect it to be doing. Another con is that big data systems often leave you with an unfamiliar interface. If you're used to using SQL, you can set that up on your big data platform, but it might be a little different from what you're used to. It might be slower, and some of the functions may not work. Whatever analytics or BI tools you have may not connect, depending on which route you go, so that's something to consider.
Is this really going to be the Holy Grail of data platforms, or is it going to give us more trouble than it's worth? Now, one of the big benefits of big data is that you can store all the data. This leads to huge data volumes, again petabytes and petabytes of data, more data than you know what to do with. That data can be extremely hard to comb through at times. At some of the clients I've worked for in Silicon Valley, I can tell you that running a simple query to get a count of people in a specific subset of their user base can take hours, not because it's a complex query but because the volume is so large that it takes the system an extremely long time to process that one simple calculation. So when you think about big data, know that there are a lot of pros and a lot of cons, and you really need to weigh them before you decide to go down that path.
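A quick back-of-the-envelope calculation shows why even a simple count can take hours at this scale. The numbers below are assumptions picked for illustration (1 petabyte of data, 100 machines, each reading about 100 MB per second), not measurements from any real system, and the sketch assumes the query has to read every byte once with perfect parallelism.

```python
PETABYTE = 10**15  # bytes (decimal petabyte)
MEGABYTE = 10**6   # bytes


def full_scan_hours(data_bytes, nodes, bytes_per_sec_per_node):
    """Rough time to read every byte once, assuming the work is
    spread evenly across all nodes with no other overhead."""
    aggregate_bytes_per_sec = nodes * bytes_per_sec_per_node
    seconds = data_bytes / aggregate_bytes_per_sec
    return seconds / 3600


# Assumed cluster: 100 nodes, each scanning about 100 MB/s.
hours = full_scan_hours(
    data_bytes=1 * PETABYTE,
    nodes=100,
    bytes_per_sec_per_node=100 * MEGABYTE,
)

print(f"{hours:.1f} hours")  # prints "27.8 hours"
```

Under these assumptions the query itself is trivial, but simply getting through the data takes more than a day. Real systems can reduce this with techniques such as partitioning or indexing, but the basic point stands: the volume, not the complexity of the query, drives the time.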