Did you develop Big Data in a silo?
It’s okay. You can be honest here. You’re among friends. In fact, it’s a safe bet you’re not alone, since experts were predicting this might happen back in 2012. All the signs suggested organizations were developing Big Data in a sandbox; by default that means Big Data often became yet another data silo.
So you’re in good company if you developed your Big Data analytics in a silo, apart from your regular systems.
It does raise the question, though: What now?
You’ve got some hard work ahead of you, contends Lockwood Lyon, a systems and database performance specialist. In a recent Database Journal article, he explains that integrating Big Data systems will require much more than simply scaling up hardware.
Big Data systems will be difficult to integrate with the current data warehousing architecture for a number of reasons, he explains, including:
- Data variations and new data types, such as large objects (LOBs) and multi-structured data (images, audio, etc.)
- Processing needs
- Analytical requirements
“The conclusion: Big Data today is not only a scale-up issue; it is also a re-architecture issue and a data integration issue,” Lyon writes. “Further, it often involves integration of dissimilar architectures. When we insist that we can deal with Big Data by simply scaling up to faster, special-purpose hardware, we are not only neglecting the real issues: we are leaving our current processes and data — the little data — behind.”
Lyon also walks through the integration process, from learning more about the source systems to storing data for analytics.
He recommends involving the database managers and others who understand these “little data” systems. For the most part, Lyon discusses the more tactical aspect of integrating Big Data analytics into existing systems, and that’s important.
Lyon’s approach comes from a traditional IT perspective. In other words, you’ve built this new system and now you need to integrate it with everything you did previously. But it does raise a major strategic question: Do you really need to treat Big Data as a traditional integration project, or would it be better to look ahead and think about how you might adapt your infrastructure around Big Data technology?
To use a Star Trek analogy (because why not?): Does Big Data join the federation of IT systems while more or less retaining its separate identity? Or is Big Data like the Borg, absorbing and transforming all it touches?
The Borg example may sound extreme, but that’s more or less the maturity model that Edd Dumbill, vice president of strategy at Silicon Valley Data Science, mapped out for Hadoop in a recent Forbes article. As I pointed out in a previous blog post, he’s not the only one to see Hadoop as a potential application development platform.
You might not be ready to embrace this vision of Big Data yet, but it’s worth keeping in mind as you move forward with a broader adoption of it.