One of the thornier questions facing enterprise executives in these days of broad infrastructural change is how to deal with Big Data.
On the surface, it may seem like a no-brainer: No matter how big the data load becomes, there are always cloud-based resources to deal with it. But that is only part of the equation, because the Big Data challenge is less one of capacity than of management and analysis.
The traditional approach, of course, is the data warehouse. Companies like Teradata continue to churn out new products and new technologies aimed not only at storing vast quantities of structured and unstructured data, but also at supporting sophisticated algorithms and other tools to help the enterprise make sense of it all. The new Active Enterprise Data Warehouse 6750 platform continues this tradition by upping both capacity and performance with Flash memory and advanced, real-time analytics capabilities. At the same time, it reduces both the energy and space requirements of earlier platforms, offering the enterprise the means to green up the warehousing infrastructure.
Teradata is also pursuing new virtual and software-defined architectures through products like QueryGrid, which offers inroads to competing database and warehousing platforms, namely Hadoop, without having to physically import relevant data. In addition, the EDW platform’s Database 15 module supports data-interchange formats like JSON (JavaScript Object Notation), which reduces the complexity involved in moving workloads into and out of EDW. And with support for Perl, Ruby and other scripting languages, enterprise and third-party developers now have the means to tap the system’s in-database analytics prowess.
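To see why a common interchange format eases the movement of data between systems, consider a minimal Python sketch. It does not use any Teradata interface; the record, its field names and the two helper functions are hypothetical, and the only library involved is Python's standard `json` module.

```python
import json

# Hypothetical customer record; the field names are illustrative only.
record = {
    "customer_id": 1001,
    "name": "Example Corp",
    "orders": [
        {"sku": "A-100", "quantity": 2},
        {"sku": "B-205", "quantity": 1},
    ],
}


def export_record(rec: dict) -> str:
    """Serialize a record to a JSON string for handoff to another system."""
    return json.dumps(rec, sort_keys=True)


def import_record(payload: str) -> dict:
    """Parse a JSON string back into a native Python dictionary."""
    return json.loads(payload)


# Round trip: the receiving side recovers the same nested structure
# without needing a custom parser or a shared schema file.
payload = export_record(record)
restored = import_record(payload)
```

Because the nested order list survives the round trip intact, any platform that reads and writes JSON can exchange such a record without a bespoke conversion step.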
But this still leaves many people scratching their heads: Why deploy an entire warehousing platform when you can get the same analytics capabilities at lower cost from a pure-software approach like Hadoop? The answer, as tech analyst Richard Winter explained to the New York Times recently, is that the two are not interchangeable. Hadoop may be fine for non-critical data like web browsing records and sensor feeds, but high-value data like customer records, product information and business transactions needs a higher level of care that only full warehousing can provide.
Indeed, even the top Hadoop developers are positioning their products as complements to full-scale warehousing, not replacements, says MongoDB’s Matt Asay. The old strategy of “embrace, extend and extinguish,” in which new products gradually weed out the old as technology and user requirements evolve, does not apply here – at least, not the “extinguish” part – because traditional warehousing will continue to provide unique and vital services to the enterprise data environment for some time to come.
So the good news for CIOs going forward is that they don’t have to make an either/or decision when it comes to traditional or software-based Big Data analytics. The bad news, of course, is that they will have to pursue both strategies if they hope to develop a thorough analytics capability that is tailored to the specific needs of various data sets.
Warehousing and analytics may be intertwined, but they are not one and the same.