Hadoop or Warehousing, or Both?

Arthur Cole

One of the thornier questions facing enterprise executives in these days of broad infrastructural change is how to deal with Big Data.

On the surface, it may seem like a no-brainer: No matter how big the data load becomes, there are always cloud-based resources to handle it. But that is only part of the equation, because the Big Data challenge is not so much one of capacity as one of management and analysis.

The traditional approach, of course, is the data warehouse. Companies like Teradata continue to churn out new products and technologies aimed not only at storing vast quantities of structured and unstructured data, but also at supporting the sophisticated algorithms and other tools that help the enterprise make sense of it all. The new Active Enterprise Data Warehouse 6750 platform continues this tradition by upping both capacity and performance with flash memory and advanced, real-time analytics capabilities. At the same time, it reduces the energy and space requirements of earlier platforms, giving the enterprise the means to green up its warehousing infrastructure.


Teradata is also pursuing new virtual and software-defined architectures through products like QueryGrid, which offers inroads to competing database and warehousing platforms such as Hadoop without having to physically import the relevant data. As well, the EDW platform’s Database 15 module supports interchange formats like JSON (JavaScript Object Notation), which reduces the complexity of moving workloads into and out of the EDW. And with support for Perl, Ruby and other scripting languages, enterprise and third-party developers now have the means to tap the system’s in-database analytics prowess.
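For readers who want a concrete sense of why a JSON interchange helps, the short Python sketch below is a generic illustration rather than Teradata's own API, which is not detailed here: a structured record is serialized once, and any JSON-aware consumer, whether a warehouse loader, a Hadoop job or a third-party script, can parse it without a per-system translation layer. The record fields and helper names are hypothetical.

    import json

    # A purely hypothetical customer record, as it might exist in an operational system.
    record = {
        "customer_id": 10482,
        "name": "Acme Industrial",
        "orders": [
            {"sku": "WX-200", "qty": 12},
            {"sku": "WX-310", "qty": 3},
        ],
    }

    def stage_for_load(rec: dict) -> str:
        """Serialize a record to JSON so any JSON-aware target (warehouse table,
        Hadoop job, analytics script) can consume it without custom translation."""
        return json.dumps(rec)

    def read_back(payload: str) -> dict:
        """The consuming side parses the same payload, regardless of platform."""
        return json.loads(payload)

    if __name__ == "__main__":
        payload = stage_for_load(record)
        print(payload)                      # the interchange artifact
        print(read_back(payload)["name"])   # round-trips cleanly

The point is simply that a single, well-understood format sits between systems, which is the same role JSON plays when workloads move between a warehouse and Hadoop.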

But this still leaves many people scratching their heads: Why deploy an entire warehousing platform when you can get the same analytics capabilities at less cost from a pure-software approach like Hadoop? The answer, as tech analyst Richard Winter explained to the New York Times recently, is that the two are not interchangeable. Hadoop may be fine for non-critical data like web browsing records and sensor feeds, but high-value data like customer records, product information and business transactions needs a higher level of care that only full warehousing can provide.

Indeed, even the top Hadoop developers are positioning their products as complements to full-scale warehousing, not replacements, says MongoDB’s Matt Asay. The old strategy of “embrace, extend and extinguish,” in which new products gradually weed out the old as technology and user requirements evolve, does not apply here – at least, not the “extinguish” part – because traditional warehousing will continue to provide unique and vital services to the enterprise data environment for some time to come.

So the good news for CIOs going forward is that they don’t have to make an either/or decision when it comes to traditional or software-based Big Data analytics. The bad news, of course, is that they will have to pursue both strategies if they hope to develop a thorough analytics capability that is tailored to the specific needs of various data sets.

Warehousing and analytics may be intertwined, but they are not one and the same.


