
EMC Tightly Couples Hadoop to Massively Parallel SQL Database


Written By


Mike Vizard

Feb 25, 2013

Recognizing that Hadoop and SQL database technology need to be joined at the hip in the enterprise, EMC Greenplum today announced Pivotal HD, an implementation of the company’s massively parallel database that is now integrated with the Hadoop Distributed File System (HDFS).


According to Josh Klahr, vice president of product management for EMC Greenplum, the benefit of this approach is that it allows organizations with massive investments in SQL to start using low-cost Hadoop implementations as a data warehouse.

Klahr says Pivotal HD is designed to let IT organizations run high-performance Hadoop applications using SQL syntax that is already widely known throughout the enterprise, rather than requiring them to learn an arcane MapReduce interface or invest in a data scientist who is fluent in MapReduce.
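To make that contrast concrete, here is a minimal sketch of the same aggregation written two ways: as hand-coded map, shuffle and reduce steps, and as a single SQL statement. It is an illustration only, not Pivotal HD code. The sample data and function names are hypothetical, plain Python functions stand in for a MapReduce framework, and Python's built-in sqlite3 module stands in for a SQL engine.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (region, sale amount) records.
SALES = [("east", 100), ("west", 250), ("east", 50), ("west", 25), ("north", 75)]


def map_phase(records):
    """Emit one (key, value) pair per input record."""
    for region, amount in records:
        yield region, amount


def shuffle_phase(pairs):
    """Group values by key, as a framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in groups.items()}


def totals_mapreduce(records):
    """MapReduce style: the developer spells out each processing phase."""
    return reduce_phase(shuffle_phase(map_phase(records)))


def totals_sql(records):
    """SQL style: one declarative query; the engine plans the execution."""
    conn = sqlite3.connect(":memory:")  # stand-in engine, an assumption
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", records)
        rows = conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region"
        ).fetchall()
    finally:
        conn.close()
    return dict(rows)
```

Both functions return the same per-region totals. The difference is in who has to describe the processing steps: the programmer in the first case, the query engine in the second.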

Klahr says Pivotal HD delivers query response times that are 10 to 600 times faster than current SQL options for Hadoop.

EMC Greenplum is not the only company trying to tightly couple SQL to Hadoop these days. As this trend continues to evolve, it’s becoming clear that Hadoop will soon replace relational databases across a swath of data warehousing applications. What’s not clear is exactly what role SQL will play. Some argue that Hadoop, by its very nature, eliminates the need for ad-hoc SQL queries. Instead, the algorithms in the Hadoop application will discover patterns and anomalies. The schema will then be generated by Hadoop as part of the read operation, whereas traditional SQL data warehouse applications generate schemas as part of the write operation.
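The difference between applying a schema on write and applying it on read can be sketched in a few lines. This is a toy illustration of the general idea, not Hadoop or Greenplum code. The field names, sample records and helper functions are all hypothetical.

```python
import json

# Hypothetical schema: field name -> Python type used to convert the value.
SCHEMA = {"user": str, "amount": float}

# Hypothetical raw input, one JSON document per line.
RAW_EVENTS = [
    '{"user": "ann", "amount": "12.50", "coupon": "X1"}',
    '{"user": "bob", "amount": 7}',
    '{"user": "cy"}',
]


def write_with_schema(table, line, schema=SCHEMA):
    """Schema on write: validate and convert before storing; reject bad rows."""
    record = json.loads(line)
    row = {}
    for field, cast in schema.items():
        if field not in record:
            raise ValueError(f"missing field: {field}")
        row[field] = cast(record[field])
    table.append(row)  # fields outside the schema are dropped for good


def read_with_schema(raw_lines, schema):
    """Schema on read: raw lines stay untouched; a schema is applied per query."""
    for line in raw_lines:
        record = json.loads(line)
        yield {
            field: (cast(record[field]) if field in record else None)
            for field, cast in schema.items()
        }


def load_on_write(raw_lines):
    """Load every line through the write-time schema, counting rejections."""
    table, rejected = [], 0
    for line in raw_lines:
        try:
            write_with_schema(table, line)
        except ValueError:
            rejected += 1
    return table, rejected
```

With schema on write, the third record is rejected at load time and the `coupon` field is never stored. With schema on read, all three raw records remain available, and a later query can apply a different schema, such as `{"coupon": str}`, without reloading anything.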

Naturally, it will take some time for this transformation of data warehousing to play out, so SQL will remain relevant for some time to come. But one thing is certain: Hadoop in the enterprise will forever change the way IT organizations think about building data warehousing applications.


Michael Vizard is a seasoned IT journalist with nearly 30 years of experience writing about and editing coverage of enterprise IT issues. He is a contributor to publications including Programmableweb, IT Business Edge, CIOinsight and UBM Tech. He was formerly editorial director for Ziff-Davis Enterprise, where he launched the company’s custom content division, and has also served as editor in chief for CRN and InfoWorld. He has also held editorial positions at PC Week, Computerworld and Digital Review.
