Hewlett-Packard today moved to couple the HP Vertica columnar database more closely with Hadoop. With the launch of HP Vertica for SQL on Hadoop, organizations that use the HP columnar database can now store files using the Hadoop Distributed File System (HDFS).
Steve Sarsfield, product marketing manager for HP Vertica, says that now that SQL has reemerged within analytics as the dominant query language for both columnar databases and Hadoop, it only makes sense to tie the two platforms together more tightly at the programmatic level. To that end, HP has also delivered an SDK and exposed application programming interfaces (APIs) to make it easier to build analytics applications. These additions follow HP's earlier HAVEn offering, which combined Hadoop with HP Autonomy, HP Vertica, HP ArcSight and HP Operations Management software.
This latest effort to integrate Vertica with Hadoop is compatible with all distributions of Hadoop. Earlier this year, however, HP announced that it had made a $50 million equity investment in Hortonworks, which provides a widely used distribution of Hadoop. Hortonworks filed for an initial public offering last week. Originally spun out of Yahoo!, Hortonworks also has existing Hadoop alliances with Microsoft, Teradata and SAP.
Sarsfield notes that usage of HDFS is likely to increase substantially as organizations move to build “data lakes” based on Hadoop. Rather than storing data in multiple file systems and constantly transferring it between formats, it makes more sense to make all the data stored in HDFS directly available to other applications.
IT organizations for the most part are still working out what they actually intend to do with Big Data. But one thing is clear: to make it easier to build Big Data applications in the future, IT will need to centralize access to all that data. As such, it may not be long before HDFS becomes the dominant file system in the enterprise.