
Cloudera Adds Columnar Storage Engine to Hadoop


Written By


Mike Vizard

Oct 1, 2015


At the Strata + Hadoop World 2015 conference this week, Cloudera announced that it has begun beta testing a columnar data store, dubbed Kudu, that will run natively on top of Hadoop. The company also opened a public beta of RecordService, a role-based access control mechanism for Hadoop.

Mike Olson, chief strategy officer and chairman of the board for Cloudera, says the columnar data store is meant to enable faster types of data analytics on top of Hadoop. Kudu is not so much a replacement for the Hadoop Distributed File System (HDFS) as it is another type of storage engine that a different class of applications can now natively invoke, says Olson.

Columnar data stores are typically used with analytics applications that are closely associated with data warehouses. By adding support for a columnar store on top of Hadoop itself, Cloudera is again signaling that Hadoop will over time usurp most of the databases currently used to support data warehouse applications.
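For readers unfamiliar with why a columnar layout suits analytics, the following is a minimal Python sketch, not Kudu's API or storage format. The sample records and the `to_columns` helper are hypothetical and exist only for illustration. It shows that an aggregate over one field needs only that field's column, whereas a row layout visits every full record.

```python
# Hypothetical toy data; this does not reflect Kudu's API or on-disk format.
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 75.5},
    {"order_id": 3, "region": "east", "amount": 42.0},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column name."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: each full record is visited to total a single field.
row_total = sum(record["amount"] for record in rows)

# Columnar layout: only the "amount" column is read.
column_total = sum(columns["amount"])
```

Both totals are the same; the difference is how much data has to be touched to compute them, which is what matters for the scan-heavy queries common in data warehouse workloads.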

Longer term, Olson says Cloudera envisions a world where real-time analytics created using Apache Spark as the programming environment for Hadoop will run in-memory alongside data being generated by transaction processing systems. The end result should be a way to attach analytics to transactions in real time without the level of complexity that processing transactions on relational databases involves today.

While Olson says the industry as a whole is a long way from making this possible, it is an avenue of research that Cloudera plans to keep investigating.

In the meantime, Cloudera remains committed to unifying as much enterprise analytics processing as possible around Apache Spark, Kudu, its Impala SQL engine and Hadoop.


Mike Vizard

Michael Vizard is a seasoned IT journalist, with nearly 30 years of experience writing and editing about enterprise IT issues. He is a contributor to publications including Programmableweb, IT Business Edge, CIOinsight and UBM Tech. He formerly was editorial director for Ziff-Davis Enterprise, where he launched the company’s custom content division, and has also served as editor in chief for CRN and InfoWorld. He also has held editorial positions at PC Week, Computerworld and Digital Review.
