While there is broad consensus that the open source Apache Hadoop framework for managing Big Data holds a lot of potential value, many IT organizations are still grappling with how to make Hadoop more accessible.
The primary mechanism for invoking Hadoop is MapReduce, a set of interfaces created specifically for manipulating large amounts of data across a cluster of systems. At the same time, a handful of companies have also made Hadoop accessible via more traditional SQL queries.
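To make the MapReduce model concrete, here is a minimal sketch of its two phases in plain Python. This is an illustration only, not the actual Hadoop API (which is Java-based): a map step emits intermediate key/value pairs, and a reduce step aggregates them by key, which is what lets Hadoop spread the work across a cluster.

```python
# Illustrative sketch of the MapReduce model in plain Python.
# Real Hadoop jobs implement Mapper and Reducer classes in Java;
# this only demonstrates the two logical phases on one machine.
from collections import defaultdict

def map_phase(documents):
    # Map: emit an intermediate (key, value) pair for every word seen.
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    # Shuffle/reduce: group intermediate pairs by key and sum the counts.
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)

docs = ["Hadoop scales out", "Hadoop runs MapReduce"]
word_counts = reduce_phase(map_phase(docs))
# word_counts is {'hadoop': 2, 'scales': 1, 'out': 1, 'runs': 1, 'mapreduce': 1}
```

In a real cluster, the map phase runs in parallel on the nodes holding the data, and the framework handles shuffling the intermediate pairs to the reducers, which is the complexity that application-level abstractions aim to hide.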
But Charles Zedlewski, vice president of products for Cloudera, a provider of data management tools built specifically for Hadoop, says mass adoption of Hadoop will come only once it is routinely embedded inside applications, which by definition provide a layer of abstraction that masks Hadoop's complexity from the average end user.
As part of this effort, Cloudera recently began shipping version 3.0 of its Hadoop distribution, which, among other things, includes support for the ODBC drivers that many enterprise applications already rely on to access SQL databases. The first wave of those applications is likely to take the form of advanced analytics tools that leverage Hadoop's inherent ability to scale simply by adding nodes.
As the hype and mystery surrounding Hadoop start to dissipate, the hard work of building Hadoop applications in the enterprise is only beginning. Version 3.0 of Cloudera's distribution goes a long way toward creating the kind of stable environment that enterprise IT organizations favor, so don't be surprised to see a raft of Hadoop applications come to market later this year, increasing the amount of data that can be cost-effectively analyzed a thousandfold.