Hadoop's Future: The Challenges of Adoption and Coming Consolidations

Loraine Lawson

It seems Forrester Research was being generous when it predicted we'd see buyouts in the Big Data space by the end of summer 2012.

 

Yesterday, Oracle announced it had acquired Endeca, a Big Data analytics firm. Wired noted the deal after mocking HP for its August acquisition of Autonomy, an infrastructure software company that handles unstructured data.

 

Forrester believes other data warehouse players, such as SAP, Microsoft and Teradata, will soon join in by acquiring Hadoop startups. It's worth noting that Forrester's prediction came just days ahead of news that Microsoft's SQL Server 2012 will include Big Data processing capabilities based on - what else? - Hadoop.

 


InformationWeek's Fritz Nelson writes that Microsoft will offer Hadoop on Azure and Windows Server, and there will be Hadoop connectors for SQL Server and SQL Server Parallel Data Warehouse. What's more, Microsoft plans to offer an ODBC driver for Hive, a batch-oriented system used for querying data stored in Hadoop, so the data can be connected to Microsoft's BI tools.

 

Big Data is a Big Deal for Big Vendors who see Big Money ahead. They want to nab control of this trend while it's still evolving. So, it makes sense that this space would be ripe for consolidation, particularly when you look at the immaturity of the tools used to manage Hadoop.

 

But despite the rush to embrace Hadoop and Big Data, Forrester says a number of factors will make Hadoop difficult for businesses to adopt:

  1. There is no single, integrated Hadoop software distribution.
  2. Hadoop's core specifications are still being developed by the Apache community, and so far they do not address metadata, high availability, federation or machine learning. As a result, you may end up buying open source software that relies on proprietary functions to fill these gaps.
  3. Many Hadoop projects require custom coding, and Forrester's James Kobielus warns there can be a steep learning curve for developers.
  4. There are no industry-consensus best practices for Hadoop.
  5. Many of those in enterprise data analytics and IT don't know how to use Hadoop and will struggle with adoption, a situation that's further complicated by a lack of certified training or services programs.

 

As big vendors move into Big Data, these challenges will become their problem as well. It'll be interesting to see how - and if - traditional enterprise vendors address these problems.

 

Hadoop may have challenges, but its distributed approach to data is the future of enterprise computing, predicts Michael Driscoll, CTO of the data analytics company Metamarkets.

 

Driscoll also raises some interesting points about what the rise of Hadoop and NoSQL means for Oracle. Basically, he questions whether Oracle can make the transition to a flexible, cloud-savvy company.

 

Oracle does offer a Hadoop connector and recently revealed Exalytics, an appliance for large-scale analytics. But Driscoll contends that Oracle is betting on the wrong side of this shift with its big-box, desktop-based approach:

Metal server boxes don't bend or expand; they are inelastic, both physically and economically. In contrast, the needs of businesses are highly elastic; as companies grow, they shouldn't have to unpack and install boxes to meet their compute needs, any more than they should install generators for more electricity. The ability to scale storage and compute capacity up or down, within minutes, is liberating for individuals and cost-effective for organizations, but it is impossible with a 'cloud in a box.' It is only enabled by a true cloud computing infrastructure, with virtualization and dynamic provisioning from a common pool of resources.

ComputerWeekly offers a nice round-up of these key issues, which are covered more thoroughly in Forrester's recent report "Enterprise Hadoop Best Practices: Concrete Guidelines From Early Adopters In Online Services." Unless you're a Forrester client, it'll cost $499.



Oct 21, 2011 2:40 AM Mark Mark  says:

At Cloudera, we understand the challenges enterprises face when implementing a big data solution.  This is why Cloudera University provides a robust training curriculum for Apache Hadoop and related technologies, as well as the preeminent certifications for Hadoop Developers and Administrators. 

For more information, check out www.cloudera.com/hadoop-training. 

In fact, Cloudera University just made an announcement about our training endeavors (with more news to come shortly):

www.cloudera.com/company/press-center/releases/cloudera-university-launched-amidst-unprecedented-growth-cloudera-training-certification-for-apache-hadoop.

Additionally, Cloudera's Support and Professional Services teams provide expert Hadoop solutions across numerous industries. The diversity and depth of experience these teams bring to customer engagements accelerate successful Hadoop deployments.

Please check out www.cloudera.com to learn more about how we make our customers successful. 
