Steve Ballmer is taunting IBM and Oracle with challenges of "You don't know Big Data." Yahoo this week announced it's spinning off its own Hadoop-focused company, cleverly named Hortonworks, after - I assume - another famous pachyderm. LexisNexis is touting its Hadoop alternative, HPCC Systems, which it plans to release as open source after spending 10 years developing it.
And at least once a week, there's yet another announcement about a vendor offering Hadoop plug-ins or Hadoop connectors or Hadoop "what'cha'ma'bob" for their business intelligence tool, data integration platform or what-have-you.
Meanwhile, there will be roughly 1.8 trillion gigabytes pocketed in 500 quadrillion files by the end of the year, says IDC - all of it waiting to be picked, processed and integrated for the analyzing.
Still, as promising as Hadoop is, Kobielus writes in a recent blog post, it's time to add some perspective before we O.D. on Big Data and Hadoop hype:
At times, it almost feels like people discuss Big Data with the assumption that bigger is necessarily better and that throwing more data at your problems will automatically produce insights. I hope business and IT professionals heed my advice about searching for those special problems, often of a scientific nature, that can be solved best through petabyte-scale analytics. You don't need a data center full of maxed-out storage arrays to derive powerful insights. Gut feel is free, and it often thrives on the scantiest information.
Throughout June, Kobielus wrote a series of blog posts asking such pause-worthy questions as "What Are These Big Bad Insights That Need All This Nouveau Stuff?" and "Hadoop: What Is It Good for?"
The result is a reality check on Hadoop and Big Data in general. What I like about Kobielus's post is that he assesses Hadoop honestly without tearing it down or diminishing its contribution. He truly is just putting it into perspective.
Here are some of the questions he asks:
All good questions, to which I would add one more: Do you have a Hadoop expert hidden away somewhere? IT Business Edge's Susan Hall wrote about the shortage of IT workers with this skill. While IBM, Informatica, SnapLogic and others are offering tools to help you access and process Hadoop-stored data, it's still something you'll want to investigate.
This is not to say that Hadoop is overhyped and not worth your time. It's just that there are issues you need to consider. It's not quite enterprise-ready, or, as Kobielus puts it to his clients, "... yes, Hadoop is real, but ... it's still quite immature."