Free Paper Puts Hadoop in Enterprise IT Context

Loraine Lawson

Hadoop can do a lot of really cool things, but its use cases go beyond the big number crunching that tends to make the tech press headlines. It's also often talked about in isolation, which can make it hard to figure out how Hadoop would "fit in" with the rest of your data infrastructure.


If you need a bit of help understanding Hadoop for the regular organization, you might want to check out a recent Informatica Perspectives blog post and the white paper it references. The post is written by Brett Sheppard, executive director at Zettaforce, who recommends a five-step action plan for starting with Hadoop.


The advice is pretty down-to-earth stuff, with a few specifics. For instance, tip one, "Select the Right Projects for Hadoop Implementation," explains that Hadoop is often used for exploring "what if" situations with unstructured data, but may not be quite ready for real-time use yet. He also talks about how you'll need to adapt your existing architectures to use Hadoop, the challenges of finding people with the right skills, and why Hadoop isn't the place to tackle data quality and governance issues. Finally, he recommends adopting "lean and agile integration principles."


It's worth a quick skim, but to be honest, I think you have to read the paper he references, "Technical Guide: Unleashing the Power of Hadoop with Informatica," to really follow what he's saying, particularly if you have more questions than answers about Hadoop's use.


We all know by now that white papers are primarily a marketing tool these days, but I really liked this one because I felt it devoted more space to education and information than to selling Informatica's platform. For instance, it includes some wonderfully handy charts for explaining the function of the open source "companions" of Hadoop - and specifically how they fit in with data integration - as well as an explanation of Hadoop use cases and the typical customer profiles for these use cases. There's even a high-level reference architecture graph so you can see how Hadoop might fit in with your existing data architecture.


If you're one of the 54 percent that Ventana Research says either have deployed or are considering deploying Hadoop, you should definitely check out page 7, which lists the "Challenges with Hadoop." As you might expect, it's not all bubbles and rainbows. Among the issues:


  • Hadoop lacks metadata management and that creates problems with auditability and transparency.
  • Data quality and governance control are also concerns, since Hadoop doesn't have built-in support for either.
  • It turns out that the way most organizations handle data transformations with Hadoop can cause problems when you try to separate the data from its location.


It's a technical guide, so IT leaders should find a satisfying level of detail about how Hadoop will work with existing data architecture and processes, but there's still enough discussion of business value to interest executive business leaders. And did I mention it's free? You do, however, have to provide the usual user profile information to download it.
