Hadoop can do a lot of really cool things, but there are use cases beyond the big number crunching that tends to make the tech press headlines. It's also often talked about in isolation, which can make it hard to figure out how Hadoop would "fit in" with the rest of your data infrastructure.
If you need a bit of help understanding Hadoop for the regular organization, you might want to check out a recent Informatica Perspectives blog post and the white paper it references. The post is written by Brett Sheppard, executive director at Zettaforce, who recommends a five-step action plan for starting with Hadoop.
The advice is pretty down-to-earth stuff, with a few specifics. For instance, tip one is to "Select the Right Projects for Hadoop Implementation," and Sheppard explains that Hadoop is often used for exploring "what if" situations with unstructured data, but may not be quite ready for real-time use yet. He also talks about how you'll need to adapt your existing architectures to use Hadoop, the challenges of finding people with the right Hadoop skills, and why Hadoop isn't the place to tackle data quality and governance issues. Finally, he recommends that you adopt "lean and agile integration principles."
We all know that white papers are primarily a marketing tool these days, but I really liked this one because I felt it devoted more space to education and information than to selling Informatica's platform. For instance, it includes some wonderfully handy charts for explaining the function of the open source "companions" of Hadoop - and specifically how they fit in with data integration - as well as an explanation of Hadoop use cases and the typical customer profiles for these use cases. There's even a high-level reference architecture graph so you can see how Hadoop might fit in with your existing data architecture.
If you're one of the 54 percent that Ventana Research says either have deployed or are considering deploying Hadoop, you should definitely check out page 7, which lists the "Challenges with Hadoop." As you might expect, it's not all bubbles and rainbows. Among the issues:
It's a technical guide, so IT leaders should find a satisfying level of detail about how Hadoop will work with existing data architecture and processes, but there's still enough discussion of business value to interest executive business leaders. And did I mention it's free? You do have to provide the usual user profile information to download it, though.