One of the challenges that technologists are still working out is the easiest way to put unstructured data to work for most end users.
It’s particularly a head-scratcher for business intelligence analysts, who want to combine the unstructured data stored in Hadoop with the more structured data we all know and love.
John O’Brien recently took an in-depth look at some possible solutions in an excellent Inside Analysis piece.
It seems that BI analysts are still working out the details when it comes to combining Hadoop’s unstructured data with the structured data we find in traditional enterprise BI tools and data warehouses. The big issue: What’s the best BI integration architecture to use with Hadoop?
As O’Brien explains, BI experts are focusing on three specific architectures.
While this piece is not overly technical (I could definitely follow it), it’s not a lightweight read either: it offers specific solutions and suggestions for how you can actually make Hadoop work with your BI tools and enterprise applications. This makes it an excellent resource for BI analysts or any IT manager who wants to move beyond the “talking about Hadoop” stage.
One solution he devotes a good amount of space to is HCatalog, which really sounds like a game-changer for BI professionals and Hadoop.
Yahoo developers created HCatalog to add two important capabilities to Hadoop data stores:
1. A table abstraction of Hadoop stores, making them a bit more familiar and easier to use
2. A metadata service for Hadoop
“Data stored in Hadoop is no longer confined to only those few who possess MapReduce skills. A larger portion of the BI community can explore data and allow their definitions to be leveraged in HCatalog for many more users of traditional tools,” O’Brien writes. “HCatalog is a key milestone in the maturing of Hadoop for the industry and helps bridge user access through the use of metadata.”
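To make those two capabilities a little more concrete, here is a toy sketch in Python of what a table abstraction plus a metadata service buys you. To be clear, this is not HCatalog's actual API: the `CATALOG` and `RAW_FILES` dictionaries, the `page_views` table and the `describe` and `read_table` helpers are all hypothetical names I made up for illustration, and the "Hadoop store" is just an in-memory string. The point is only that once a shared catalog records a table's columns and location, a user can ask for rows by table name instead of parsing raw files by hand.

```python
import csv
import io

# Hypothetical stand-in for a metadata service: table name -> schema and location.
CATALOG = {
    "page_views": {
        "columns": [("user_id", int), ("url", str), ("seconds", float)],
        "location": "page_views.tsv",
        "delimiter": "\t",
    }
}

# Hypothetical stand-in for raw, untyped files sitting in a data store.
RAW_FILES = {
    "page_views.tsv": "1\t/home\t12.5\n2\t/pricing\t3.0\n1\t/docs\t40.25\n",
}


def describe(table):
    """Return the column names recorded in the catalog for a table."""
    return [name for name, _ in CATALOG[table]["columns"]]


def read_table(table):
    """Yield each raw record as a dict, named and typed per the catalog."""
    meta = CATALOG[table]
    raw = io.StringIO(RAW_FILES[meta["location"]])
    for fields in csv.reader(raw, delimiter=meta["delimiter"]):
        yield {
            name: cast(value)
            for (name, cast), value in zip(meta["columns"], fields)
        }


# The caller works with a table name and column names, not file layouts.
rows = list(read_table("page_views"))
total_seconds = sum(r["seconds"] for r in rows if r["user_id"] == 1)
```

In this sketch, anyone who knows the table name can call `describe("page_views")` to see its columns and then filter or aggregate rows, which is roughly the kind of access O’Brien describes opening up to people without MapReduce skills.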
If you’d like more ideas on how you can get started with Hadoop, I explored some easy options in “7 Enterprise-friendly Ways of Dealing with Big Data,” which appeared on our sister site, and more recently here on IT Business Edge in “Eight Ways Any IT Division Can Use Hadoop.”