It’s not too hard to find use cases for Hadoop, but many seem to focus on how Internet companies use Big Data. That’s great if you’re LinkedIn, CBS Interactive or Foursquare, but what about the rest of us?
What do you do with Hadoop if you’re the CIO of a more traditional enterprise IT division?
Loraine Lawson has been reading a lot about this while researching “7 Enterprise-Friendly Ways of Dealing with Big Data,” from our sister site, Enterprise Apps Today.
In the process, she found two common Hadoop uses that any IT department in any industry could deploy. Since writing the article, she’s found more examples of how CIOs can use Hadoop with existing systems, for a total of eight ways any IT division can use Hadoop.
In this scenario, the data is processed and filtered in Hadoop clusters, then fed downstream to a traditional large data warehouse, OLAP cubes or an in-memory analytics platform.
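As a rough illustration of the filtering step, here is a minimal Python sketch of the kind of logic a Hadoop job might run on each record before handing results downstream. The three-column tab-separated layout, the `min_amount` threshold and the function names are all assumptions made for this example, and the code runs locally rather than on a cluster.

```python
import csv
import io


def filter_records(lines, min_amount=100.0):
    """Yield only well-formed rows at or above a hypothetical threshold.

    Each input line is assumed to be: customer<TAB>region<TAB>amount.
    """
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # drop malformed rows
        customer, region, amount = parts
        try:
            value = float(amount)
        except ValueError:
            continue  # drop rows whose amount is not numeric
        if value >= min_amount:
            yield customer, region, value


def to_csv(rows):
    """Serialize filtered rows as CSV text for a downstream warehouse load."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["customer", "region", "amount"])
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    raw = ["alice\teast\t250.0", "bad line", "bob\twest\t12.5"]
    print(to_csv(filter_records(raw)))
```

In a real deployment, similar per-record logic could be run as a mapper across the cluster, with the much smaller filtered output loaded into the warehouse, OLAP cube or in-memory platform.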
You can use Hadoop clusters as a staging layer for storage behind an EDW or data mart. In fact, Hive is an example of a data warehouse infrastructure that’s built on top of Hadoop clusters.
This is one way companies are using Hadoop to filter through social media “noise” to find the good stuff that’s worth knowing about. To do that, you’ll need to couple Hadoop with a sentiment engine.
Hadoop doesn’t just handle large amounts of data; it also processes them quickly by spreading the work across a cluster. Already, some companies generate so much data in a day that their ETL solutions need more than 24 hours to process it. By using Hadoop and MapReduce to perform the ETL process, they can significantly reduce that processing time.
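To make the MapReduce idea concrete, the following is a small, single-machine Python simulation of an ETL-style job. It is not Hadoop's API: the record layout (`date,customer,amount`) and the function names are assumptions for illustration. The point is the shape of the work. The map step parses each record independently, which is what lets a cluster split the input across many machines, and the reduce step aggregates per key.

```python
from collections import defaultdict


def map_phase(record):
    """Extract and transform: parse one raw line into (key, value) pairs."""
    date, _customer, amount = record.split(",")
    yield date, float(amount)


def shuffle(pairs):
    """Group values by key, as the framework would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Load-ready aggregate: total amount per day."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on one machine."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["day1,a,10.0", "day1,b,5.0", "day2,a,2.5"]
    print(run_job(sample))
```

Because no map call depends on any other, adding machines shortens the map step roughly in proportion, which is the property the ETL use case relies on.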
Loraine has been quoting an excellent post by Ravi Kalakota recently, and he names this as one of the three primary use cases for Hadoop. What’s cool about Hadoop in this situation is that you can add new data to existing data without having to reindex the entire cluster.
Sometimes, you want to archive data, but you also want to be able to access it without the hassle of sending for and uploading the archives. Hadoop allows you to store large amounts of historical data without the tapes, giving you access to that data at any time, Kalakota points out.
If you really want to search all your enterprise data, build an indexing infrastructure on top of Hadoop. It scales easily, so it will grow as your data grows. Plus, thanks to the distributed parallel architecture, it’ll be fast, according to Cloudera.
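The core data structure behind this kind of search is an inverted index, which maps each term to the documents that contain it. The sketch below builds one in plain Python using a map-style step per document. The document IDs, sample text and function names are hypothetical, and a production system would distribute the same steps across the cluster rather than run them in one process.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"[a-z0-9]+")


def map_document(doc_id, text):
    """Emit one (term, doc_id) pair per distinct term in a document."""
    for term in set(TOKEN.findall(text.lower())):
        yield term, doc_id


def build_index(documents):
    """Reduce step: collect the sorted list of documents for each term."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term, doc in map_document(doc_id, text):
            index[term].add(doc)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, query):
    """Return the documents that contain every term in the query."""
    terms = TOKEN.findall(query.lower())
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


if __name__ == "__main__":
    docs = {"memo1": "Quarterly sales report", "memo2": "Sales forecast"}
    print(search(build_index(docs), "sales"))
```

Each document is tokenized independently, so indexing new documents only adds entries for their terms, which is why this approach grows with the data.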
Data warehouses are big but unwieldy, which means that if you want to put something in them, you need a plan. Hadoop is much more flexible, so some companies are using it to create a data sandbox where users can play with the data. If they find something worthwhile, they can then add that query to the data warehouse. This use case should appeal to any company striving to be more “data-driven.”