Hadoop is shifting from a sandbox experiment to mainstream use in large enterprises. IT Business Edge’s Loraine Lawson asked Jack Norris, vice president of marketing at Hadoop distribution vendor MapR Technologies, how enterprises are really putting this Big Data technology to use.
Lawson: Analysts have said organizations are still kind of playing in the sandbox with Hadoop and many have not integrated it with existing systems. Are you finding that to be true with MapR’s clients or are they starting to integrate it more and think more broadly?
Norris: What we’ve seen is people are taking Hadoop and looking at new data sources and new applications for those data sources, looking at some existing applications and getting more granular with those. Those tend to be revenue-related, supply chain or risk-oriented applications. In the existing systems that are deployed, they’re doing some things that have really fast, immediate payback, like doing an ETL offload into a large data warehouse where they can do the transformations on Hadoop and then load directly from there.
Lawson: Is it a fairly simple process to pull the data and get it into Hadoop?
Norris: One of the issues is that getting data into Hadoop requires using the Hadoop distributed file system API. There are specially designed connectors and load processes to get data into Hadoop. Now, one of the innovations out there that MapR has done is to change that underlying data layer so it supports not only the Hadoop API but also industry-standard protocols like NFS, meaning you can use any existing process and tool and write directly to the cluster.
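To make the NFS point concrete, here is a minimal sketch of what "write directly to the cluster" could look like from an ordinary program. It assumes the cluster is exposed as an NFS mount; the mount point, the file path and the `write_records` helper are hypothetical, and a temporary directory stands in for the mount so the code runs anywhere. Nothing in it calls a Hadoop-specific API.

```python
import csv
import tempfile
from pathlib import Path


def write_records(mount_point, relative_path, rows):
    """Write rows as CSV using ordinary file I/O.

    mount_point is assumed to be a directory where the cluster is
    NFS-mounted (hypothetical); no Hadoop client library is involved.
    """
    target = Path(mount_point) / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerows(rows)
    return target


# A temporary directory stands in for the hypothetical NFS mount point.
demo_mount = tempfile.mkdtemp()
written = write_records(
    demo_mount,
    "ingest/trades.csv",
    [("symbol", "price"), ("ABC", "10.5"), ("XYZ", "7.25")],
)
written_text = written.read_text()
```

The same function would work unchanged against a local disk, which is the practical appeal of a standard file protocol: existing tools need no special connector.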
Lawson: What about business intelligence tools? I know that I am always getting information about analytics tools that are adding connectors to Hadoop. What are you seeing in terms of the challenges there or what people are using it for? How is it changing BI?
Norris: There are some similar trends going on … if you have a BI process, then what Hadoop can do is really increase the data volume, or it can be used to pull and compare a variety of data sources. So I’m going to pull social media and combine it with financial trade information and get a better idea of sentiment analysis.
Lawson: Can you explain?
Norris: One market segment where Hadoop is being used is the digital advertising media market, where you’ve got ad platforms that are using Hadoop as part of the advertising matching process. Hadoop is the engine that’s driving the revenue creation for those businesses.
There’s some analysis being done, but it’s also this kind of operational intelligence, if you will, that’s being applied. You can see that on fraud detection as credit card companies are reacting in near real-time and discovering patterns that lead to fraud and then attacking those patterns in an earlier phase and catching fraud before it’s committed.
You can see it in some Web 2.0 companies that are building the business around Hadoop. Hadoop is the engine underneath things like Ancestry.com, which does the whole family relationship matching and even has a DNA matching service now. And you know, Hadoop is kind of core to that processing. So it’s not just taking existing business intelligence or existing data warehouse processing and doing it faster or more cost effectively; it’s really opening up some of these really new possibilities that have, you know, dramatic impact on revenues.
Lawson: It’s my opinion that it’ll be 10 years before people explore what they can really do with it. What do you think?
Norris: I’ve been surprised at the speed of adoption. One of the hidden aspects of Hadoop and Big Data is that you don’t have to understand how you’re going to use it before you deploy. That’s very different from any of the traditional data warehouse and analytic platforms that are out there today, because all of those are dependent on you understanding the questions you’ll ask and then how you’ll organize the data so that you can ask those questions.
With Hadoop, it’s so flexible that you can put data into Hadoop and you’re not locked into some SQL construct. You’re not locked into a certain data schema. So you can very rapidly change the granularity you work at, the dimensions you use and the types of analytics you run. So it’s very, very fast to deploy and actually get value.
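The idea of not being locked into a schema can be sketched in a few lines. This is an illustration in plain Python rather than Hadoop code, and the records, field names and `aggregate` helper are made up for the example: the raw data is stored as-is, and the granularity and dimensions are chosen only when a question is asked.

```python
import json
from collections import defaultdict

# Raw records stored as-is; the third one carries an extra field,
# which is fine because no schema was fixed up front.
RAW_EVENTS = [
    '{"ts": "2013-01-01T10:15:00", "region": "east", "amount": 20.0}',
    '{"ts": "2013-01-01T11:40:00", "region": "west", "amount": 5.0}',
    '{"ts": "2013-01-02T09:05:00", "region": "east", "amount": 12.5, "channel": "web"}',
]


def aggregate(raw_lines, key_fn):
    """Sum 'amount' per key, where the key is chosen at read time."""
    totals = defaultdict(float)
    for line in raw_lines:
        record = json.loads(line)
        totals[key_fn(record)] += record["amount"]
    return dict(totals)


# Same raw data, two different granularities and dimensions.
by_day = aggregate(RAW_EVENTS, lambda r: r["ts"][:10])
by_region_hour = aggregate(RAW_EVENTS, lambda r: (r["region"], r["ts"][:13]))
```

Switching from daily totals to per-region hourly totals only changes the key function; the stored data does not have to be reorganized.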
We had a customer who rolled out a major new financial service in the quarter after they adopted Hadoop. That’s unbelievably fast compared to the time horizon of traditional BI and data warehousing.
So it’s surprising how fast Hadoop impacts the business and then how additional applications can be rolled out on that same platform.