Big Data's Integration Hurdles

Loraine Lawson

"Big data is a big deal," analysts from McKinsey & Company wrote in a recent Financial Times article that draws its facts from a 156-page McKinsey report studying Big Data's potential in five sectors.


Yesterday, I posted a brief overview of the free report and some of its key conclusions. Today, I'd like to point out a few of the integration issues you'll need to consider when it comes to Big Data.

First, it's important to note that when the McKinsey analysts and others talk about Big Data, they're really not just talking about massive amounts of data, as Forrester Principal Analyst Brian Hopkins pointed out to me in the comments of last week's post about Big Data and integration:

We are thinking about it in terms of not only big volume, but high velocity, variety and variability. Some of the most interesting uses of technologies such as Hadoop are coming from the velocity and variability characteristics of "data at an extreme scale" - which is perhaps a better thing to think of when you hear the words 'Big Data.' What we are seeing is that it's not about just handling large amounts of complex data - agree, we have been doing that for years, as you point out. It's more about handling it in ways that are faster, cheaper and more forward looking than our current technology allows.

The McKinsey analysts take essentially the same view, as they explain in the FT article (free registration may be required for access). The article states:

In addition to the sheer scale of big data, the real-time and high frequency nature of the data is also key. For example, 'nowcasting' is used extensively and adds considerable power to prediction. Similarly the high frequency of data allows users to test theories in near real-time and to a level never before possible.

Obviously, this focus on real time and high frequency requires a more robust integration plan than your run-of-the-mill ETL project.

Not that ETL is useless. Actually, it makes the list of "Big Data Technologies" included in the free McKinsey report, starting on page 31. In fact, you'll most likely recognize a number of the technology items listed, including metadata, mashups, data marts, data warehouses and cloud computing, as well as the names you typically associate with Big Data, such as Hadoop, MapReduce, R and Cassandra. Big Data brings together a veritable "what's what" of integration and data management tools.


Many of the integration technology issues associated with Big Data are not unique, either - they just arise on a uniquely large, fast and frequent scale. Legacy systems and incompatible standards and formats are among the data integration challenges you'll encounter with Big Data. That's nothing new.
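To make the "incompatible standards and formats" problem concrete, here is a minimal Python sketch of what reconciling two sources might look like. Everything in it is hypothetical: the field names, the sample records and the shared schema are invented for illustration and do not come from the McKinsey report or the FT article. It also assumes the legacy dates can be treated as UTC, which a real project would need to verify.

```python
import csv
import io
import json
from datetime import datetime, timezone

# Hypothetical legacy export: CSV, dollar amounts, US-style dates.
LEGACY_CSV = "cust_id,amt,txn_date\n17,19.99,05/17/2011\n"

# Hypothetical third-party feed: one JSON object per line, amounts in cents.
PARTNER_JSONL = '{"customerId": "17", "amountCents": 500, "ts": "2011-05-18T11:28:00Z"}\n'


def from_legacy(row):
    """Map one legacy CSV row onto the shared schema."""
    # Assumption: legacy dates carry no time zone, so we treat them as UTC.
    day = datetime.strptime(row["txn_date"], "%m/%d/%Y").replace(tzinfo=timezone.utc)
    return {
        "customer_id": int(row["cust_id"]),
        "amount_cents": round(float(row["amt"]) * 100),
        "timestamp": day.isoformat(),
    }


def from_partner(obj):
    """Map one third-party JSON record onto the shared schema."""
    moment = datetime.strptime(obj["ts"], "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return {
        "customer_id": int(obj["customerId"]),
        "amount_cents": int(obj["amountCents"]),
        "timestamp": moment.isoformat(),
    }


def integrate(legacy_csv, partner_jsonl):
    """Normalize both sources and return one list ordered by timestamp."""
    records = [from_legacy(r) for r in csv.DictReader(io.StringIO(legacy_csv))]
    records += [
        from_partner(json.loads(line))
        for line in partner_jsonl.splitlines()
        if line.strip()
    ]
    return sorted(records, key=lambda r: r["timestamp"])


if __name__ == "__main__":
    for record in integrate(LEGACY_CSV, PARTNER_JSONL):
        print(record)
```

The mapping functions are the easy part. The work that does not show up in a sketch like this is agreeing on the shared schema and finding out what each source's fields actually mean.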


But Big Data will require integrating a broader range of data, according to the FT article. This is not your typical B2B fare:

Above all, access to data needs to broaden. Increasingly companies will need to access data from third parties and integrate them with their own but today there are few areas where there are efficient markets for the trading or sharing of data - for example, there is no market for the sharing of the aggregate movement patterns derived from mobile phones that retailers want to mine as they try to understand the behavior of their customers.

In many cases, we're talking about large data sets owned by governments, some of which aren't even available yet.

Obviously, Big Data will be a big deal, creating not just new business and cost-saving opportunities, but potentially redefining democracy and open government. But all of that potential hinges on our ability to integrate, mine and use the data. McKinsey's report offers a good starting point for turning Big Data's potential into a reality.

May 17, 2011 9:09 AM Julianna DeLua says:

Without integration, Big Data is just Big. Fortunately, we are witnessing a sea change in the types of conversations that we are having with our customers, reflecting the types of dialogs that executives are having about data. Organizations are keen on turning Big Data into Big Opportunities. They are embracing these key trends - Hadoop, social media and volume - but are deeply focused on value. Socialization of data - the data sharing described above - is a key trend, not only within an organization but also across and beyond it. The ability to share data is helping to democratize organizations and make them more adaptable. Use of social data for all types of operations will be touched upon during our online event. We'd love to hear from you.

May 18, 2011 11:28 AM Loraine Lawson says:

The report is now available in e-reader format.

