Big Data Analytics
The first steps toward achieving a lasting competitive edge with Big Data analytics.
"Big data is a big deal," analysts from McKinsey & Company wrote in a recent Financial Times article that draws its facts from a 156-page McKinsey report studying Big Data's potential in five sectors.
Yesterday, I posted a brief overview of the free report and some of its key conclusions. Today, I'd like to point out a few of the integration issues you'll need to consider when it comes to Big Data.
First, it's important to note that when the McKinsey analysts and others talk about Big Data, they're really not just talking about massive amounts of data, as Forrester Principal Analyst Brian Hopkins pointed out to me in the comments of last week's post about Big Data and integration:
We are thinking about it in terms of not only big volume, but high velocity, variety and variability. Some of the most interesting uses of technologies such as Hadoop are coming from the velocity and variability characteristics of "data at an extreme scale" - which is perhaps a better thing to think of when you hear the words 'Big Data.' What we are seeing is that it's not about just handling large amounts of complex data - agree, we have been doing that for years, as you point out. It's more about handling it in ways that are faster, cheaper and more forward looking than our current technology allows.
The McKinsey analysts take essentially the same view, as they explain in the FT article (free registration may be required for access). The article states:
In addition to the sheer scale of big data, the real-time and high frequency nature of the data is also key. For example, 'nowcasting' is used extensively and adds considerable power to prediction. Similarly the high frequency of data allows users to test theories in near real-time and to a level never before possible.
Obviously, this focus on real time and high frequency requires a more robust integration plan than your run-of-the-mill ETL project.
Again, many of the integration technology issues associated with Big Data are not unique - they just have to be handled on a uniquely large, fast and frequent scale. Legacy systems and incompatible standards and formats are all among the data integration challenges you'll encounter with Big Data. That's nothing new.
But Big Data will require integrating a broader range of data, according to the FT article. This is not your typical B2B fare:
Above all, access to data needs to broaden. Increasingly companies will need to access data from third parties and integrate them with their own but today there are few areas where there are efficient markets for the trading or sharing of data - for example, there is no market for the sharing of the aggregate movement patterns derived from mobile phones that retailers want to mine as they try to understand the behavior of their customers.
In many cases, we're talking about large data sets owned by governments, some of which aren't even available yet.
Obviously, Big Data will be a big deal, creating not just new business and cost-saving opportunities, but potentially redefining democracy and open government. But all of that potential hinges on our ability to integrate, mine and use the data. McKinsey's report offers a good starting point for turning Big Data's potential into a reality.