My husband and I took dance lessons for a year once. We never mastered any particular style, but we did learn just enough to make us dangerous on the dance floor, no matter what the band plays.
That’s because, for any given dance, there are three or four basic steps that form the foundation. Learn a few additional turns, add a bit of flair, and you’re ready for whatever the wedding season may throw your way.
Likewise, if you’re going to start a Big Data project, there are a few foundational steps to success you should know. While there’s a lot of advice about starting or succeeding with Big Data, much of it is actually about data management in general.
That’s fine — you’ll need those skills, but since they apply to any data project, they can’t really be called the essential — or, if you prefer, the quintessential — steps specific to Big Data.
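That’s why I really like the recent McKinsey Quarterly article, “Big data: What’s your plan?” It does the best job I’ve seen of pinpointing the three essential steps to Big Data success: gathering and integrating the data, applying an advanced analytics model, and creating tools that make the data actionable.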
“Many companies fail to complete this step in their thinking and planning—only to find that managers and operational employees do not use the new models, whose effectiveness predictably falls,” the article notes.
InformationWeek recently published a piece on Intel’s use of Big Data, and it offers a great example of these three essential steps. Ron Kasabian, general manager of Big Data solutions for Intel’s data center group, said the company realized it had unleveraged enterprise data, including data from various tests run during the manufacturing process.
Intel brought all the historical test data into Hadoop (step one: gather and integrate the data) and analyzed it using predictive analytics (step two: apply an advanced analytics model). As a result, the company cut back on tests and saved $3 million in manufacturing costs in the first year. The change is expected to save another $30 million this year.
But beyond changing the manufacturing process itself, Intel used Hadoop to create a new security platform that draws on data from network intrusion devices. Hadoop processes the data, and Intel then extracts the relevant portion and loads it into a massively parallel processing database. Its security team uses that database to monitor the network for unusual behavior (step three: tools that make the data actionable).
What about the rest of those Big Data steps, like data quality, data governance and so on? Well, I’m not saying you can ignore any of that. But you can apply these disciplines to any data dance; they’re the turns and flair that you’ve already learned, the techniques that make the data better. Just apply them to the three essential Big Data steps, and you’re ready for the dance floor.