In which Jill wonders yet again how much size really matters.
It's always interesting to hear somebody dismiss a trend.
"That's not new!" he might say as he strokes his beard, lights a pipe, and mixes himself a Manhattan. "I worked on that stuff my first job out of college, for cripes sake!" Then he flips the Peter, Paul and Mary record, remembering the good old days of bra burning and punch cards.
And so it is with the newest trend, Big Data. High-tech companies looking for more efficient ways to process and store their Web transactions are often credited with lighting the Big Data fire. Big Data represents the collision of the data warehouse, search, visualization and storage worlds, and it brands the conundrum we've been facing (and largely ignoring): Information is hitting companies at a faster rate than ever, and incumbent technology solutions are often too cumbersome or expensive to solve the problem.
When it comes to Big Data, most of my clients are still in research mode. As their advisor, I'm bound to ask them that trite-yet-requisite management consulting level-setting question: "What's the need, pain or problem you're trying to solve?"
Often clients explain that they need to treat transaction data differently from, say, customer master data. Fewer business rules, more history - that kind of thing. That's when we start the work of classifying different data domains according to varying business policies:
Figure 1: Establishing Data Categories
Data classifications can get quite detailed, and there can be many categories. But if you've designed your data governance program the right way, you should be able to apply guiding principles to each category.
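For readers who think better in code, here is one way such a classification might be written down. Treat it as a minimal sketch, not a prescription: the two category names come from the client example above, but the `CategoryPolicy` fields, the `POLICIES` table, the `policy_for` helper and the specific values are all hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CategoryPolicy:
    """Guiding principles for one data category (illustrative fields only)."""
    business_rules: str  # relative number of business rules applied
    history: str         # relative amount of history retained


# Hypothetical taxonomy. A real governance program would define its own
# categories and far richer policies than these two relative labels.
POLICIES = {
    "transaction": CategoryPolicy(business_rules="fewer", history="more"),
    "customer_master": CategoryPolicy(business_rules="more", history="less"),
}


def policy_for(category: str) -> CategoryPolicy:
    """Return the guiding principles for a category.

    Raises ValueError for data that has not been classified yet, so that
    unclassified data is surfaced rather than silently given a default.
    """
    try:
        return POLICIES[category]
    except KeyError:
        raise ValueError(f"unclassified data category: {category!r}") from None
```

Calling `policy_for("transaction")` returns the transaction policy, while an unknown category raises an error. Failing loudly is a design choice in this sketch: data nobody has classified is exactly the data nobody is governing.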
This strategy can then be used to gain consensus around optimal data management tactics, business rules, provisioning processes and, yes, technology for each category. Maybe that technology includes grid arrays or Hadoop. Maybe you'll realize you don't need new technology for a given category. Either way, you're circumscribing a taxonomy for your data. That's when the realization hits that the size of the data doesn't matter as much as how you use it.