One of the really challenging aspects of Big Data is that, often, you don’t know what you’re looking for until you’ve found it.
It’s pretty darned hard to identify a clean ROI for exploration, so you’re basically asking business executives to trust that the data will yield something valuable.
John Myers, research director for Enterprise Management Associates, calls this “discovery.” That’s actually a great name for it, because, just as when an attorney files for discovery, you’re going on a fishing expedition and you want as few limits as possible on your scope.
Or, as Myers puts it in this Information Management column, discovery is “literally the concept of attempting to answer the questions we don’t know that we don’t know without having to accurately describe a ‘lot’ or a ‘bunch.’”
The problem is that, until now, we’ve lived in a very SQL-based world, which basically means we’ve been fishing in a stocked pond. Now, there’s nothing wrong with stocking a pond: you know you’ll probably catch something, and you know more or less what it will be.
But here’s the catch with stocked ponds: If you want to fish for something more exotic and unusual, a stocked pond won’t cut it. You’re going to need an ocean or at least a really large lake, and that’s the world you’re fishing in when you talk about NoSQL.
“SQL platforms by their very nature have a ‘box’ around the information they contain,” Myers writes. “In discovery, you don’t want to hit walls; you want to explore to your heart’s content, or at least until there is no more data to search, explore and discover. With NoSQL platforms, the walls tend to disappear.”
This, he argues, is the real value of unstructured data, and he offers some advice for how to maneuver in the NoSQL world.
NoSQL and Big Data aren’t the only reasons to change how you think about data, however. The Internet of Things, artificial intelligence and semantic technologies will all push IT and business users alike out of our small, stocked ponds and into the uncharted data depths.
For an idea of just how much our relationship with data is changing, you might want to check out a second Information Management article, “Narrative Science Applies AI to Suspicious Activity Reports.”
I don’t generally like to quote the same publication twice in the same post, but this example is too good to pass up. The article is about Narrative Science’s Quill, which uses artificial intelligence to analyze numbers and turn them into a natural language report.
In other words, not only are we exploring data in new ways; now the technology is also finding and extracting meaning from that data — without us. The article includes this quote from Narrative Science CTO Kristian Hammond:
“I believe within my lifetime I’ll see a day when people will look at spreadsheets the same way we look at computer punch cards - as a mechanism for communicating with a machine and letting it communicate with us. That mechanism and that time are gone. By humanizing the machine, giving it voice, we can rid the world of an awkward and painful mechanism.”
Hammond anticipates that the technology will be used to transform both data and report writing in banking, insurance and, notably, journalism.
It’ll be interesting to see whether people can keep up with this Brave New World.