One of the challenges with Big Data is how to find value hidden in all that volume. Experts generally recommend approaching it as an explorer rather than simply querying the data to find specific answers.
As an astrophysicist, Dr. Kirk Borne knows a thing or two about probing the unknown. Borne, professor of Astrophysics and Computational Science at George Mason University, began working with large data sets for his scientific research, but soon became an advocate for Big Data more broadly. Now, in addition to his work as a professor and astrophysicist, Borne is a transdisciplinary data scientist.
In a May post for the MapR blog, Borne identified four major types of Big Data discovery, a process he terms “data-to-discovery”:
- Correlation discovery, which is uncovering hidden patterns and trends in the data
- Novelty discovery, which is finding anomalies, outliers and other surprises in the data
- Association discovery, which is finding the “unusual, improbable co-occurring features or products in the data set”
- Class discovery, which is finding new categories and classes of items, events or behaviors
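To make the first three discovery types concrete, here is a minimal sketch in Python. It is not taken from Borne's post: the function names, the toy data, the z-score threshold and the choice of Pearson correlation and lift as measures are all assumptions made for illustration. Class discovery, which would typically involve clustering, is left out to keep the example short.

```python
from statistics import mean, pstdev

def pearson(xs, ys):
    """Correlation discovery: Pearson correlation between two series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def outliers(values, threshold=2.0):
    """Novelty discovery: values more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]

def lift(baskets, a, b):
    """Association discovery: how much more often a and b co-occur than chance predicts."""
    n = len(baskets)
    p_a = sum(1 for basket in baskets if a in basket) / n
    p_b = sum(1 for basket in baskets if b in basket) / n
    p_ab = sum(1 for basket in baskets if a in basket and b in basket) / n
    return p_ab / (p_a * p_b)

# Hypothetical toy data, invented for this sketch.
ad_spend = [10, 20, 30, 40, 50]
sales = [12, 24, 33, 46, 55]
daily_orders = [100, 102, 98, 101, 99, 103, 97, 250]
baskets = [{"diapers", "beer"}, {"diapers", "beer"}, {"milk"}, {"milk", "bread"}]

print(pearson(ad_spend, sales))           # close to 1: a strong positive correlation
print(outliers(daily_orders))             # the one day that stands out
print(lift(baskets, "diapers", "beer"))   # above 1: the items co-occur more than chance
```

A lift above 1 suggests two items appear together more often than independent purchases would predict, which is one common way to flag the co-occurrences the third bullet describes.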
In this month’s Inside Analysis column, Borne goes one step further by explaining what you need to do to reach these “data-to-discovery” moments.
Regardless of your goal, your first step should always be to extract actionable features from the data. “Actionable features” encompass the data content as expressed through patterns; the data context, including sources, users, channels and metadata; and third-party information, which might include parameters and features from other data sources.
In other words, you need to fully understand the content, the full external and internal context, and any hidden patterns that might be lurking in the data.
You can extract those features in multiple ways, including drawing on features generated by business analysts and through crowdsourcing. Borne’s main point, though, is that you can’t achieve tangible, actionable knowledge from your Big Data without first identifying these “actionable features.”
“After these actionable features are created, collected and curated, then the business of discovery, decision making and value creation through Big Data analytics can accelerate,” he said. “The resulting synergy of these activities leads to improved training sets, more accurate predictive models, fewer false positives and negatives, and more efficient and effective human interactions (with your users, clients and customers).”
Borne’s Inside Analysis column also offers insight into how Big Data is changing the CMO’s job and creating two new CXO positions: chief data officer and chief data scientist.