Big Data: Think ‘Explorer’ Rather than ‘Scientist’

Written By
Loraine Lawson
Nov 27, 2012

There’s a lot of talk about the data scientist since Big Data came on the scene. But when it comes to gaining truly valuable information from Big Data, you may be better off focusing less on the science and more on exploration.

Hypotheses are a central part of science, and thus far, they have been a major part of how we use data, according to Jill Dyché, the vice president of Thought Leadership at SAS and a veteran data management expert. Business reports, for instance, are usually strongly driven by hypotheses of one sort or another, she points out in a recent Harvard Business Review column. It’s available with a free HBR blog registration.

But when it comes to Big Data, the best and most profitable findings are the result of what she calls “low-hypothesis exploration.”

Explorers usually have only a vague goal in mind: Find a new route, be the first to visit the North Pole, discover the Fountain of Youth. In many ways, this gives them the upper hand when the aim is new knowledge and discoveries.

This isn’t the first time I’ve heard that it’s better to approach Big Data as an exploration. And it makes sense when you consider that much of the data is text or weblogs, and there is so much of it that it’s very hard to predict what it’s going to tell you.

Dyché gives several examples of real Big Data discoveries that were found through exploration rather than more traditional reports. One company managed to increase its per-shopping-cart revenue 16 percent in one month, thanks to this “knowledge discovery” approach, she writes.

But what impressed me is the example of Stanford University researchers, who used this approach on breast cancer research and learned that non-cancerous cells also contribute to cancer cell growth.

Of course, this approach also yielded what I see as one of Big Data’s more questionable (thus far) findings: A team at a commercial lines insurer “found that ‘loose affiliations’ with low-income friends was an indicator [of] a higher propensity to file fraudulent claims.”

That, too, was found using an informal approach to data exploration, and it may show the weakness in this approach. While the finding may be legit (and far be it from me to question these people), it does seem to open up questions of observation bias, as well as issues about what’s actionable. It strikes me as a legally and ethically murky area when you think about using it as actionable data. (Should I automatically be subject to more audits because I have more low-income friends than you? Would that open the company up to legal action?)

For the most part, though, taking an open-minded approach to exploring Big Data has led to some very concrete, positive findings.

Here’s the interesting part, though: This may actually be the best argument I’ve seen for keeping Big Data siloed from existing systems.

“Running discovery trials on big data should be a continuous process, where the results may feed more traditional business intelligence or drive additional discovery tests,” Dyché writes. “Sometimes this means isolating big data efforts from traditional analytics programs where delivery processes and organizational roles are already entrenched.”

Check out the full piece, which explains why it’s hard for companies to take an “exploring” approach to any data and how you can change that.

For more on this topic, you might also want to download Nov. 27’s recording of “The Briefing Room.” I personally haven’t had a chance to listen to it yet, but it was promoted as a discussion on how to explore Big Data and featured veteran EMA Analyst John Myers, as well as a briefing on Big Data analytics vendor Alteryx.
