The tricky part about data is learning to accept what it says without imposing your own agenda.
It seems Big Data is no exception, at least when it focuses on traditional, structured data, according to a Harvard Business Review Blog post written by Prof. Theos Evgeniou, Assoc. Prof. Vibha Gaba and consultant/visiting professor Joerg Niessing of the international business school INSEAD.
“A large body of research shows that decision-makers selectively use data for self-enhancement or to confirm their beliefs or simply to pursue personal goals not necessarily congruent with organizational ones,” they write. “Not surprisingly, any interpretation of the data becomes as much an evaluation of oneself as much as of the data.”
You may be able to avoid this pitfall by worrying less about the volume of your data, and focusing on growing the variety of your data, they argue.
“Big is old – retailers and financial institutions have had big data for decades. But Diversity is new,” they say.
The business reason for adding diversity is simple: It illuminates links in behavior that traditional data simply can’t reveal.
Marketing is already proving this by linking data from in-store loyalty programs to data from public websites, such as car or movie sites (basically, anything with a cookie). Toss in social media data and some in-depth research, and the combination can reveal more about customers than volumes of traditional customer data.
For example:
“A leading Telco company we have worked with was able to increase market share by more than 20 percent in some countries without increasing the marketing budget by leveraging behavioural and transactional data from social and general media.”
Professors aren’t the only ones who have realized that focusing too much on volume can restrict what you’re able to achieve with Big Data.
Yves de Montcheuil, Talend’s vice president of marketing, identifies focusing on volume as one of the five major pitfalls of Big Data.
“When dealing with big data management, forget volume,” de Montcheuil writes. “No matter the quantity, it is important to go after the ‘right’ data and identify all the sources that are relevant.”
Beyond the typical social media and SaaS data, de Montcheuil suggests you focus on so-called ‘dark data.’ He identifies two types:
- “Exhaust data,” which includes data from sensors and logs that’s usually purged rather than stored.
- Public data, which includes social media and open data.
As you think through Big Data, you’re probably going to begin with what you know: relational data. From the start, though, you can plan to move toward data diversity and Big Data Management maturity, suggests an October TDWI Best Practices Report, “Managing Big Data.”
“You have to start somewhere, so start with relational data, then move on to other structured data, such as log files that have a recurring record structure,” the report advises. “Carefully select a beachhead for unstructured data, such as text analytics applied to call center text in support of sentiment analysis.”
Next, add in semi-structured data that may be mission-critical, such as procurement and other B2B transaction data in the form of XML documents.
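To make that progression concrete, here is a minimal sketch of how a log file with a recurring record structure and an XML transaction document might each be flattened into simple rows that sit alongside relational data. The log layout, the XML element and attribute names, and the function names are all hypothetical, chosen only for illustration; neither the TDWI report nor the other sources cited here prescribe any particular format or tooling.

```python
import re
import xml.etree.ElementTree as ET

# Assumed (hypothetical) log layout: "<date> <time> <LEVEL> <message>"
LOG_PATTERN = re.compile(
    r"^(?P<date>\S+) (?P<time>\S+) (?P<level>[A-Z]+) (?P<message>.*)$"
)


def parse_log_line(line):
    """Turn one log line into a dict (a 'row'), or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def parse_purchase_order(xml_text):
    """Flatten a hypothetical XML purchase order into one row per line item."""
    root = ET.fromstring(xml_text)
    order_id = root.get("id")
    rows = []
    for item in root.findall("item"):
        rows.append(
            {
                "order_id": order_id,
                "sku": item.get("sku"),
                "quantity": int(item.get("quantity")),
            }
        )
    return rows


# Made-up sample inputs, for illustration only.
log_row = parse_log_line("2013-10-01 12:00:05 ERROR payment gateway timeout")
order_rows = parse_purchase_order(
    '<order id="PO-1">'
    '<item sku="A100" quantity="2"/>'
    '<item sku="B200" quantity="5"/>'
    "</order>"
)

print(log_row)
print(order_rows)
```

Once both sources are reduced to rows like these, they can be loaded next to existing relational tables, which is one plausible way to take the first steps from relational data toward more diverse sources.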
“Diversity, if managed well, yields divergent thinking and the pooling of a broader base of knowledge results often in better strategic choices,” write Evgeniou, Gaba and Niessing. “The point we stress here is that diverse data confers similar benefits.”