The Value of Integrating Unstructured Data

It seems to be an unquestioned assumption that unstructured data is valuable -- if only companies could find a way to harness its power for the corporate good.

I've always wondered about that assumption. Sure, maybe the rogue Excel spreadsheet forecasting the 10-year profit margin of widgets would be a good thing to hunt down. But there's also a lot of garbage out there that might be better off gathering electronic dust in the nether regions of a desktop PC. And I couldn't help but wonder if the time and effort it takes to get the Excel spreadsheet would be worth it. I mean, if it's really worthwhile, wouldn't people have the sense to store it where it would be found and used?

Apparently, that view was a bit optimistic, considering the findings of a recent Aberdeen Group whitepaper.

Aberdeen surveyed companies about integrating unstructured data and found that the best-in-class (the top 20 percent) reported:

  • Better response time to customer demand,
  • Improved employee productivity,
  • Reduced risks of harmful events, and
  • Better insight into customers than their counterparts

I suppose I shouldn't be surprised. After all, it's estimated that unstructured data in the form of e-mail, text documents, web logs, spreadsheets and field notes accounts for 85 percent of an organization's data.


This is, by the way, an excellent whitepaper, full of surprising and useful revelations. For instance, you might think the main drivers for companies focusing on unstructured data would be regulatory compliance or customer service. You'd be wrong. The main drivers for best-in-class companies are increasing employee productivity and reducing risk by preventing harmful events. I wonder if that plays a role in their success, since these two reasons could deliver more obvious ROI than, say, "improving customer service" or "addressing compliance and regulatory issues."


It'd be tempting to read the results of integrating unstructured data into the corporate BI systems and think, "Wow, we've got to budget for that." But Best-in-Class companies do not necessarily budget for unstructured data integration as a line item, per se. Instead, they consider it a high or top priority (80 percent), but focus on BI-related applications to give structure to this data.


On page 9 of the report, you'll find Figure 4: Strategic Actions that Best-in-Class Prioritize. This one chart answers a lot of questions about which steps should come first if you want unstructured data integration projects to succeed, because it shows how best-in-class companies prioritize their BI efforts as compared to the industry average. The one step shared by the most best-in-class companies? Defining data at its source.


The report also looks at what actions, capabilities and technologies best-in-class companies use. I noted the technologies don't differ that much among the categories, though their penetration in each class does. For instance, 12 percent of laggards use visualization tools, compared with 54 percent of best-in-class companies.


The really great thing about this report is Chapter Three, which outlines the steps you should take next based on whether you fall in the Laggard, Industry Average or Best-in-Class category.


One technology used by best-in-class companies is master data management. Although the paper doesn't give percentages, this item caught my eye because I recently interviewed Karen Leightell, senior product manager for IBM's Master Data Management Solution Group, about MDM and how it works to give companies more accurate, deeper insight into their customers, products and partners. We didn't talk specifically about unstructured data -- and MDM doesn't seem designed to address that -- but I can see how it would fit together with the priorities driving best-in-class companies.


One item the whitepaper didn't spend a lot of time on is capturing Web 2.0 data, probably because this is a new priority for best-in-class companies: 32 percent report they currently pull unstructured data from Web 2.0 technologies, but 52 percent report that they "plan to do so." (They might want to read about illumio, an unusual solution to this problem that relies on employees pushing this information.)


The 23-page whitepaper is available for free on Aberdeen's web site, though you'll have to fill out a registration form (of course -- when do you not have to fill out a form to get something "free"?).