
Definitions: Unstructured Data Integration


Created on: Jan 27, 2009 11:40 AM by Loraine Lawson - Last Modified:  Mar 30, 2009 10:48 AM by Loraine Lawson

Definition

Unstructured data refers to data stored as text or rich media (bitmap) objects.

 

The opposite of unstructured data is structured data. More recently, some analysts have begun to identify a third type, semi-structured data, which includes more formal Word documents, spreadsheets and other office suite documents.

 

Business applications and concerns

By a commonly cited estimate, about 80 percent of all enterprise information is stored as unstructured data. Given that e-mail, Web logs, call center records, Word documents and spreadsheets are all “unstructured data,” it's easy to see how executives and managers would benefit from being able to reliably access and query this information.

 

A 2008 report by the Aberdeen Group showed that best-in-class companies that integrate unstructured data reported:

  1. Better response time to customer demand.
  2. Improved employee productivity.
  3. Reduced risks of harmful events.
  4. Better insight into customers than their counterparts.

 

Best-in-class companies also reported that reducing risks by preventing harmful events and increasing employee productivity were the top drivers for pursuing integration of unstructured data.

 

In recent years, regulatory compliance and data-security issues have forced many companies to act on the problem of unstructured data.

 

The big challenge with unstructured data is to integrate it with more formal, structured data. For instance, very little unstructured data can be accessed by existing business intelligence tools. If BI tools could draw from both types of data, leaders would gain better insight into the business.

 

Deployment Options

There is a range of options for finding, storing and accessing unstructured data. Enterprise search tools, enterprise content-management systems, text mining and analytic tools, and intranets are among the solutions companies use to organize unstructured data.

 

BPM tools have also been used to “bridge the gap” between structured and unstructured data. Geoffrey Weglarz, a veteran of relational database technologies, multidimensional database technologies and linguistics, pointed out three specific situations where BPM had been used to marry unstructured data with structured data in a 2004 DM Review article.

 

In the past two years, text analytics tools have entered the data-integration market. Philip Russom, an analyst for The Data Warehousing Institute, explained in an IT Business Edge interview that these solutions can analyze natural language and mine it for data that can be imported into database records. Pure-play vendors include Attensity, ClearForest and Clarabridge. Some search tools, including Inxight, FAST and Endeca, also include text analytic capabilities.
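As a rough illustration of mining free text for data that can be imported into database records, the sketch below pulls a customer number, a date and a topic out of call-center notes and loads them into a table. It is a minimal sketch only: the sample notes, the regular expressions, the keyword-to-topic map and the `calls` table are all assumptions made for this example, and commercial text analytics tools rely on far richer linguistic analysis than keyword matching.

```python
import re
import sqlite3

# Hypothetical call-center notes, invented for illustration only.
NOTES = [
    "Customer 1042 called on 2009-01-12 about a billing error.",
    "2009-01-13: customer 2210 reported a late delivery.",
    "General voicemail, no details left.",
]

CUSTOMER_RE = re.compile(r"customer\s+(\d+)", re.IGNORECASE)
DATE_RE = re.compile(r"\b(\d{4}-\d{2}-\d{2})\b")
# Assumed keyword-to-topic map; real tools use linguistic models instead.
TOPICS = {"billing": "billing", "delivery": "shipping"}


def extract_record(note):
    """Return (customer_id, call_date, topic), or None if no customer is named."""
    customer = CUSTOMER_RE.search(note)
    if customer is None:
        return None
    date = DATE_RE.search(note)
    lowered = note.lower()
    topic = next((label for word, label in TOPICS.items() if word in lowered), "other")
    return int(customer.group(1)), date.group(1) if date else None, topic


def load_notes(notes):
    """Extract records from the notes and insert them into an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE calls (customer_id INTEGER, call_date TEXT, topic TEXT)")
    records = [r for r in map(extract_record, notes) if r is not None]
    conn.executemany("INSERT INTO calls VALUES (?, ?, ?)", records)
    conn.commit()
    return conn


conn = load_notes(NOTES)
rows = conn.execute(
    "SELECT customer_id, call_date, topic FROM calls ORDER BY customer_id"
).fetchall()
print(rows)
```

Once the notes are rows in a table, they can be queried alongside other structured data with ordinary SQL. Notes that name no customer are skipped here rather than guessed at.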

 

Colin White, the founder of BI Research, wrote in 2008 that the three main tools for integrating structured data (data federation, data consolidation and data propagation) could also be applied to unstructured data. Unstructured data would require an additional step of transforming the necessary business information into a semi-structured format, such as XML, or a structured format. He explained the challenges to this approach and outlined possible solutions in a bEye Network article.
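The additional transformation step can be pictured with a small example. The sketch below, which is an illustration and not White's method, wraps the header fields and body of a plain-text e-mail in XML elements using Python's standard library. The sample message and the element names (`message`, `from`, `to`, `subject`, `body`) are assumptions chosen for this example.

```python
import xml.etree.ElementTree as ET
from email import message_from_string

# Hypothetical message, invented for illustration only.
RAW_EMAIL = """\
From: pat@example.com
To: support@example.com
Subject: Order 5531 arrived damaged

The box was crushed and the monitor screen is cracked.
"""


def email_to_xml(raw):
    """Wrap the header fields and body of a plain-text e-mail in XML elements."""
    msg = message_from_string(raw)
    root = ET.Element("message")
    for header in ("From", "To", "Subject"):
        # Element names are an assumed convention, not a standard schema.
        ET.SubElement(root, header.lower()).text = msg[header]
    ET.SubElement(root, "body").text = msg.get_payload().strip()
    return root


doc = email_to_xml(RAW_EMAIL)
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

The body remains free text, but the sender, recipient and subject become addressable fields that a federation, consolidation or propagation tool could read like any other XML source.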

 

Emerging Solutions

There is also an emerging discipline, information management, devoted to the problem of integrating structured and unstructured data. A Computer Weekly article examined this emerging field, as well as existing integration options and solutions on the horizon.

 

Another emerging option is the use of semantic technologies to integrate unstructured data.

 
