Why Data Integration Is Still (And Will Be) A Problem

Loraine Lawson

I admit it. I'm not always a patient person, which is kind of odd considering I have such a high pain threshold for really dull things like planning and zoning meetings and tech white papers.


But hand me something that won't work right, and I've got about a 20-minute (on a good day) limit for tinkering with it: Just ask my very patient, methodical husband.


I mention this so you have some context for what I'm about to tell you. When I first started covering integration, after about 20 posts, I became really irritated at the whole problem. Why in the world were we still dealing with this whole application/data silo issue, I wanted to know. It's not like this is a new issue. It's not like tech doesn't know about the problems a lack of integration creates. So why hasn't this been fixed?


I've received various answers over the years, but I can tell you none was as satisfying or as complete as this recent post by Robin Bloor, president of the Bloor Group and a partner with Hurwitz and Associates.


Of course, Bloor wasn't answering my question, per se, but the question inherent in his post, "10 Reasons Why Data Integration Is Hard," is close enough.


Not surprisingly, there are a lot of factors at play. Many data problems have their root in the applications themselves, he writes: They were never designed with an eye toward sharing data, they keep changing, and they don't agree on basic business fundamentals such as how to calculate sales commissions and cut-off dates. Essentially, those who develop applications leave the data to someone else, specifically you, the customer:

Bright though these software engineers are, they do not standardize on data definitions or data access. Data integration is "left as an exercise for the customer." It's an exercise the customer never completes.

That's another thing that's always bothered me: Things that don't end, like laundry, dirty dishes, and, yes, the need to exercise. As a nation, we stink at anything exercise-related, so you can see why that word works on many levels for describing the problem of data integration.


You're probably not surprised to learn applications are part of the problem, but there are a few surprises in the list. For instance, he points out that one thing we do that makes data integration hard is that we keep coming up with integration systems, like Enterprise Information Integration products, which, he notes, weren't really integration systems but a means of circumventing slow data warehouses. Master data management and complex event processing are yet more data-integration systems. We don't need that many, he writes. One will do.


I admit I was also surprised to see what he had to say about service-oriented architecture making it all worse. I knew it didn't solve data integration, but I was under the impression it simplified application integration and that the residual impact would be easier integration across the board. Not so, says Bloor:

It (SOA) helped integrate applications at the business process level and it enabled the reuse of software more effectively than had been possible before. However, all of that application integration simply made matters worse at the data level. SOA never helped much with the integration of data. It just made the data anomalies that existed more visible and more troublesome.

He also discusses the problems of data warehouses, why data is so gosh-darn dirty, and the problems and potential of metadata. I loved this post and heartily recommend it, particularly for those among your business colleagues who share my impatience.


In light of Bloor's recent post, it'd be fun to attend tomorrow's (Oct. 12) free DM Radio briefing on the future of the data integration space, hosted by Bloor and featuring consultant Rick Sherman. Sherman is best known as the Data Doghouse blogger. He'll be discussing the sharing of metadata, which Bloor identified as one of the missing components of long-term data-integration success.


The webcast will also include Talend, an open source company with data-integration and MDM solutions. The event starts at 3 p.m. ET.

Oct 13, 2010 2:12 PM Bob Potter says:


I have read Robin's post and think that it is an excellent piece. I agree with all 10 of his reasons why data integration is hard. I do think there is an 11th reason and I believe it is the most frustrating one of all:

In my opinion, no vendor besides my company has truly simplified data mappings or made them reusable. Why is it that every project requires the same columns, fields and files to be remapped based on the physical names, schema and layout of the sources and targets? An enormous amount of time is wasted by expensive resources doing n-to-n mapping of data, when they could have used the work done by the previous project teams.

Companies need to build a canonical model of their data to speed up data warehousing and analytics projects and eliminate this tedious bottleneck once and for all.

Bob Potter, CEO expressor


