Data Integration Is Borg and Other New Rules

Loraine Lawson

When I first started writing about integration, I found it surprisingly difficult to define. Other tech topics are much more cut and dried. For example, when I covered security, it was pretty clear what that entailed: viruses, security sign-ons, firewalls, encryption. All of these things are fairly straightforward. There's no using virus detection as encryption, no confusing firewalls with anything else.


Integration wasn't, and isn't, so obvious. I struggled to understand the difference between data integration, aka "DI," and application integration if both involve data, why EAI was out of vogue, and whether an ESB is really for integration. And at the time there was also the whole question of what the heck SOA was and what it meant for integration.


Over the years, I've still wondered whether I push the bounds of what's correctly considered integration when I write about data quality, data federation, deduplication and so on. All this time, I figured I was the problem, but it turns out I'm not the only one who has found it difficult to pinpoint DI's boundaries.


A new TDWI report, "Next Generation Data Integration," points out that the data integration market has been in a state of expansion for a while now:

Data integration has evolved and grown so fast and furiously in the last 10 years that it has transcended ancient definitions. Getting a grip on a modern definition of DI is difficult, because 'data integration' has become an umbrella term and a broad concept that encompasses many things.

As you can probably guess from the title, this report is TDWI's bold attempt to redefine data integration in light of the new technologies, disciplines and issues it encompasses. The report identifies 10 "rules" (really, they're more like characteristics) that define next-generation data integration:


  1. "DI is a family of techniques." This means you need to stop thinking of DI as simply ETL or data replication.
  2. DI encompasses hand-coded solutions, vendor tools and a combo of both. It's interesting to note, though, that more companies are shifting from hand coding as the standard to using a data integration tool and tweaking it with hand coding. That's a major shift, since vendors have been telling me for years that their main competition isn't each other, but in-house or outsourced hand coding.
  3. DI practices have escaped the data warehouse and now span both analytics and operations.
  4. "DI is an autonomous discipline." Good news for those of you who can do data integration, because what TDWI basically means here is that there's more DI-specific work, which means more demand for DI skills, plus more DI teams and competency centers.
  5. DI is Borg. Okay, so TDWI didn't use those exact words. "DI is absorbing other data management disciplines," they wrote. Among the disciplines that are falling under the DI umbrella are data quality, MDM, replication, data federation, etc.
  6. DI is "broadly collaborative." This means you've got to work with other teams, including DBAs, operations, whoever manages the message/service buses and other data workers.
  7. "DI needs diverse development methodologies." In other words, the requirements for DI development are changing and you need to adapt to be more collaborative, lean and agile.
  8. DI must work with a broader range of interfaces, including cloud, data services and SOA.
  9. Data integration must scale, because the data is growing and so are the ways it's accessed.
  10. DI requires architecture. If you don't think DI has any architecture, then you should definitely read the rest of the report, which gives you specifics about the different types of DI architecture.


Much of the 30-plus-page report focuses on a survey of 323 respondents, with vendor employees and academics excluded. It was conducted last November. What I appreciate about this survey is that it doesn't just ask surface-level questions; it drills down with related questions to give you better insight into an issue.


For instance, the survey found that most respondents don't use a large percentage of their data integration tool's functions. The average DI shop uses approximately 40 percent of the tool's functionality. But TDWI also asked about planned use three years from now and found that most plan to increase their usage to approximately 65 percent of the tool's capabilities.


Take your time with this report. There's a lot of information packed into this whitepaper, and some of it may require slow digesting, especially if you're not a DI analyst or consultant. But it's worth it, because there's a lot here that will come in handy as data moves out of the data warehouse and into the hands of business users.


If you're short on time, you can find an archived webinar on the topic by Philip Russom under TDWI's webinars.
