Survey Looks at Open Source Data Integration Tools

Open source data integration tools are more widely adopted than open source business intelligence (BI) tools, even though open source BI has been around longer, according to a recent report published by B-eye Network.


The report, "Open Source Solutions: Managing, Analyzing and Delivering Business Information," is based on research by Third Nature, which combined a survey with interviews of IT professionals and consultants. More than 1,000 people filled out the survey, most of them based in North America and Europe.


If you're considering open source solutions for databases, business intelligence, analytics or data integration, you should definitely check out this free report. You'll find an assessment of how and where companies are deploying open source, as well as a look at the most common problems and hiccups they've encountered.


As it turns out, data integration solutions are surprisingly common and mature, despite the fact that many of these solutions are "younger" than, say, open source BI. Apparently, the fact they're more widely used has led to a quicker maturing of the tools, according to the report.


While respondents said data integration products "could support all the basics needed in data integration projects," there are weaknesses in open source data integration solutions, the report notes, including:

  • Administration
  • Team support
  • Ability to handle advanced integration functions, such as data quality when dealing with semi-structured data


There's also a lack of, shall we say, complexity among open source solutions. Most serve a single purpose, "like the early ETL tools," the report notes. The exception, once again, is Talend, which offers data quality and will soon offer an MDM solution, all of which may explain why Talend is the first and only open source offering to rank in Gartner's Data Integration Magic Quadrant.


The report also included this fun fact: It seems one reason open source data integration tools are popular is that they're used for operational data integration.


Operational integration projects have been a growing trend for some time, often competing with analytical projects for data integration resources, such as staff and tools. But it's easier to justify the expense of data integration with analytical projects, such as data warehousing, where it can constitute 80 percent of the budget and timeline, the report notes. With operational data integration, like that required for business intelligence, it's traditionally been more common to use hand-coding -- even though experts have long said hand-coding is often unnecessary and can increase costs. Open source tools seem to be changing this dynamic, according to the report:

"Hand coding is common in application projects because data integration is thought of in terms of application glue. In BI projects, hand coding is most often a way to save money on the high cost of enterprise ETL products. Community open source data integration tools can provide the cost advantages of hand coding with the productivity advantages of traditional data integration software."

Talend is the commercially supported product most likely to be used for operational data integration. Other popular open source data integration solutions, all commercially supported, are Pentaho DI/Kettle and Jitterbit.


This report is a great read for understanding how open source data and BI solutions are being adopted, and it also includes a short list of recommendations for those considering open source. I also found the discussion on the difference between "community" and "enterprise" editions helpful in avoiding or at least understanding one of the common gotchas of open source -- finding you have to pay if you want to unlock the more useful features.


You will need to sign up for a free membership to download the report.


For more reading on the growth of open source in the enterprise, particularly up the stack, you might want to check out Mike Vizard's recent post, "The New Economics of Open Source in the Enterprise."