Better Data Integration with Hadoop? It’s Possible

Written By
Loraine Lawson
Feb 7, 2013

More organizations are using Hadoop not just to process large datasets, but as a replacement for the transformation engines in ETL.

But is Hadoop capable of being a data integration platform, complete with data quality functions?

Gartner analyst Ted Friedman (@Ted_Friedman) thinks not. Friedman recently wrote a research paper, “Hadoop is Not a Data Integration Solution,” on the topic. The description sums up his point:

“As use of the Hadoop stack continues to grow, organizations are asking if it is a suitable solution for data integration. Today, the answer is no. Not only are many key data integration capabilities immature or missing from the stack, but many have not been addressed in current projects.”

I haven’t read the paper, because I’m not a client and it’s $195, but Todd Goldman has. Goldman is vice president and general manager for Enterprise Data Integration at Informatica. He wrote a response to the paper.

He says many companies are turning Hadoop into a data integration platform.

“Gartner is correct in that, Hadoop, by itself, is NOT a data integration platform,” Goldman writes. “However, it can be made into a data integration platform. Lots of companies are investing in making Hadoop based integration easier.”

Informatica did this by porting its Virtual Data Machine onto Hadoop, he adds, giving companies the same integration development environment they use for ETL jobs, with Hadoop as the underlying engine.

Informatica is not the only vendor investing in full data integration capabilities for Hadoop.

“The market in general is moving in this direction so expect to see some exciting capabilities emerging over the next six months,” he states, adding that some companies are already using a kind of graphical development environment with Hadoop instead of hand-coding MapReduce jobs. Not surprisingly, they’re able to create code five times faster that way, he says.

Hadoop has already made it possible to run more complex transformations in substantially less time than traditional ETL tools. Some companies are even running sophisticated integration jobs, he adds, without hiring expensive data scientists or MapReduce specialists.

If you’d like to read more about Big Data integration, check out this Big Data integration piece by Richard Daley, industry veteran and co-founder of Pentaho. Daley looks at all the tools in the Hadoop stack and discusses supporting integration for other NoSQL solutions, such as MongoDB, Cassandra and HBase.
