
The Need for Speed: Moving Big Data Quickly Can Challenge ETL


Written By
Loraine Lawson
Nov 21, 2012

There are a lot of mixed messages when it comes to moving Big Data with ETL processes. I’ve been told by many Big Data experts that ETL is perfectly capable of moving data into and out of Big Data solutions. But recently, I’ve run into several pieces that suggest ETL tools are slowing down the process.

For example, at least two recent pieces (including this DataMigration Pro article I shared last week) note that there’s now so much data that using ETL to migrate it takes much longer than the traditional “big bang data migration weekend” IT has relied upon in the past.

David Loshin (@davidloshin) joins those who say ETL and the whole batch approach may be outdated for some data needs. Loshin is the president of Knowledge Integrity, an information training, consulting and development firm.

“Now pervasive and right-time analytics seems to be within reach, but the batch-oriented approach is insufficient to meet today’s — let alone tomorrow’s — data integration and delivery needs,” he writes in the November issue of TechTarget’s BI Trends + Strategies. “Without addressing the challenge of data latency, data provisioning will continue to be the biggest bottleneck to increased productivity and accurate business decision making.”

He’s talking about data integration in two particular situations, mind you: business intelligence and analytics, and Big Data.

There are different ways to solve the problem, of course. He suggests high-speed data replication technologies (which I guess would include in-memory tools, another option I’ve seen discussed) and caching techniques like those used in data federation or data virtualization.

I should add that ETL vendors are not ignoring this problem, either. Informatica and Pervasive are both among the vendors that now offer high-speed ETL tools for Hadoop.

It’s a great piece that also looks at the ways data latency can cost businesses, including how it slows down development cycles for analytics applications.

The article is published as part of an e-zine (aka a PDF) and starts on page 10.

You might also want to take some time to read or skim through the article right before it, “Data Stewardship Programs Need Solid Plan, Firm Focus.” It reviews five common mistakes companies make when starting data stewardship programs, including the challenge of finding the right people.
