Definitions: ETL

Created on: Jan 27, 2009 11:02 AM by Loraine Lawson - Last Modified:  Mar 30, 2009 12:21 PM by Loraine Lawson

Definition

ETL is short for extract, transform and load. An ETL process extracts data from one or more data sources, transforms the data from its previous form into a form usable by the target database or data warehouse, and finally loads the data into that target location.
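
To make the three steps concrete, here is a minimal sketch in Python using only the standard library. The sample CSV data, the `sales` table and the function names are assumptions made for illustration, and an in-memory SQLite database stands in for the target warehouse; a production ETL tool does far more than this.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real job would read from files or databases.
SOURCE_CSV = """customer,amount,currency
 alice ,10.50,usd
BOB,3,usd
"""

def extract(text):
    """Extract: read raw rows from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: reshape rows into the form the target table expects."""
    return [
        (
            row["customer"].strip().title(),    # tidy up customer names
            round(float(row["amount"]) * 100),  # store amounts as integer cents
            row["currency"].upper(),            # normalize currency codes
        )
        for row in rows
    ]

def load(records, conn):
    """Load: write the transformed records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer TEXT, amount_cents INTEGER, currency TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()

def run_etl(text, conn):
    """Run the three steps in order."""
    load(transform(extract(text)), conn)

conn = sqlite3.connect(":memory:")  # stand-in for the target warehouse
run_etl(SOURCE_CSV, conn)
```

Keeping each step in its own function makes it possible to test or replace one step without touching the others.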

 

Business Applications

Traditionally, ETL is widely known as a tool for moving data from multiple databases to a data warehouse. However, in 2008, Bloor Research Director Philip Howard cited research showing there are four use cases for ETL tools:

 

  1. Data migrations and conversions, which, with data warehousing, compose two-thirds of all ETL projects.
  2. Application synchronizations, such as moving data from an ERP to a CRM system.
  3. Business-to-business exchanges for converting SWIFT, HIPAA and other messages.
  4. Providing data services for service-oriented architectures.

 

In addition, ETL tools can generally also perform data cleansing. ETL is typically a tactical deployment, but it can be used strategically, as evidenced by a case study of a solution developed by IPS-Sendero, a software-development and professional-services company that focuses on corporate performance management.
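
As a rough illustration of what data cleansing can involve, the sketch below trims whitespace, normalizes e-mail addresses, rejects rows that are missing required fields and drops duplicates. The field names (`customer_id`, `email`) and the rules themselves are assumptions chosen for the example, not the behavior of any particular ETL product.

```python
def cleanse(rows, required=("customer_id", "email")):
    """Return (clean_rows, rejected_rows) after basic cleansing."""
    clean, rejected, seen = [], [], set()
    for row in rows:
        # Normalize: trim whitespace and lowercase e-mail addresses.
        fixed = {key: (value or "").strip() for key, value in row.items()}
        fixed["email"] = fixed.get("email", "").lower()
        # Reject rows that are missing a required field.
        if any(not fixed.get(field) for field in required):
            rejected.append(row)
            continue
        # Drop duplicates on the (assumed) business key.
        if fixed["customer_id"] in seen:
            continue
        seen.add(fixed["customer_id"])
        clean.append(fixed)
    return clean, rejected

rows = [
    {"customer_id": "1", "email": " Ann@Example.com "},
    {"customer_id": "1", "email": "ann@example.com"},  # duplicate of the first
    {"customer_id": "2", "email": ""},                 # missing e-mail
]
clean, rejected = cleanse(rows)
```

Returning the rejected rows separately, rather than silently discarding them, lets someone review and correct them at the source.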

 

Deployment Considerations

Ralph Kimball, founder of the Kimball Group and author of "The Data Warehouse ETL Toolkit," noted that while ETL stands for three steps, the best practice for ETL systems in most data warehouses actually requires 34 subsystems, which he categorizes into four major components: extracting, cleaning and conforming, delivering, and managing.

 

The point is, ETL is not as simple as it may seem. While you can buy ETL tools, expect to spend some time addressing issues of data quality.

 

Also, ETL solutions may sound similar on paper, but in practice, they perform differently, so it's advisable to identify your technical criteria and test products against these before you invest. Here are some questions to consider:

 

  1. Do you need support for Web services?
  2. Do you need an XML-based tool?
  3. How scalable must the tool be?
  4. Will you need to repurpose the tool within the organization? If so, what is the cost per project?
  5. Will you be embedding the ETL engine and distributing it?
  6. Can you do a trial run? Most ETL tools are too complex for a proof-of-concept, but some companies do offer short-term licenses for single projects.
  7. How does the tool perform? Does it run the required transactions at the speed you need?
  8. What will the total cost of ownership be?
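
For the performance question in the list above, one simple way to compare candidates during a trial is to time a representative transformation over sample data and compare the result with the rate you need. The sketch below is only an outline of that idea: `sample_transform`, the generated rows and the `required_rate` threshold are hypothetical placeholders for your own workload and requirements.

```python
import time

def measure_throughput(transform, rows, repeats=3):
    """Return the best observed rows-per-second for `transform` over `rows`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for row in rows:
            transform(row)
        best = min(best, time.perf_counter() - start)
    return len(rows) / best if best > 0 else float("inf")

def sample_transform(row):
    """Hypothetical stand-in for the transformation under test."""
    return {"id": row["id"], "name": row["name"].upper()}

sample_rows = [{"id": i, "name": f"customer-{i}"} for i in range(10_000)]
rate = measure_throughput(sample_transform, sample_rows)

required_rate = 1_000  # hypothetical requirement, in rows per second
meets_requirement = rate >= required_rate
```

Taking the best of several runs reduces the effect of one-off delays on the machine running the test. Results from a small sample like this are only indicative, so test with data volumes close to production where possible.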

 

Emerging Changes to ETL


While ETL remains a standby, some companies are replacing ETL with newer, alternative integration tools. Pfizer Global Research and Development deployed data-integration middleware to eliminate ETL projects. As a result, research and development teams were able to gain access to data within a week, rather than the three- to four-month timeline required by IT to run new ETL jobs.

 

For the most part, however, traditional ETL vendors face competition from next-generation ETL tools, such as expressor, which uses a semantic metadata repository. These new competitors, and the established vendors' responses, are explored in an Enterprise Systems article.

 

Some predict master data management might also emerge as a competing solution to ETL.

 

Guest Julianna DeLua  says:

A great point on how ETL has evolved over time.  We are deploying the expanded ETL capabilities, or complete data integration solutions to manage data warehouses. Data quality management and real-time data integration are becoming mandatory, instead of being nice-to-have. We also see organizations leveraging ETL or data integration solutions to ensure the accuracy and consistency of master data in the MDM stack.  Extending the data warehouse with industry standards such as SWIFT or HIPAA is not only a technologically sound decision, but also a business-savvy way of leveraging the existing environment.  To find out more, visit us at http://www.informatica.com/solutions/enterprise_data_warehouse/Pages/enterprise_data_warehouse_solution.aspx

 

 

 
