How Data Virtualization Supports Integration

Loraine Lawson

For some time now, I've followed virtualization, particularly as it pertains to data. I could see how it would create a need for data integration, but I wasn't sure how it could be used as a data integration tool.

Apparently, I'm not the only one who needed help connecting the dots. This month, I've found two resources that explain how data virtualization helps with integration.


The first is a TDWI-published paper that explains exactly how data federation (aka data virtualization) is an important tool for data integration teams. It's written by the director of TDWI Research, Wayne Eckerson, and sponsored by Composite Software.


I know some of you have worked in integration since Noah came over on the ark; honestly, this paper isn't for you. But for those who hail from the business side or who are new to data integration, this paper is an excellent resource for understanding what data virtualization is and is not, as well as its technology heritage and the potential business use cases.


I found the history particularly enlightening. Data virtualization is positioned as a relatively new development, but data federation reaches back to the early 1990s, with the virtual data warehouse, according to Eckerson. By the early part of this decade, the technology was coupled with more robust computing resources, marketed for general-purpose data integration and labeled "Enterprise Information Integration." My fellow ITBE blogger, Arthur Cole, pointed out this fact last year when he posted an overview of data virtualization vendors.


These days, you'll find this technology marketed as data virtualization, data services or distributed query solutions, Eckerson writes. Of course, it's not just the name that's changed. The tools have broadened their capabilities, Eckerson continues:


"They are used in a variety of situations that require unified access to data in multiple systems via high-performance distributed queries, such as data warehousing, reporting, dashboards, mashups, portals, master data management, data services in a service-oriented architecture (SOA), post-acquisition systems integration, and cloud computing."


The checklist first explains what data federation is and why vendors sometimes call it "data virtualization" instead:


"When users submit a query, data federation software calculates behind the scenes the optimal way to fetch and join the remote data and return the result. Its ability to shield users and application developers from the complexities of distributed SQL query calls and back-end data sources is why some vendors call this technology 'data virtualization' software."
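To make that description a little more concrete, here is a minimal sketch of the idea in Python. It is not how Composite or any other vendor's product works; the two in-memory SQLite databases, the table layouts and the function name order_totals_by_region are all hypothetical stand-ins. The point is only that the caller asks one question and the federation layer decides what to fetch from each source and how to join the results.

```python
import sqlite3


def make_source(schema_sql, insert_sql, rows):
    """Create an in-memory SQLite database standing in for one remote system."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema_sql)
    conn.executemany(insert_sql, rows)
    conn.commit()
    return conn


# Hypothetical source 1: a CRM system that owns customer records.
crm = make_source(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT)",
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Acme", "EU"), (2, "Globex", "US"), (3, "Initech", "EU")],
)

# Hypothetical source 2: a billing system that owns order records.
billing = make_source(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)",
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, 250.0), (11, 2, 90.0), (12, 1, 40.0), (13, 3, 75.0)],
)


def order_totals_by_region(region):
    """Answer one logical query by fetching from both sources and joining."""
    # Step 1: push the region filter down to the CRM source.
    customers = dict(
        crm.execute(
            "SELECT id, name FROM customers WHERE region = ?", (region,)
        ).fetchall()
    )
    if not customers:
        return []

    # Step 2: ask billing only for the matching customers, aggregated at the source.
    placeholders = ",".join("?" for _ in customers)
    totals = dict(
        billing.execute(
            "SELECT customer_id, SUM(total) FROM orders "
            f"WHERE customer_id IN ({placeholders}) GROUP BY customer_id",
            list(customers),
        ).fetchall()
    )

    # Step 3: join the two partial results inside the federation layer.
    return sorted((name, totals.get(cid, 0.0)) for cid, name in customers.items())


print(order_totals_by_region("EU"))
```

The caller never sees that customers and orders live in different systems, which is the "shielding" the checklist describes. A real product would presumably choose the fetch and join strategy with a cost-based optimizer rather than the fixed three steps assumed here.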


The second resource is an ebizQ webcast, available for replay, featuring blogger and consultant David Linthicum and Bradley Wright, senior marketing manager for data services at Progress DataDirect. The discussion focuses on the business value of data virtualization, but as part of that, both Linthicum and Wright explain how virtualization supports integration.


I particularly liked Wright's definition of data virtualization as a "data consumption approach that integrates and transforms data from multiple data sources into a logical or virtual business-friendly data model that really hides the details of the physical sources and the data in those physical sources from the consumers and are also accessed on demand through some particular API by those data consumers."


Wright also provides concrete examples of how several client companies have used virtualization, including a health care insurer whose call center representatives couldn't access information on a member without switching between multiple applications. The insurer didn't want to migrate the data out of the existing systems, but thanks to data virtualization, it was able to present an integrated, single view of the member without physically moving the data.
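The webcast doesn't go into how the insurer built its single view, so the following Python sketch is purely illustrative. The ENROLLMENT and CLAIMS dictionaries, the MemberView class and the get_member_view function are all made-up names standing in for the existing systems and the virtual model. What it tries to show is the "on demand" part: the view is assembled from the sources each time it is requested, and nothing is copied into a new store.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class MemberView:
    """The business-friendly model a call center screen would consume."""
    member_id: str
    name: str
    plan: str
    open_claims: List[str]


# Hypothetical stand-ins for two existing systems that keep their own data.
ENROLLMENT: Dict[str, dict] = {"M-100": {"name": "Pat Lee", "plan": "Gold PPO"}}
CLAIMS: Dict[str, List[dict]] = {
    "M-100": [
        {"claim": "C-1", "status": "open"},
        {"claim": "C-2", "status": "paid"},
    ]
}


def fetch_enrollment(member_id: str) -> Optional[dict]:
    """Look the member up in the (hypothetical) enrollment system."""
    return ENROLLMENT.get(member_id)


def fetch_claims(member_id: str) -> List[dict]:
    """Look the member's claims up in the (hypothetical) claims system."""
    return CLAIMS.get(member_id, [])


def get_member_view(
    member_id: str,
    enrollment_source: Callable[[str], Optional[dict]] = fetch_enrollment,
    claims_source: Callable[[str], List[dict]] = fetch_claims,
) -> Optional[MemberView]:
    """Build the single view at request time; no data is persisted here."""
    record = enrollment_source(member_id)
    if record is None:
        return None
    open_claims = [
        c["claim"] for c in claims_source(member_id) if c["status"] == "open"
    ]
    return MemberView(member_id, record["name"], record["plan"], open_claims)


print(get_member_view("M-100"))
```

Because the view is rebuilt on every call, a change in either source system shows up the next time a representative pulls up the member, with no migration or synchronization job in between.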


If you'd like to learn even more about virtualization, there are a lot of great resources here on IT Business Edge, including these free book excerpts:


Virtualization for Dummies, which looks at the business case for virtualization
Practical Virtualization Solutions, which offers a vendor-neutral look at virtualization technologies


Special thanks to Information Management for first referencing this checklist's data integration coverage.

Nov 24, 2009 3:31 PM Robert Eve says:

Loraine -

Wayne Eckerson and TDWI did the industry a good service with their TDWI Checklist on Data Federation.  Not only is the format clear and concise, it also updates the state of the art by describing what is possible using today's data virtualization middleware within modern IT infrastructures.

You are wise to point out the importance of specific, concrete customer use cases as a way to further this understanding. 

If readers are interested in real examples live at large enterprises today, look to http://www.compositesw.com/index.php/solutions/ where over twenty common use patterns are described, each with one or two customer examples specifically referenced.

- Bob

Feb 11, 2011 2:27 PM Steven Yaskin says:


I've been following your blog with great interest for some time. I find it very informative and educational for a lot of CIOs and the information management crowd. I actually include links to your posts in my own blogs and even in the face-to-face customer meetings I have daily. As in any emerging technology field, sharing information about data virtualization is the most valuable service we can all provide in order to introduce its benefits.

I am happy to share our real-world customer experiences around multiple implementations of the Queplix Virtual Data Manager platform. We are seeing great success in adapting our object-oriented approach to data virtualization and providing information access continuity from simple integration all the way to virtual master data management. Since our product spans all these industries and applies data virtualization to address a multitude of business problems in each, we have ultimate knowledge of how advanced data virtualization is growing and maturing. We are seeing a convergence in the scope of data management tasks we are solving now for large Fortune 500 organizations and smaller companies. With advanced data virtualization, a 100-person company can now extract real dollars from implementing a data quality solution in the Queplix VMDM repository, while a Fortune 500 company finds real sense in implementing social Blades like LinkedIn to enrich its corporate data.

It's all coming together now thanks to advanced data virtualization. Departmental lines and political boundaries start to blur. Information starts to flow seamlessly and securely throughout large and small infrastructures. Still, Queplix's philosophy is not to boil the ocean, as traditional ETL and early data virtualization products try to do with a limited amount of success. We virtualize one application at a time, making full use of our persistence Data Virtualization nature.

Every day we see the advent of blazingly fast, no-SQL and minimum-source-disruption solutions in data virtualization and are happy to lead the pack. The age of heavy middleware, SQL-generating, data warehouse-producing tools is behind us. There is no need to cache the data, move volumes between repositories or draw the field mapping lines. Object-oriented data modeling proved to be the groundbreaking paradigm when it was applied to data virtualization. Granted, a lot of companies have been using ETL and basic data virtualization for years. The Queplix data virtualization solution greatly enhances the existing ecology with a high degree of automation and soon-to-be-announced dynamic virtual repositories based on Hadoop's Hive and a NoSQL core. While Queplix completely eliminates the need to manage SQL in advanced data virtualization, we are going to do the same on the data consumption side and finally start executing towards the goal of making heavy, monolithic data marts and data warehouses a thing of the past.

There is a lot to talk about and share, and I am happy to update your readers periodically as Queplix rolls out exciting new advanced data virtualization modules every month now! You are welcome to read more about how we are changing legacy data virtualization at http://www.queplix.com/solutions.html.

Sincerely, Steve Yaskin

Apr 16, 2011 11:29 AM bomboniere battesimo says:

Where can I find more information on ebizQ webcast?

