When Data Virtualization Works - And When It Doesn't

Loraine Lawson

Loraine Lawson spoke with Peter Tran, Vice President of Product Marketing at Composite Software, and Bob Reary, Director of Customer Value.


Lawson: Your product recently made the IT press when Intelligent Enterprise ran a piece explaining how Pfizer used your solution to bring data from heterogeneous sources into one portal. Composite is an EII product suite. You offer a middleware solution, correct?
Tran: EII is a term that was probably used a couple of years ago. It hasn't been used so much recently. Nowadays, the category is called on-demand integration or, more fashionably, data virtualization. The concept is similar: virtualizing data is simply making data available to be used, and the location and the source of the data don't really matter. It seems to capture the imagination more than EII.


It's enterprise software. We're a products company, not a professional services company, although we do have some services to support our products. Our background is primarily engineers and computer scientists; that's the heritage of our company.


Lawson: I've read some criticisms that EII - or data virtualization - can undermine data architecture planning, because it can get the data from the source and bypass data management policies that monitor data quality and cleansing. What's your response to that?
Tran: I think the word "undermine" is probably unfair. I mean, if anything, it facilitates data management. Data is something that people within the company need to run their business, and they need that information quickly, in a format that works for the business users. An architect's job is to get that data to the user in the right format, and to get it there quickly. So data warehousing and other technologies like that, which we complement, are one source, one method, of getting that data to the user.


But sometimes it's not fast enough, as in the Pfizer case study. That method is just not quick enough for a company trying to do things like drug discovery, where days and weeks matter, especially when the cycle's two years. If you can condense a week here and a week there, you condense the whole cycle and you have a competitive advantage. What we do is facilitate data getting to the right people. Not only do we get the solution to the business user quickly, but we get the information more quickly as well. What I mean by that is the solution is easy to use, for one thing, but the timeliness of the data is also important. Data warehouses are latent by nature; our solution is on demand.


Lawson: In which business situations does data virtualization work best?
Tran: Data virtualization works best when you need timely information and you need it now. Let me give two examples of this, as illustrated in the Pfizer case study. The first is the timeliness of the solution - just getting that data to the user in a timely manner so it can be effective. Sometimes getting the solution that's perfect and beautiful, but four months later, is not effective at all. You need to ask what information is required right now - not within a week or so. And the second is just the timing of the information itself. Getting information 24 hours later is not as effective as getting information on demand, right now, when you need it. Where that is effective is in industries like the pharmaceutical industry - life sciences - and Wall Street, where their primary business is really information. That's what they provide to their clients, that's what they use to make decisions and they need to get that right away. So industries like that are where our solution is most effective.


Reary: We've talked a lot about what we would call run-time results, delivering data faster. Another thing is the design-time benefit. For Pfizer and our other customers, deploying applications takes a lot of time, especially if they've got to build datamarts and so on. Pfizer cut in half the time it takes to deploy the kind of dashboard projects they're building, so they got ROI on the time it takes to build these systems internally as well. It's not just the run time, but also the design time. They can do as many as twice as many projects in the same amount of time as they could with the conventional methods they used in the past.


Lawson: And I assume there are situations when you would not want to use data virtualization?
Tran: Definitely. Data warehouses have been around for 20 years and they'll be around for another 20 years at least. And in those cases where you need time series analysis, historical analysis, where you actually have to capture information that's historical - what happened yesterday, what happened two weeks ago - and analyze it, data warehouses are best. We can tap into a warehouse to get information for analysis but we would never move information [or duplicate information] into a repository.


Lawson: In the case of Pfizer, they actually already had your solution set up and in use in the finance department. I suspect that helped on their ROI, but for those who don't have a data virtualization solution sitting around handy somewhere, what's the entry-level cost?
Tran: It varies. We've had installations where it's under $100,000 and we've had enterprise licenses for millions. It depends on the capacity they need. It's hard for me to quote a typical project. Entry level, you can probably get into a solution for less than $100,000.


Lawson: Is it often that companies already have a solution in place, but maybe central IT doesn't know about it?
Tran: I don't think so. The person we speak to most often when we enter an enterprise is the data architect, and typically an architect has an overarching view of all the information in the architecture of an enterprise. It's not something business users can easily install on their own; it requires some IT skills, so that's not going to happen. If anything, they install it in one department first to see the benefit and, with most of our clients, they quickly expand its use throughout the enterprise.


Lawson: Does your solution fit in with SOA?
Tran: Very much so. Definitely. One of the things you think about with SOA is that you don't care which system provides the actual data or the function (the service, as it's called), right? What we do with virtualization is very similar. We virtualize the data and make it very simple to access, regardless of where it sits. We also present that information as a Web service. If you work within an SOA context, you can consume our data: we publish a WSDL describing the service and, following the SOA protocols, you can access that data using SOAP.
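As a rough illustration of the access pattern Tran describes, the sketch below builds a SOAP 1.1 request envelope for a hypothetical virtualized data service. The namespace, operation name, and parameter are invented for illustration; a real deployment would publish its own operations in its WSDL.

```python
# Sketch: constructing a SOAP 1.1 request for a hypothetical virtualized
# data service. The service namespace, operation, and parameter names are
# invented; a real deployment describes its own in the WSDL it publishes.
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SVC_NS = "http://example.com/dataservices"  # hypothetical service namespace

def build_soap_request(operation: str, params: dict) -> bytes:
    """Wrap an operation call in a SOAP envelope, as a WSDL-described
    data service would expect."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SVC_NS}}}{operation}")
    for name, value in params.items():
        arg = ET.SubElement(call, f"{{{SVC_NS}}}{name}")
        arg.text = str(value)
    return ET.tostring(envelope, xml_declaration=True, encoding="utf-8")

# The resulting bytes would be POSTed to the service endpoint by an HTTP
# client; the caller never needs to know which backend systems hold the data.
request = build_soap_request("getCompoundData", {"compoundId": "PF-1234"})
```

The point of the pattern is the one Tran makes: the consumer sees only the service contract, never the underlying sources.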


Lawson: Are you a pureplay company?
Tran: Yes, that's probably the best way to categorize us. We consider ourselves best of breed. We're vendor-neutral. We fit into pretty much all the other components and applications out there. We're not a stack; we're not obligated to work with any kind of app server or any kind of database infrastructure. We're standalone and we fit into any infrastructure.


We're partners with Cognos, for example. They embed our products. We're also a partner with Informatica. But we can work with any BI tools out there. So if a company has Business Objects or Hyperion or MicroStrategy, we can present information to those applications. Our core competency is we are able to access sources and combine the data from those sources.


May 15, 2009 12:40 PM, data virtualization solution says:

This is good information for us to share with clients - more benefits and ideas on selling virtualization solutions. 

Aug 11, 2011 2:22 AM, Ash Parikh says:

Here is a discussion on what TRUE data virtualization is all about. Don't confuse this with simple, traditional data federation, which is a subset of data virtualization.



Aug 12, 2011 1:12 AM, Ash Parikh says, in response to Ash Parikh:

Informatica recently released the latest version of its data virtualization solution, Informatica Data Services version 9.1, as part of the Informatica 9.1 Platform.



Key highlights for this release are:

The capability to dynamically mask federated data while it is in flight, without processing or staging, just like the full palette of data quality and complex ETL-like data transformations already possible on federated data. This helps end users leverage a rich set of data transformation, data quality, and data masking capabilities in real time, without additional overhead.

The ability for business users (analysts) to play a bigger role in the agile data integration process and work closely with IT users (architects and developers), using role-based tools. This helps accelerate the data integration process with self-service capabilities.

The ability to instantly reuse data services for any application, whether it is a BI tool, composite application, or portal, without re-deploying or rebuilding the data integration logic. This is done graphically in a metadata-driven environment, increasing agility and productivity.
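A generic sketch of the in-flight masking idea mentioned above (not Informatica's implementation; the field names and masking rule are invented): rows stream through a federation layer and a sensitive field is masked as each row passes, with nothing staged to disk.

```python
# Generic sketch of dynamic masking applied to rows as they stream through
# a federation layer, without staging. This illustrates the concept only;
# it is not Informatica's implementation, and the field names are invented.
from typing import Iterable, Iterator

def mask_value(value: str, keep: int = 4) -> str:
    """Mask all but the last `keep` characters of a sensitive value."""
    return "*" * max(len(value) - keep, 0) + value[-keep:]

def mask_in_flight(rows: Iterable[dict], field: str) -> Iterator[dict]:
    """Yield copies of rows with one sensitive field masked.

    Because this is a generator, rows are transformed as they flow
    through; the full result set is never materialized or staged.
    """
    for row in rows:
        yield {**row, field: mask_value(row[field])}

# Rows as they might arrive from two federated sources (invented data).
federated_rows = [
    {"name": "A. Smith", "ssn": "123-45-6789"},
    {"name": "B. Jones", "ssn": "987-65-4321"},
]
masked = list(mask_in_flight(federated_rows, "ssn"))
# masked[0]["ssn"] == "*******6789"; the source rows are left untouched.
```

The design choice worth noting is that the masking sits in the data path itself, so every consumer of the federated view sees masked values without any change to the underlying sources.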

Here is a demo and chalk talk:




Ash Parikh

