The Metadata Conundrum

Arthur Cole

Arthur Cole spoke with Alex Gorelik, co-founder and CTO, Exeros.


Cole: Metadata integration has become such an overwhelming burden on most enterprises these days that there is often not enough time or resources to implement a full solution. What are the downsides to many of the stop-gap solutions that exist today, such as labeling and data profiling?
Gorelik: The downside is that metadata-based approaches can only find simple one-to-one relationships that are exact matches. So in the real world, data analysts abandon the metadata and look at the actual data values. They print out sample data sets, take a highlighter and start looking for patterns. In fact, analysts spend 95 percent of their time looking at data values to work out the complex transformations, because the simple one-to-one relationships are easy to find. So at best, metadata tools accelerate only 5 percent of the actual effort involved in complex integration projects.

Here is a real-life example: An insurance company has two applications. In Application 1 there is a column called Automobile Driver Age with ages ranging from 16 to 102. When the value in that column is less than 26, the Youthful Driver Flag in Application 2 is "Y". Metadata can't help me find that relationship because the relationship is hidden in the data values. And this is just a simple example.
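To make the insurance example concrete, here is a minimal Python sketch of how a threshold rule like this could be recovered from paired data values alone. It is only an illustration and not how DataMapper works internally; the function name `find_threshold_rule`, the sample rows and the brute-force search are all assumptions made for this example.

```python
def find_threshold_rule(rows):
    """Search for the cutoff t that best fits the rule: flag == "Y" when age < t.

    rows is a list of (age, flag) pairs taken from the two applications.
    Returns (threshold, accuracy), where accuracy is the share of rows
    that agree with the rule.
    """
    ages = sorted({age for age, _ in rows})
    # Candidate cutoffs: every observed age, plus one past the maximum.
    candidates = ages + [ages[-1] + 1]
    best = None
    for t in candidates:
        hits = sum((age < t) == (flag == "Y") for age, flag in rows)
        accuracy = hits / len(rows)
        # Keep the first cutoff that reaches the highest accuracy.
        if best is None or accuracy > best[1]:
            best = (t, accuracy)
    return best


# Hypothetical sample: Automobile Driver Age paired with Youthful Driver Flag.
sample = [(16, "Y"), (19, "Y"), (25, "Y"), (26, "N"), (40, "N"), (102, "N")]
threshold, accuracy = find_threshold_rule(sample)
print(threshold, accuracy)
```

Nothing in the column names tells the search that 26 matters; the cutoff falls out of the values themselves, which is the point of the example.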


Cole: How does a solution like your DataMapper differ from the current best-of-breed software?
Gorelik: DataMapper is different because it examines actual data values, not the metadata, to discover the business rules and complex transformations that exist between systems. The result is deployment that is more than three times faster for large integration projects, at a much higher level of quality. It also discovers the costly exceptions where the data breaks those rules, so we can guarantee the accuracy of the results the product discovers. So in the earlier automobile example, not only did DataMapper automatically find the transformation, it also found an 83-year-old man who was being charged as a youthful driver. That kind of mistake is a real business error that can cause customers to be overcharged or undercharged.
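Once a rule is known, finding the rows that break it is a simple filter. The sketch below illustrates that idea for the youthful-driver rule. It is not DataMapper's implementation; the names `YOUTHFUL_AGE_LIMIT` and `find_exceptions` and the sample records are hypothetical.

```python
# Assumed rule from the interview: Youthful Driver Flag is "Y" when age < 26.
YOUTHFUL_AGE_LIMIT = 26


def find_exceptions(rows, limit=YOUTHFUL_AGE_LIMIT):
    """Return the (age, flag) rows that contradict the rule."""
    return [
        (age, flag)
        for age, flag in rows
        if (age < limit) != (flag == "Y")
    ]


# Hypothetical records; the last one mirrors the 83-year-old flagged as youthful.
records = [(17, "Y"), (25, "Y"), (26, "N"), (54, "N"), (83, "Y")]
exceptions = find_exceptions(records)
print(exceptions)
```

Each row returned is a candidate billing error that an analyst would still need to review.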


Cole: IT executives have become accustomed to automated solutions for whatever ails them. Is it reasonable to expect a similar approach to migration?
Gorelik: There is no magic pill that will automate 100 percent of the process. While DataMapper is a huge leap forward in solving this problem, even it does not find every transformation automatically. That is why it consists of two components: a discovery engine that finds over 80 percent of transformations on its own, and an analyst workbench that gives the data analyst a variety of tools to hypothesize and validate the remaining transformations that DataMapper cannot find. The bottom line is that DataMapper is like a really fast jet airplane; it will get you from A to B over three times faster and with greater reliability, but you still need a pilot (the data analyst) to fly it.
