Data warehousing is often made to sound overly simple these days. Just plug in one of the latest appliances and reap the rewards of unlimited business intelligence and data analytics.
The reality is quite different, starting with the fact that integrating all of the disparate data in even medium-sized enterprises can be quite a challenge.
Lately, however, a number of techniques have come up through the ranks that make data integration, if not completely enjoyable, at least less burdensome than it used to be.
One of them is called the patterns approach. As described by Microsoft's Michael Eldridge, the patterns approach aims to reconcile enterprises' desire for a uniform data set with individual business units' need for ready access to their particular data sets. All three of the enterprise data warehouse (EDW) patterns he describes require a centralized data store and a series of federated data marts, but they vary according to the level of centralized control and the use of non-common data models and sources. And there's no reason why various pattern architectures can't coexist, or even overlap, in the same organization.
Regardless of which data-integration solution you adopt, the only way to optimize it for your particular needs is through trial and error, according to warehousing consultant Rick Sherman. Since both top-down and bottom-up solutions can leave crucial details hanging, he recommends a mixed approach that takes into consideration existing data requirements and sources as well as large-scale architecture and workflow considerations. In this way, you get a better view of integration as a process designed to overcome the inconsistencies that arise between data from various sources, rather than a product aimed at correcting errors.
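To make the "process, not product" idea concrete, here is a minimal Python sketch of reconciling records from two sources into one common model. It is not drawn from Sherman's or Eldridge's material: the source names (`sales`, `billing`), the field names and the `Customer` model are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Hypothetical common model shared by the central store."""
    customer_id: str
    name: str
    country: str


def from_sales(row: dict) -> Customer:
    # Assumed layout of a sales-system record (illustrative field names).
    return Customer(
        customer_id=str(row["cust_id"]).strip(),
        name=row["cust_name"].strip().title(),
        country=row["country_code"].strip().upper(),
    )


def from_billing(row: dict) -> Customer:
    # Assumed layout of a billing-system record (illustrative field names).
    return Customer(
        customer_id=str(row["id"]).strip(),
        name=row["full_name"].strip().title(),
        country=row["country"].strip().upper(),
    )


def integrate(sales_rows, billing_rows):
    """Map both sources to the common model and collect disagreements."""
    merged = {}
    conflicts = []
    records = [from_sales(r) for r in sales_rows]
    records += [from_billing(r) for r in billing_rows]
    for record in records:
        existing = merged.get(record.customer_id)
        if existing is None:
            merged[record.customer_id] = record
        elif existing != record:
            # Keep the first version seen, but flag the mismatch for review.
            conflicts.append((existing, record))
    return merged, conflicts


sales = [
    {"cust_id": 101, "cust_name": "acme corp ", "country_code": "us"},
    {"cust_id": 102, "cust_name": "Globex", "country_code": "de"},
]
billing = [
    {"id": "101", "full_name": "ACME CORP", "country": "US"},
    {"id": "102", "full_name": "Globex", "country": "FR"},
]

merged, conflicts = integrate(sales, billing)
for first, second in conflicts:
    print(f"Conflict for {first.customer_id}: {first} vs {second}")
```

Formatting differences (case, stray whitespace, numeric versus string IDs) disappear once both sources are normalized, while a genuine disagreement is surfaced for someone to resolve rather than silently overwritten. That is roughly the distinction between overcoming inconsistencies and merely correcting errors.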
Integration is also a key driver in many of the newest warehousing platforms. SAP and HP recently joined forces to provide a more uniform approach to their respective systems, considering the large number of customers that have already deployed some combination of the two. The deal calls for the union of SAP's NetWeaver warehouse and HP's NeoView system as a means to improve data availability in high-volume environments. The combined platform will no doubt take advantage of the addition of Informatica's data-integration stack to the NeoView system as well as HP's data management and QoS services portfolio.
Oracle users, meanwhile, should see increased integration capabilities with the Noetix Analytics software stack. The package provides a data-integration module for both Oracle and non-Oracle systems. It is powered by software from the recently acquired Jaros Technologies that delivers business intelligence capabilities across disparate platforms while keeping data available for real-time reporting and monitoring.
Warehousing gives credence to the idea that the bigger you are, the more complex your data infrastructure. Very few enterprises have the luxury of re-architecting their data environments from scratch to foster a better warehousing and BI infrastructure. That means data integration will be a key requirement going forward.
It's not something to look forward to, but if you hope to extract the true value of your institutional data, you need to roll up your sleeves and get it done.