Remember John Lennon, singing about instant karma? No? Some of you may want to download that song before embarking on a new business intelligence (BI) project.
When it comes to BI projects, many organizations are experiencing a sort of instant karma with data. Often, these problems trace back to a poor data integration strategy, according to not one, but two recent TechTarget articles.
While many of these data quality issues have existed for years, BI really seems to flush them out into the open, probably because it pulls data from multiple sources and consolidates it for analysis, which makes it painfully apparent when there are problems with the data.
Honestly, reading the articles, it’s not hard to empathize with the obviously frustrated data experts. You can almost see Ted Friedman, a Gartner analyst and expert on data management, reaching for the TUMS in his side drawer, as he tells the TechTarget freelance writer Alan R. Earls: “I’ve been following data integration for more than 10 years. And I still spend days talking to organizations that are not getting the usage and trust and acceptance and value out of their BI efforts because the quality of the data is not good enough, and they haven’t done the right things to fix that.”
Honestly, most of this stuff is not rocket science. It’s more like Data for Dummies, actually — the kind of mistakes that happen because the project seemed so simple, no one bothered to work through the requirements.
Here are a few takeaways:
Figure out what data your BI project requires. No, seriously — IT needs to really listen to the business requirements, particularly when the data must be available in real- or near-real time, advised Claudia Imhoff, president of consultancy Intelligent Solutions Inc. and a respected voice in the data quality community. Incredibly, BI often turns off business users because somebody failed to load the right data. Don’t be that person.
Cleanse your data. It’s called data integration, not data migration, because there should be some processes involved — things like matching and merging records, or transforming values. You shouldn’t end up with six copies of the same record in the BI systems. Be more disciplined about your data integration, and that won’t happen.
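To make the match-and-merge idea concrete, here’s a minimal sketch in plain Python. The record fields and sample data are hypothetical; a real integration pipeline would use a proper data quality tool, but the logic — normalize, match on a key, merge duplicates by keeping the most complete values — is the same.

```python
# A minimal match-and-merge sketch: dedupe customer records on a
# normalized key, then merge duplicates by filling in missing fields.
# All field names and sample data here are hypothetical.

def normalize(value):
    """Lowercase and collapse whitespace so 'Ann Lee ' matches 'ann lee'."""
    return " ".join(value.lower().split())

def match_and_merge(records, key_fields=("name", "email")):
    merged = {}
    for rec in records:
        key = tuple(normalize(rec.get(f, "")) for f in key_fields)
        if key not in merged:
            merged[key] = dict(rec)
        else:
            # Merge step: fill in any fields the surviving record is missing.
            for field, value in rec.items():
                if value and not merged[key].get(field):
                    merged[key][field] = value
    return list(merged.values())

customers = [
    {"name": "Ann Lee", "email": "ann@example.com", "phone": ""},
    {"name": "ann lee ", "email": "ANN@example.com", "phone": "555-0100"},
    {"name": "Bob Ray", "email": "bob@example.com", "phone": "555-0101"},
]

clean = match_and_merge(customers)
print(len(clean))  # 2 — the two Ann Lee rows collapse into one record
```

Without the normalize step, “Ann Lee” and “ann lee ” would load as two different customers — which is exactly how you end up with six copies of the same record.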
Debate, don’t dump. OK, nobody wants to sit through a 3-hour debate over the meaning of “customer” or “address.” But dumbing down the data to avoid the hard work of compromise is NEVER the answer, advises Jill Dyche, co-founder of Baseline Consulting in Sherman Oaks, Calif. Along those lines …
Find the right people. Hint: They will not be in IT. Somebody, somewhere, understands that data and that BI project. Typically, they’re power users or the go-to person for a particular issue. Find that person, and recruit or draft them from the get-go.
Mend your data. Sometimes, the data problems start with the user. The user won’t appreciate you pointing this out, but life’s hard sometimes. If the sources have bad data, then integration won’t help. Good data quality starts at the source.
Take your time. Sure, you want to be responsive and hit your project deliverables, but don’t underestimate how much work data integration is. Imhoff points out that integrating data and loading it into the data warehouse can take as much as 60 to 80 percent of the BI development effort. Rather than rushing, be conservative and multiply your time estimates accordingly.
Don’t forget testing. SAP independent consultant Ethan Jewett wrote the second TechTarget piece on data integration problems and BI failure. His recommendation was to create a better data integration testing strategy before you use the data. He’s speaking specifically to SAP data integration, but I think his advice applies to other BI projects as well. He suggests testing with full-size, realistic data sets, and testing requirements early and often.
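In that spirit, a data integration testing strategy can start with a few simple post-load checks — row counts match the source, required fields arrived populated, and the merge step actually removed duplicate keys. This is a hedged sketch, not Jewett’s specific SAP approach; the table and column names are made up, and real tests would run against full-size extracts, not toy samples.

```python
# A sketch of basic post-load data integration checks.
# Column names (customer_id, email) are hypothetical.

def check_row_counts(source_rows, target_rows, tolerance=0):
    """Loaded row count should match the source extract."""
    return abs(len(source_rows) - len(target_rows)) <= tolerance

def check_required_fields(rows, required=("customer_id", "email")):
    """Return any records that arrived with a required field missing."""
    return [r for r in rows if any(not r.get(f) for f in required)]

def check_no_duplicate_keys(rows, key="customer_id"):
    """Return any keys the merge step failed to deduplicate."""
    seen, dupes = set(), set()
    for r in rows:
        k = r[key]
        if k in seen:
            dupes.add(k)
        seen.add(k)
    return dupes

loaded = [
    {"customer_id": 1, "email": "ann@example.com"},
    {"customer_id": 2, "email": "bob@example.com"},
    {"customer_id": 2, "email": "bob@example.com"},  # a dupe slipped through
]

print(check_required_fields(loaded))    # []
print(check_no_duplicate_keys(loaded))  # {2}
```

Checks like these cost little to write, and running them early and often is exactly how you catch the duplicate-record problem before the business users do.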
As Jim Harris pointed out to me once, the biggest data quality mistake is to not have a data quality program, but the second biggest mistake, he added, is to do data quality after the fact:
The analogy I like to use there is: if your house is on fire, it’s not very difficult to get people to say, 'Hey, maybe we should put the fire out.' But it’s very difficult to get people to practice fire safety. That’s what a proactive data quality approach is: We should practice fire safety so that things don’t burn down. But it’s very hard to get executives to fund projects for potential future problems that could be caused by data quality, versus 'the CIO could go to jail next week if we don’t get this regulatory compliance problem fixed.'
Don't be that CIO.