I hate doing laundry, and I freely own that hatred. The stain treatments, the forgetting to switch it to the dryer, the folding and ironing — none of it appeals. And then there’s the time factor: Every basket is a race to put it away before our 18-month-old dumps it all out or the dog nests on top of it, or both.
Data quality strikes me as the business equivalent of laundry. It just never ends: the cleaning up of the data, the removing of bad records, the synchronizing.
Given my laundry aversion, I can sympathize with organizations that just aren’t that diligent about data quality. They let the problems pile up, then try to do it all at once because it’s just such a pain. Heck, who among us hasn’t lost a weekend to piled-up laundry? Let that person throw the first stone.
To get laundry done, I have to incorporate it into a routine. Since I work from home, near the laundry room, that’s become how I make myself tend to the laundry. A load comes down with the laptop, goes in. When I get up for some other reason, I make the switch. And then the clean clothes go up with the laptop at the end of my work time.
On days when I don’t work in the home office, my husband walking in the door is my backup plan. For some reason, just seeing him reminds me the laundry needs to be switched, and I usually send him down with a load shortly after.
Perhaps organizations need to take the same approach to data quality. Instead of doing it all at once, find a way to work it in whenever and wherever you can.
Typically, data quality comes up when there’s a big project, like a new CRM system or a master data management initiative. It’s usually coupled with the integration work, which is well and good.
Here’s another area that should trigger a good data quality cleansing: data migration.
Yet data quality is often given short shrift during migration projects, usually because of time constraints, writes Dylan Jones in a recent Data Roundtable post. The hard work of data quality tends to get shrugged off with a “we’ll fix it later” in the target system, he adds.
And that’s just bad planning, because, really, which is more likely to work: forcing business users to cleanse the data while learning a new system, or cleaning the data before you move it?
Of course, it’s not so simple as just saying, “Fix it during migration.” You also have to decide at what point in the migration you should deal with data quality. Should it be the staging area? An offline staging area? Or an online staging area? Or do you clean it in the legacy system, which you may have acquired and which may be a big hot mess?
Each has its pros and cons, which Jones outlines. You’ll need to decide where to fit in data quality after profiling the data, he states.
“When and where to cleanse your data in a data migration is not straightforward so don’t publish carte blanche policies,” he writes. “Profile your data early and examine the different types of issues. What you’ll find is that multiple approaches will be required.”
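If “profile your data” sounds abstract, here is a rough idea of what a first pass might look like. This is only a minimal sketch, not anything taken from Jones’ post: the `profile` function, the sample records and the field names are all hypothetical, made up for illustration. It simply counts blank values per field and flags repeated keys, which is often enough to show that different problems will need different fixes.

```python
from collections import Counter


def profile(records, key_field):
    """Count blank values per field and list keys that appear more than once.

    `records` is a list of dicts; `key_field` names the field that is
    supposed to be unique. Both are assumptions made for this sketch.
    """
    missing = Counter()
    for row in records:
        for field, value in row.items():
            # Treat None, empty strings and whitespace-only strings as missing.
            if value is None or str(value).strip() == "":
                missing[field] += 1

    key_counts = Counter(row.get(key_field) for row in records)
    duplicate_keys = sorted(
        key for key, count in key_counts.items()
        if count > 1 and key is not None
    )

    return {
        "rows": len(records),
        "missing": dict(missing),
        "duplicate_keys": duplicate_keys,
    }


# Hypothetical legacy customer records, invented for illustration only.
legacy_customers = [
    {"id": 1, "email": "ann@example.com", "phone": "555-0100"},
    {"id": 2, "email": "", "phone": "555-0101"},
    {"id": 2, "email": "bob@example.com", "phone": None},
    {"id": 3, "email": "  ", "phone": "555-0103"},
]

report = profile(legacy_customers, key_field="id")
print(report)
```

In this made-up sample, the blank emails might be something to chase down in the legacy system, while the repeated ID looks more like a job for the staging area. That is the “multiple approaches” point in miniature.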
I can certainly understand that it might take multiple approaches to get data quality done, because here’s the thing about data quality and laundry: Neither is ever going to be finished, no matter how much or how often you do them or how many stars you wish upon.
Trust me on that last one. I’ve already tried.