Maintaining data quality is like taking a shower: It’s something you want to do on a regular basis, says Pentaho Chief Strategy Officer Richard Daley. But in recent years, too many organizations tried to get by with a once-a-year shower. Is it any wonder our data is so dirty? In this Q&A with IT Business Edge’s Loraine Lawson, Daley explains how organizations can clean up their act and their data.
Lawson: You think internal processes are part of the problem when it comes to data quality. What do you mean?
Daley: We’ve moved pretty fast in the last five, six, seven years, whether it be on-boarding new SaaS applications, new trading partners or moving things to the cloud. But data quality has not been at the forefront of a lot of these moves. People are starting to find that they’re not getting the value out of all of these investments that have been made. One of the reasons is that they didn’t really treat quality, not just as something at the forefront, but as a way of life, integrating it into all of their processes on a daily or weekly basis rather than just taking one really good shower a year.
Lawson: Is that what you’re seeing: data quality as a once-a-year shower?
Daley: If that.
The first part is who owns it? Is it IT, who are sometimes the gatekeepers? Is it the business? Is it sales? Who is the ultimate throat to choke for the quality problem? I never get a clear answer to that; I certainly don’t get a consistent answer to that. There are a lot of people who complain about it, but it’s debatable who is going to actually take responsibility for doing something about it.
Lawson: You’re out there talking to people, selling your solution, and obviously you have a vested interest in convincing them you can help with this. But do they understand this is not just a technology problem? To what extent do they expect you to fix data quality and governance issues for them?
Daley: I don’t know that they look to a company like ours and say it’s our fault, but if we couldn’t provide something to be part of the solution, then they would look at us as an incomplete or inadequate partner. It used to be that data quality was its own separate thing. Now it’s merged into the data and application integration space. It’s a core component of it.
It’s only now that people are realizing the cost. People have tried to ignore the impact of bad data, so it just got worse and worse and worse. Now they’re looking back at all of the different fantastic systems, cloud and on-premise, that they’ve implemented and they’re wondering why they’re not getting what they need out of them. It’s always been this problem for BI. In that world, if the quality was bad 15 years ago, then you’d say our BI is going to be pretty bad. But now it’s everything.
So if you think about hard-core areas where we’re strong, if you think about billing, tax and marketing automation — those are just three areas where people can’t afford to do faulty billing. It certainly wouldn’t be wise for them to have problems when it comes to tax. And then marketing automation, if you’re not marketing to the right place, if you’re not able to know who your customer is and what they do, then you’re going to be wasting a lot of time.
Take the CFO, for example. I believe at one point the CFO might have thought, “Well, that just sounds like noise; marketing always comes up with noise.” Now, because they’ve invested hundreds of thousands or millions of dollars in these various systems, they’re asking, “Where’s my return?” and a lot of times they’re not seeing that return because of the quality issue that they’ve ignored for so long. So now they’re interested.
Lawson: What sort of mistakes do you see other than the issue of determining who is in charge?
Daley: For us, there are several pieces. When you first implement, are you dumping garbage in at the beginning? What is your quality assessment, what’s the process for scrubbing before you even implement a new system, a new application? Most of the time, those things are behind and they’re over budget so they just say, “Well just stuff it all in there and we’ll figure it out after the fact because we have certain MBOs (management by objectives) based on launching this thing on time.”
Getting in and being able to do that in what’s already a high-pressure situation is critical. If we can help them get in front and tell them how important it is and how much it will save them over time, then that’s big.
The second part is these in-process data validations. You have multiple systems, they’re all talking to each other in a batch or real-time way and they’re coming from different lines of business, maybe different departments, that have different rules. You need to make sure that that’s being handled and that’s where it gets really hard, because then you start talking about master data management and who is the data steward and where do you store this data and what is the system of record and who wins in case of a conflict. So you really have to be able to get everybody together and agree that it’s important.
Then get a mediator. If it’s the CFO fine, but it’s usually not. It’s usually somebody in IT or a panel of the business owners from different departments who agree that it’s a big enough problem where they may have to make some concessions. That’s also hard but the pain has gotten so bad that I believe people are actually willing to come to the table and figure out a way to fix it.
Lawson: To what extent is this an IT-created problem? We talk about data governance being something the business needs to own, but at the same time, when you talk about the rush to implement applications, I know that many IT departments are run in silos, where the developers who manage apps are different than the data people. Is that part of the problem?
Daley: I think that all of these companies, whether it’s sales or marketing or financial apps, have tried to bypass IT by buying a different application. IT was always this bottleneck because they were always the rule makers, saying, “Let’s check with everybody and let’s get buy-in,” while marketing is saying, “We don’t have the luxury of waiting on this or that so we need to go ahead and do this and we promise we’ll come back and resolve it later.”
Going around IT is one of the reasons why everybody is in the bind that they’re in today. So I’m not saying that IT didn’t kind of deserve it; sometimes, they did take their muscle and maybe they were a little bit slower to react and respond than they should have been. They had all the power, so they said, “We’ll get to it when we get to it and you’ll deal with it.” So people had to go move.
And this isn’t just data quality — there are a lot of places where IT stops and says let’s run this through our really, really tough process, which is going to cost you an extra month.
But I think they’re coming together on both sides because now everybody has a really big mess. And it’s impacting the very people who went around IT to begin with and they’re looking for help. They can’t dictate to another group in another part of the company how the data is going to be managed. They don’t have that jurisdiction.
Lawson: You’ve said if a company doesn’t have the right approach to cleansing and consolidating data to determine what they can and can’t use, they can wind up with bad data. How do you figure out what you can and can’t use?
Daley: You have to come to a consensus on some type of a standard for the company.
Then, you have to have a steward that’s going to care. You have to have some type of an owner that’s going, “I’m going to play the traffic cop here and make sure that everybody gets what they want.” And then you need to know which system will win in the event of a conflict and trump the other one.
Everybody is going to say their system is the most important, but I can promise you if you’re accounting, anything you invoice is most important, so that should win in case of a conflict there, whereas in customer and contact data, that’s more of a sales and marketing function. They own the customer. Then you have assets and that usually belongs to somebody on the product or support side to understand what they own and what they have.
If everybody comes together and understands why one component is important to the whole, then they’re more likely to play nice. If they only see it from their end, then they’re only going to do what’s most convenient for themselves. Hell, they still might do that anyway. At least you can try.
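The system-of-record rule Daley describes above, where one agreed owner wins per type of data, could be sketched roughly as follows. This is only an illustration under assumptions: the system names (`accounting`, `crm`, `support`), the `SYSTEM_OF_RECORD` mapping and the `resolve` function are hypothetical and are not taken from the interview or from any Pentaho product.

```python
# Hypothetical mapping agreed on by the business owners: for each data
# domain, the one system whose value wins in case of a conflict.
SYSTEM_OF_RECORD = {
    "invoice": "accounting",   # anything invoiced: accounting wins
    "customer": "crm",         # customer and contact data: sales and marketing
    "asset": "support",        # assets: product or support side
}


def resolve(domain, candidates):
    """Pick the winning value for `domain`.

    `candidates` maps a system name to the value that system reports.
    The value from the agreed system of record is returned; if no owner
    has been agreed, or the owner has no value, an error is raised so a
    data steward can decide instead of the code guessing.
    """
    owner = SYSTEM_OF_RECORD.get(domain)
    if owner is None:
        raise KeyError(f"no system of record agreed for domain {domain!r}")
    if owner not in candidates:
        raise LookupError(f"{owner!r} has no value for {domain!r}; escalate to the steward")
    return candidates[owner]


if __name__ == "__main__":
    # Two systems disagree about an invoice amount; accounting wins.
    print(resolve("invoice", {"accounting": "100.00", "crm": "95.00"}))
```

Raising an error when no owner has been agreed is a deliberate choice in this sketch: it keeps the conflict visible to a steward rather than silently picking whichever system happened to respond.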