Defining Master Data for Your Organization

Loraine Lawson

David Loshin recently asked leaders from four master data management vendors to do something that you might think would be pretty simple: Explain what defines data as "master data."


With these things, there's always the simple answer (the sound bite, if you will) and then the complex, but more useful, answer. Of course, the panel came back with the simple answer first, which Loshin summed up as, "data concepts that 'are important' to the business and are shared by two or more applications."


It took Loshin three tries to make progress on the more complex answer, which really is about how organizations figure out what constitutes their master data, as opposed to their more mundane data. The panel ultimately settled upon a fairly predictable answer - analyzing your data so you know how it's used, who's using it, which applications are using it, and so on.


But you can tell from reading Loshin's post that he's not quite satisfied with the answer, and I know how he feels, because I've had my fair share of interviews that waffle between simple sound bites and more complex, but still overly general, answers.


I also had to smile, because it turns out, a year ago, I asked Loshin a similar question - what is master data and can the definition differ. I actually wanted the definition, not the process for defining it - alas, I wasn't swift enough to ask that. Fortunately, he's better at answering questions than asking them, because it only took one try. His explanation of master data:

"There have been many definitions proposed out in the general literature, and I have always been careful to say that I am providing a 'description' of what I believe MDM incorporates rather than a definition. Master data objects represent the core business concepts used in the different applications across the organization, along with their associated metadata, attributes, definitions, roles, connections, and taxonomies, such as customers, suppliers, parts, products, locations, contact mechanisms."


He then went on to define master data management and how it can fit in with an IT strategic plan.


How organizations define what is and is not master data is an excellent question, really, and grossly under-discussed, considering how foundational it is. In fact, I don't think I've seen anything on the topic; everybody just skips straight ahead to master data management. As an example, this week, Information Management published a piece on the MDM maturity model, which is exactly the type of piece you would think would mention this process. It outlines five levels of maturity and includes an excellent chart, in a .pdf, outlining how integration, governance and so on evolve as MDM matures. BUT, there's no level that simply includes "define your master data."


Perhaps we can assume that's a zero-level task?


As it turns out, defining master data for your organization may not be as hard as you'd think. Marty Mosely, who participated in Loshin's panel on behalf of IBM's Initiate Systems, saw Loshin's post and responded with a much more useful answer.


First, he explains, defining master data should be a top-down exercise. He suggests leaders start by identifying which "subject areas" define their data, and he says they're usually able to do this quickly and without profiling all the data. Then, they should consider whether the data in each subject area meets three criteria:


  1. Is this subject area a building block of critical business transactions? He even specifies what he means by "critical."
  2. Are the data in the subject area created and managed in multiple systems?
  3. If those data are incorrect, inaccurate, incomplete, mismanaged, etc., do they have the potential to do harm to the organization?


"I've found that most organizations can easily agree which of those subject areas fall into the category of 'master data,'" Mosely adds.
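
To make the three questions concrete, here is a minimal sketch of the checklist in Python. It is an illustration, not Mosely's own method: the `SubjectArea` class, the `is_master_data` function, the sample subject areas, and the answers assigned to them are all hypothetical, and it assumes a subject area qualifies only when all three answers are "yes."

```python
from dataclasses import dataclass


@dataclass
class SubjectArea:
    """A candidate subject area plus yes/no answers to the three questions."""
    name: str
    in_critical_transactions: bool     # criterion 1: building block of critical transactions
    managed_in_multiple_systems: bool  # criterion 2: created and managed in multiple systems
    errors_can_harm_org: bool          # criterion 3: bad data could harm the organization


def is_master_data(area: SubjectArea) -> bool:
    # Assumption: all three criteria must hold for the area to count as master data.
    return (
        area.in_critical_transactions
        and area.managed_in_multiple_systems
        and area.errors_can_harm_org
    )


# Hypothetical answers a leadership team might give in a top-down review.
candidates = [
    SubjectArea("customer", True, True, True),
    SubjectArea("web clickstream", False, False, False),
    SubjectArea("product", True, True, True),
]

master = [area.name for area in candidates if is_master_data(area)]
print(master)
```

The point of the sketch is that the inputs are business judgments, not profiling results. The hard part, as the comments below suggest, is agreeing on what "critical" and "harm" mean before anyone fills in the answers.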


I suggest you read the full post and the comments. Hopefully, by the time you do, more MDM experts and leaders will have chimed in with their thoughts on this critical, but under-discussed, question.



I did find some older, useful conversations about master data management, including this thorough Knowledge Integrity blog post, another BeyeNetwork piece written by, appropriately enough, Loshin, and an MSDN Architecture Center article that discusses identifying master data. I noted with interest the MSDN piece points out that not all master data needs to be managed by MDM. But all the articles were from 2006. They're still useful reads, especially for technologists, but they focus primarily on defining master data, with little information about how you can define your own master data.

Mar 19, 2010 10:29 AM David Loshin says:

Even my own bloated definition is insufficient, and the complexity of the answer should indicate that its immediate usefulness is limited. Unfortunately, we get asked the same question (rather the demand: "Explain to me how you tell whether something is master data") and in each case the answer I want to give is different.

It becomes even more difficult when there is no advice as to the semantics of the underlying data variants. You have one customer database, and you call that thing a customer record, but you have no definition of customer. Now multiply that ten times and then try to dump all those records into a single repository. You end up with YADS - yet another data silo.

Marty's further refinement has merit, except when you try to scratch beneath the surface yet again. You need more definitions:

What is a "subject area"?

What is a "critical business transaction"?

What is meant by "harm"? What are the tolerance levels? How is that measured? If it is just potential, then your answer is speculative.

Let me add an additional spin to this dilemma: we were approached by a company that has an MDM implementation in place. That implementation has been the subject of a "success story" (names are not provided to protect the "innocent"). Why were we approached? They wanted a process to assess the use of the master data downstream and understand whether that data was suitable for those uses. Huh? According to Marty's approach that should have been done long before selecting a product, let alone implementation and deployment into production. Yet that is an MDM success.

Let's propose that we start from an even more fundamental perspective: Tell me how to determine when reference data should be master data. You may find that Marty's criteria are more adaptable:

- Critical business transactions are likely to depend on reference data (where did you buy that thing?)

- Reference data is created and used by multiple systems (how many different reference tables exist for state code?)

- Incorrect reference data can pose irreparable harm to the organization (uh, what were the engineering specs for those automobile brakes again?)

The next step: let's figure out the concrete steps that a data analyst needs to take when considering whether to propose a data set as master data: profile it, analyze it, document its metadata, review the decision criteria, and then make the decision. Maybe I will use that as the subject of an upcoming b-eye-network article.

Mar 19, 2010 10:32 AM David Loshin says: in response to David Loshin

Oh, and you can still buy my book at 49% off at Amazon. Use this link:


Mar 23, 2010 1:00 PM Mark Shainman says:

As you have pointed out, Loraine, defining master data is easier said than done.  Still, I think you can safely say that the complexity in defining and managing an enterprise's important master data is evident when the master data is consolidated or is shared among applications.  While master data usually refers to reference data, such as customer and product, it can also include critical relationship data, such as relationships between products and customers as well as hierarchies such as organizational, product, and supplier. Defining and managing all of this data at an organizational level is possible, but we all know how difficult it can be when records are siloed and/or inconsistent across operational systems.

That being said, one of the first places that inconsistencies in master data become apparent and need to be solved is within a data warehousing environment, where data from multiple systems is consolidated on a single platform. A component of effectively implementing a data warehouse is defining the master data with a goal of integrating data and increasing overall efficiencies through faster and more effective usage of IT and analytics.  Because the process of implementing an enterprise data warehouse includes the cleansing, standardizing, and integrating of data to create common definitions across the enterprise, much of the MDM heavy lifting is and needs to be done during this process.

Mark Shainman

Global Program Manager

Teradata MDM & Oracle Migration

Mar 25, 2010 1:19 PM Nick Bacon says: in response to Mark Shainman

As businesses are different, so too is the importance of the data they are trying to manage. For instance, I guide clients on expense cycle, strategic sourcing etc, and I find each client needs to analyse data in the same way, but to drill down according to their spend mix. In this way their master data definition is different, and their need to analyse is different.

For me, this means that MDM is not 'one size fits all'; the components of master data and the criteria are different.

Why do we need to 'beat ourselves up' defining master data when it is obvious what it is when we look at the client business reporting and analysis needs? Am I missing something?

Sep 13, 2010 7:47 PM David Loshin says: in response to Nick Bacon

The more I think about it, the more I think there is a definite need for clearly defining what is or is not master data. If all the pundits are right, then we also need data governance (also relatively ill-defined). But what is governance other than formal oversight policies for ensuring compliance with specific criteria for support and performance with respect to the "master" data set? Therefore, we need clarity on what master data is and is not, what are the expectations, the associated processes, performance objectives, and measures of compliance. How can you tell if you are doing it appropriately if you can't define it?

Jun 1, 2015 6:46 PM Rob says: in response to David Loshin

I agree, David: a clearer definition is required for DG and MD. I do let my clients know that master data is the most static data in the main ERP, shared across objects to complete transactions (SAP). For this we can use governance to measure and keep the data as clean as possible. If there are no measurements, then there is no ability to keep the data pure. That leads us to define what pure data are. To define this is one of the very purposes of DG. The main purpose of Data Governance (DG) is to create the best version of the truth where the company goals are in focus. It is up to the business to define and lead the charge, depending on what analytic structure they need to see to strategically place their company while getting the most market share possible. All this, of course, within the law and principles of commerce.

Dec 11, 2015 4:04 PM Jeff says:

Hi folks. I agree as well that this is a challenging area to nail down. I've been working with master data since 1995, and find I'm often tongue-tied when this question comes up, usually in C-level discussions. In general, I believe Master Data is best described as "the key data that standardizes the context around the events that an enterprise manages". Here I'm not distinguishing between reference data and master data, mostly because they're just levels of the same thing. But, in general, the reason we want master data (and reference data) in our enterprise data management design is to ensure we can initiate, operate, execute, and report on enterprise events in a common context. I won't try to tackle Data Governance in the same comment box. But I'm interested in anyone's feedback on the above.

Jan 6, 2017 10:22 AM Meg says:

Hi, I work in a nonprofit education organization and we are looking for MDM solutions. Meanwhile, we have started a Data Governance initiative in which we will identify our master data domains and reference data, and also identify data stewards and owners for these domains. As you can see, we are just beginners and still learning. Having understood the definitions and criteria for master and reference data, I would like to get some guidance on how to identify master data and what the best practice is for documenting it so that it is easy to manage going forward. Currently we have not started implementing MDM tools, so the solution for now is to do it manually. Any help will be highly appreciated.
