Is your Masterdata clean?

A recent project reminded me that in the real world, masterdata is never clean. We were told that the city field is a free-text box, so "city" can end up holding a street, a district or even a province. Of course, this creates a challenging downstream impact on your data analytics.

Masterdata POV (Point of View)
  • Decide the POV of your masterdata. One system's masterdata could be another system's transactional data, and vice versa.
  • Use a reference point and communicate a common language for masterdata.
  • Knowing your end game determines which masterdata to collect, and from which POV.
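
One way to make the "reference point" concrete is a small lookup that maps each system's spelling to one agreed value. Here is a minimal sketch in Python, assuming a hypothetical CITY_ALIASES table whose aliases and canonical names are made up for illustration:

```python
# Hypothetical reference table: the agreed "common language" for the city field.
# The alias keys and canonical names are illustrative, not from a real system.
CITY_ALIASES = {
    "nyc": "New York",
    "new york city": "New York",
    "new york": "New York",
    "sf": "San Francisco",
    "san francisco": "San Francisco",
}


def to_canonical(raw):
    """Return the agreed city name, or None if the value is not in the reference."""
    key = " ".join(raw.strip().lower().split())  # trim, lowercase, collapse spaces
    return CITY_ALIASES.get(key)


print(to_canonical("  NYC "))          # New York
print(to_canonical("12 Main Street"))  # None
```

Anything that comes back as None is a candidate for review: it may be a street, a district, or simply a spelling the reference does not know yet.
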

Tips to clean your masterdata
  • If more than 50% of your masterdata needs cleansing, it is worthwhile to drop that masterdata.
  • Know what to clean, and do not clean for the sake of cleaning.
  • Clean masterdata exhibits consistent patterns, while unclean masterdata is total chaos.
  • Know your domain well to clean effectively!
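
The 50% rule of thumb and the "consistent patterns" tip can both be checked with a quick profile before any cleaning starts. Here is a rough sketch in Python, assuming a hypothetical list of postal codes and a simple shape encoding (letters become A, digits become 9); the helper names and sample values are made up for illustration:

```python
from collections import Counter


def shape(value):
    """Encode a value's pattern: letters -> 'A', digits -> '9', others kept."""
    out = []
    for ch in value.strip():
        if ch.isalpha():
            out.append("A")
        elif ch.isdigit():
            out.append("9")
        else:
            out.append(ch)
    return "".join(out)


def dirty_ratio(values, expected_shape):
    """Fraction of values whose shape differs from the expected one."""
    if not values:
        return 0.0
    dirty = sum(1 for v in values if shape(v) != expected_shape)
    return dirty / len(values)


# Illustrative sample only: a field that should hold five-digit codes.
postal_codes = ["10001", "94105", "9410S", "n/a", "60614", "1000 1"]

# A clean field collapses to one or two shapes; a chaotic one produces many.
print(Counter(shape(v) for v in postal_codes))

ratio = dirty_ratio(postal_codes, "99999")
print(f"{ratio:.0%} of values need cleansing")
```

The expected shape is where domain knowledge comes in: only someone who knows the field can say which patterns are legitimate and which are noise.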

Cleaning masterdata is an iterative process. You get better and more resilient with practice. Good data sense is also an advantage. Good luck cleaning, and may the force be with you!


