We all know the importance of establishing SMART goals. The same applies to data quality initiatives, and it should be a key feature of your data quality tool. Unless you can put a tangible, measurable definition on data quality, it cannot be acted upon. However robust a process might be, no action will take place unless its performance can be put on a scorecard and discussed in leadership meetings.
Unlike business metrics such as sales volume, revenue, and margin, data quality is not straightforward to measure. For example, one data quality rule might count the exceptions where the customer address is empty in the customer table. Another might measure the integrity of shipment volume between the ERP and the data warehouse. How do you aggregate all the findings into a data quality score? Counting failed tests or exceptions is one way to do it, but it is less effective and generates more noise. It is almost like measuring sales performance by the number of sales orders rather than by sales volume.
Look for data quality tools that can establish and aggregate a data quality score across multiple data quality rules. Some tools report something called “cost of data quality”. DvSum DataPARC (Profile, Audit, Review, and Comply) instead creates a unique and actionable data readiness score. This kind of metric can be aggregated across multiple data quality tests to produce a single, measurable view of data quality.
DataPARC automatically calculates a readiness score for every type of data quality rule, whether it is a record-level exception, an aggregate volume exception, or an integrity comparison across multiple data sources. This allows data owners, managers, and executives to align on and track a single metric that measures the quality of data and helps drive continuous improvement.
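To make the idea of one score across different rule types concrete, here is a minimal sketch in Python. It is not DataPARC's actual formula, which is not described here. It assumes each rule can be reduced to a checked quantity and an exception quantity, and that rules carry a business weight. The names `RuleResult`, `rule_score`, and `readiness_score`, the weights, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RuleResult:
    """Outcome of one data quality rule (hypothetical structure)."""
    name: str
    checked: float       # records checked, or expected volume for volume/integrity rules
    exceptions: float    # failing records, or absolute volume discrepancy
    weight: float = 1.0  # relative business importance (an assumption of this sketch)


def rule_score(result: RuleResult) -> float:
    """Share of the checked quantity that passed, as a percentage from 0 to 100."""
    if result.checked <= 0:
        return 100.0  # nothing was checked, so nothing failed
    failed_share = min(result.exceptions / result.checked, 1.0)
    return 100.0 * (1.0 - failed_share)


def readiness_score(results: list[RuleResult]) -> float:
    """Weighted average of the per-rule scores."""
    total_weight = sum(r.weight for r in results)
    if total_weight == 0:
        return 100.0
    return sum(rule_score(r) * r.weight for r in results) / total_weight


results = [
    # Record-level rule: 50 of 1,000 customers have an empty address.
    RuleResult("customer_address_not_empty", checked=1000, exceptions=50),
    # Integrity rule: ERP reports 20,000 units shipped; the warehouse is off by 400.
    RuleResult("shipment_volume_erp_vs_dw", checked=20000, exceptions=400, weight=2.0),
]

print(round(readiness_score(results), 2))
```

With these sample numbers the two rules score 95 and 98, and the weighted result is 97.0. Because every rule is normalized to the same 0 to 100 scale before aggregation, a record-level check and a volume comparison can share one scorecard, which a raw count of exceptions cannot offer.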
Managers and super-users can monitor the trend of the readiness score over time to quantify the performance of the data quality initiative. They can also drill down to determine which data owners, data sources, or data sets are trending down or lagging behind, and drive focused improvement actions.
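The trend and drill-down described above can be sketched in a few lines of Python. This is an illustration only, assuming the score history is available as simple (period, data owner, score) rows. The sample data and the function names `trend_by_owner` and `trending_down` are hypothetical and not part of any tool's API.

```python
from collections import defaultdict

# Hypothetical score history: (period, data owner, readiness score from 0 to 100).
history = [
    ("2024-01", "finance", 92.0),
    ("2024-02", "finance", 94.5),
    ("2024-03", "finance", 96.0),
    ("2024-01", "logistics", 90.0),
    ("2024-02", "logistics", 88.0),
    ("2024-03", "logistics", 85.5),
]


def trend_by_owner(rows):
    """Return the score change from the first to the last period for each owner."""
    series = defaultdict(list)
    # Sorting the tuples orders them by period first, so each list is chronological.
    for period, owner, score in sorted(rows):
        series[owner].append(score)
    return {owner: scores[-1] - scores[0] for owner, scores in series.items()}


def trending_down(rows):
    """List the owners whose score dropped over the observed periods."""
    return sorted(owner for owner, delta in trend_by_owner(rows).items() if delta < 0)


print(trend_by_owner(history))
print(trending_down(history))
```

Here finance improves by 4.0 points while logistics drops by 4.5, so only logistics is flagged for follow-up. The same grouping could be keyed by data source or data set instead of owner.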