Doctors say that “prevention is the best medicine.” This holds true for data quality management, too. The sooner you identify potential problems, the sooner you can handle them – and at lower cost. This is the fundamental underpinning of any data quality management initiative in business.
Preventing data quality problems may seem like a “no-brainer,” but you’d be surprised at how many data warehouse and business intelligence (BI) project teams don’t plan for data quality issues. When you ask them why, the conventional response is that data quality management is the problem of the source systems. Although on one level that may be true, it’s an awfully cavalier response from a team that’s responsible for providing accurate data to business users.
Data quality management recommendations
So where does data quality management fit in your data warehouse and BI project plans? You need to make it an integral part of every phase of your project – from requirements through design and production. A few specific recommendations:
- Ask for data quality performance measures as part of your business requirements gathering and prioritizing process. The business needs to define data quality and its metrics. You can’t fix it until you know there is a problem that you can measure.
- Determine, along with the business, how you are going to handle data quality issues both during the development process and when your processes are operational. The business must prioritize the problems that need to be monitored and fixed. And, most importantly, the business needs to agree on the price it will pay, in terms of resources, time and costs, to achieve its desired levels of data quality. If business users want high levels of data quality, they have to be willing to pay for them.
- Monitor your data quality using the agreed performance measures from data sourcing through information consumption. This includes all the data extraction, transformation and loading processes that feed the data stores used by your reporting and BI environments. These data stores may include data warehouses, operational data stores, data marts, cubes and data shadow systems. You need to be able to measure data quality at every stage where the data is touched – until the business user consumes it in a report or during an analysis.
- Create a data quality management dashboard to monitor the agreed-upon data quality performance measures. This allows business users and IT to understand your current data quality levels and then take the appropriate actions. These dashboards should include data quality trending reports that show whether data quality is getting better or worse. Also create data quality alerts to enable corrective action on a more proactive basis. Don’t wait for the business user to discover that the numbers were wrong after they have already used them to make critical business decisions.
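To make the monitoring and alerting recommendations concrete, here is a minimal Python sketch of how one agreed measure might be checked at each stage and flagged for a dashboard. Everything in it is an illustrative assumption rather than a prescribed design: the measure (completeness of a hypothetical `customer_id` field), the 0.9 threshold, the stage names and the sample rows are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Measure:
    """A data quality performance measure agreed with the business (hypothetical)."""
    name: str
    field: str
    threshold: float  # minimum acceptable score, between 0.0 and 1.0


def completeness(rows, field):
    """Return the fraction of rows where `field` is present and non-empty."""
    if not rows:
        return 1.0
    filled = sum(1 for row in rows if row.get(field) not in (None, ""))
    return filled / len(rows)


def check_stage(stage, rows, measure):
    """Score one stage and flag an alert when it falls below the threshold."""
    score = completeness(rows, measure.field)
    return {
        "stage": stage,
        "measure": measure.name,
        "score": score,
        "alert": score < measure.threshold,
    }


def trend(scores):
    """Compare the latest score with the previous one for a trending report."""
    if len(scores) < 2 or scores[-1] == scores[-2]:
        return "flat"
    return "better" if scores[-1] > scores[-2] else "worse"


# Illustrative sample data for two stages (assumed, not from any real system).
measure = Measure(name="customer_id completeness", field="customer_id", threshold=0.9)
staging_rows = [
    {"customer_id": "C1"},
    {"customer_id": ""},
    {"customer_id": "C3"},
    {"customer_id": "C4"},
]
warehouse_rows = [{"customer_id": "C1"}, {"customer_id": "C3"}, {"customer_id": "C4"}]

results = [
    check_stage("staging", staging_rows, measure),
    check_stage("warehouse", warehouse_rows, measure),
]

for result in results:
    status = "ALERT" if result["alert"] else "ok"
    print(f"{result['stage']}: {result['score']:.2f} ({status})")
```

Scoring the same measure at every stage is what lets you see where quality drops, and keeping a history of scores per stage is enough to drive a simple trending report with something like `trend`. A real implementation would likely track several measures and read from the actual data stores instead of in-memory lists.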
Data quality shouldn’t be an afterthought in your data warehouse and BI projects. If you aren’t following these recommendations already, you have a lot of company: many organizations struggle with data quality management. That doesn’t excuse you, though. The best course of action is to take steps to prevent data quality problems up front, so you can avoid the bitter medicine of dealing with them later.