Executive Summary
Data quality is difficult to comprehend in its entirety because of the diverse aspirations and actions collected under its broad umbrella. These include standard technology and business practices that
improve data, like name-and-address cleansing, record matching and merging, house-holding, deduplication,
standardization, and appending third-party data. Some of these tasks can be automated
with software, while others—like entering data properly—are purely matters of business process.
Given this complexity, it’s no wonder misconceptions abound, like thinking data quality is a
one-time action that results in perfection. To the contrary, data quality is a complex concept that
encompasses many data-management techniques and business-quality practices, applied repeatedly
over time as the state of quality evolves, to achieve levels of quality that vary per data type and
seldom aspire to perfection.
Of the organizations TDWI surveyed, 82.5% continue to perceive their data as good or okay.
However, half of the practitioners surveyed warn that data quality is worse than their organization
realizes, which helps explain why the number of organizations with a data-quality plan doubled between
2001 and 2005. Many companies took action on data quality because compliance provided a swift
kick in the pants. Other kicks came from initiatives for business intelligence, customer service,
global supply chain, and IT system consolidations and migrations.
Two-thirds of respondents have studied the problems of data quality, while fewer than half have
studied its benefits. This suggests that data-quality initiatives are driven more by liability
than leverage. In other words, organizations improve their data to avoid problems like direct-mail
costs, misguided decisions, poor customer service, or faulty information in financial and regulatory
reports. Of course, when these problems are fixed, data has greater leveragability. The benefits aren’t
completely overlooked, since most organizations surveyed claim a return on investments in data
quality. Either way you look at it, the liabilities of poor-quality data and the leveragability of high-quality
data should compel anyone to action.
Data-quality products and practices are evolving quickly as they move from technical to business
users, from point products to suites, from batch to real-time operation, from data profiling to
quality monitoring, from US-centric to global, and so on. All these trends boil down to the fact that
data quality is broadening beyond its departmental roots into enterprise-scope usage. While this
broadening is good for the data, it’s challenging for the organization, which must adjust its business
processes and IT org chart to adapt to enterprise usage.
Accomplishing anything with this kind of enterprise data quality (EDQ) requires close collaboration
among IT and business professionals, who understand the data and its business purpose—collaboration
made manifest in a data-governance committee or program. Data governance is rare today, but
will proliferate as companies take data quality into broader enterprise use and move beyond mere
stewardship. TDWI recommends data governance strongly, because it gives all data-management
practices consistency, efficiency, and mandate as they reach for enterprise scale.
Note that the most critical success factor for EDQ via data governance is mandate. Data stewards
and governors must induce technical and business managers beyond their purview to change their
processes and data when opportunities for data improvement arise. Without a strong mandate
(supported by an attentive executive sponsor) to drive pragmatic changes, EDQ, data governance,
and data stewardship deteriorate into an academic study of data.