Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
ISO/IEC 25012 addresses a critical dimension of software quality that is often overlooked: the quality of the data itself. As organizations increasingly rely on data-driven decision-making, machine learning, and business intelligence, the quality of the underlying data becomes paramount. Poor data quality leads to flawed analytics, incorrect business decisions, and regulatory non-compliance. The standard defines a data quality model with fifteen characteristics, each viewed from one or both of two complementary perspectives: inherent and system-dependent.
The standard recognizes that data quality affects all information technology projects where data is exchanged, processed, and used between computer systems and users. Several factors drive the need for systematic data quality management: acquisition of data from organizations with unknown or weak quality processes, the existence of defective data contributing to unsatisfactory outcomes, dispersion of data across multiple owners and systems with inconsistent semantics, and the coexistence of legacy and modern systems that must interoperate. The data quality model provides a structured framework for addressing these challenges.
ISO/IEC 25012 organizes data quality characteristics into three groups based on whether they are viewed from an inherent perspective, a system-dependent perspective, or both. This three-way classification is one of the standard’s most distinctive features, as it recognizes that some quality attributes are properties of the data itself while others emerge from the interaction between data and the systems that manage it. Understanding this distinction is essential for designing effective data quality improvement programs.
| Perspective | Characteristics | Description |
|---|---|---|
| Inherent Only | Accuracy, Completeness, Consistency, Credibility, Currentness | Relate to data itself — its values, relationships, and business rules |
| Inherent & System-Dependent | Accessibility, Compliance, Confidentiality, Efficiency, Precision, Traceability, Understandability | Depend on both data content and the capabilities of the computer system |
| System-Dependent Only | Availability, Portability, Recoverability | Achieved through hardware, software, and infrastructure capabilities |
Inherent data quality refers to data’s intrinsic potential to satisfy needs regardless of the system storing it. Accuracy comprises syntactic accuracy (values conforming to domain rules, e.g., “Mary” rather than the misspelling “Marj”) and semantic accuracy (values correctly representing real-world entities, e.g., a valid name that is also recorded for the right person). Completeness measures whether all expected attributes have values for each entity instance. Consistency ensures data is free from contradictions across related entities. Credibility captures the degree to which users regard data as true and believable, often tied to the trustworthiness of the data source. Currentness addresses whether data is of the right age for its context: a railway timetable, for example, must be updated frequently enough to remain useful.
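As a rough illustration of how two of these inherent characteristics might be quantified, the sketch below computes a completeness ratio and a syntactic accuracy ratio over a handful of made-up records. The record layout, the `KNOWN_NAMES` set, the email pattern, and the ratio definitions are all assumptions chosen for the example; they are not formulas taken from the standard.

```python
import re

# Hypothetical customer records; None marks a missing value.
records = [
    {"name": "Mary", "email": "mary@example.com", "country": "US"},
    {"name": "Marj", "email": None, "country": "US"},
    {"name": "John", "email": "john@example", "country": None},
]

# Assumed domain rules for syntactic accuracy (illustrative only).
KNOWN_NAMES = {"Mary", "John"}
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(rows, fields):
    """Fraction of expected attribute values that are present."""
    expected = len(rows) * len(fields)
    present = sum(1 for row in rows for f in fields if row.get(f) is not None)
    return present / expected if expected else 1.0


def syntactic_accuracy(rows, field, is_valid):
    """Fraction of non-missing values that satisfy the domain rule."""
    values = [row[field] for row in rows if row.get(field) is not None]
    if not values:
        return 1.0
    return sum(1 for v in values if is_valid(v)) / len(values)


completeness_score = completeness(records, ["name", "email", "country"])
name_accuracy = syntactic_accuracy(records, "name", lambda v: v in KNOWN_NAMES)
email_accuracy = syntactic_accuracy(
    records, "email", lambda v: bool(EMAIL_PATTERN.match(v))
)
```

Semantic accuracy is deliberately left out: checking that a value matches the real-world entity generally needs a trusted reference source, which a self-contained snippet cannot supply.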
System-dependent data quality depends on the technological domain and infrastructure. Availability ensures data can be retrieved by authorized users and applications when needed, including during concurrent access and maintenance operations like backup. Portability addresses the ability to install, replace, or move data between systems while preserving existing quality. Recoverability ensures data can be restored after failures through commit/synch point mechanisms, rollback capabilities, and backup-recovery procedures. These characteristics are heavily influenced by architecture decisions and infrastructure investments.
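The commit and rollback mechanisms mentioned for recoverability can be sketched with Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper, and the amounts are hypothetical; the point is only that a failed operation leaves the data at its last committed state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 50)")
conn.commit()  # commit point: a later rollback cannot undo this insert


def transfer(connection, src, dst, amount):
    """Move funds atomically; a failure rolls back both updates."""
    try:
        with connection:  # commits on success, rolls back on exception
            connection.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            connection.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)       # succeeds and is committed
failed = transfer(conn, 1, 2, 500)  # violates the CHECK, so it is rolled back
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

After the failed transfer, the credit to account 2 that had already been applied inside the transaction is undone, so the balances stay at the values committed by the first transfer.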
From an engineering perspective, ISO/IEC 25012 provides several critical insights for data-intensive system design. The standard’s dual-perspective classification is particularly valuable because it separates data content issues from infrastructure issues, two problem domains that require fundamentally different solutions and skill sets. Data engineers can use this classification to assign ownership appropriately: business domain experts own inherent quality, IT infrastructure teams own system-dependent quality, and the seven characteristics that span both perspectives call for shared responsibility between the two.
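One way to make this ownership split operational is to encode the classification as data and look owners up from it. The sketch below is a minimal example under stated assumptions: the owner labels are hypothetical, and routing the characteristics that span both perspectives to both groups is an interpretation for illustration, not something the standard prescribes.

```python
from enum import Enum


class Perspective(Enum):
    INHERENT = "inherent"
    BOTH = "inherent and system-dependent"
    SYSTEM_DEPENDENT = "system-dependent"


# The fifteen characteristics, grouped as in the table above.
CHARACTERISTICS = {
    Perspective.INHERENT: [
        "Accuracy", "Completeness", "Consistency", "Credibility", "Currentness",
    ],
    Perspective.BOTH: [
        "Accessibility", "Compliance", "Confidentiality", "Efficiency",
        "Precision", "Traceability", "Understandability",
    ],
    Perspective.SYSTEM_DEPENDENT: [
        "Availability", "Portability", "Recoverability",
    ],
}

# Hypothetical owner labels; shared ownership for BOTH is an assumption.
OWNERS = {
    Perspective.INHERENT: ["business domain experts"],
    Perspective.BOTH: ["business domain experts", "IT infrastructure"],
    Perspective.SYSTEM_DEPENDENT: ["IT infrastructure"],
}


def owners_for(characteristic):
    """Return the owner groups responsible for a characteristic."""
    for perspective, names in CHARACTERISTICS.items():
        if characteristic in names:
            return OWNERS[perspective]
    raise ValueError(f"Unknown characteristic: {characteristic}")


total = sum(len(names) for names in CHARACTERISTICS.values())
```

Calling `owners_for("Recoverability")` returns only the infrastructure group, while `owners_for("Compliance")` returns both groups.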
Measurement examples exist for each characteristic; within the SQuaRE series, the measures themselves are specified in the companion standard ISO/IEC 25024, which builds on the ISO/IEC 25012 model. Confidentiality can be measured through encryption coverage as an inherent measure and through penetration test success rates as a system-dependent measure. Efficiency can be measured by comparing actual storage usage against optimized benchmarks. The Compliance characteristic is particularly relevant in regulated industries: separate measures cover inherent compliance (data content conforming to regulations such as GDPR or HIPAA) and system-dependent compliance (technical architecture ensuring regulatory conformance). This distinction maps directly to real-world compliance implementation challenges.
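To make the encryption coverage idea concrete, the sketch below computes it as the share of sensitive columns that are encrypted. The column catalog, its field names, and the ratio definition are assumptions made for this example rather than the normative measure.

```python
# Hypothetical column catalog: sensitivity and encryption status per column.
columns = [
    {"name": "ssn", "sensitive": True, "encrypted": True},
    {"name": "diagnosis", "sensitive": True, "encrypted": False},
    {"name": "email", "sensitive": True, "encrypted": True},
    {"name": "signup_date", "sensitive": False, "encrypted": False},
]


def encryption_coverage(catalog):
    """Fraction of sensitive columns that are encrypted."""
    sensitive = [c for c in catalog if c["sensitive"]]
    if not sensitive:
        return 1.0  # nothing sensitive to protect
    return sum(1 for c in sensitive if c["encrypted"]) / len(sensitive)


def unencrypted_sensitive(catalog):
    """Names of sensitive columns that still need encryption."""
    return [c["name"] for c in catalog if c["sensitive"] and not c["encrypted"]]


coverage = encryption_coverage(columns)
gaps = unencrypted_sensitive(columns)
```

Reporting the gap list alongside the ratio keeps the measure actionable: the ratio shows the trend, and the list shows what to fix next.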
From a practical standpoint, the standard’s measurement framework enables organizations to establish quantitative quality targets for each characteristic, monitor them over time, and drive data quality improvement initiatives with clear metrics. Organizations implementing data governance programs can use the fifteen-characteristic model as a checklist for defining their data quality dimensions and establishing measurement baselines.
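A minimal sketch of such target tracking follows, assuming each characteristic is scored on a 0 to 1 scale. The `Target` class, the threshold values, and the measured scores are hypothetical placeholders, not figures from the standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    characteristic: str
    minimum: float  # lowest acceptable score on an assumed 0-1 scale


# Hypothetical targets and one round of measured scores.
targets = [
    Target("Completeness", 0.98),
    Target("Accuracy", 0.95),
    Target("Currentness", 0.90),
]
measured = {"Completeness": 0.991, "Accuracy": 0.93}


def evaluate(target_list, scores):
    """Compare measured scores with targets and label each characteristic."""
    report = {}
    for target in target_list:
        value = scores.get(target.characteristic)
        if value is None:
            report[target.characteristic] = "not measured"
        elif value >= target.minimum:
            report[target.characteristic] = "met"
        else:
            report[target.characteristic] = "below target"
    return report


report = evaluate(targets, measured)
```

Flagging unmeasured characteristics separately from failing ones helps when establishing baselines, since a missing measurement is a different problem from a low score.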