Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
ISO/IEC 25389:2021 provides a comprehensive framework for data quality management within the broader context of information technology and data management. Published as part of the ISO/IEC data management standards family (alongside 25422 on provenance, 25434 on reference data, and 25642 on master data), this standard defines a structured approach to defining, measuring, and improving data quality across the data lifecycle. For data architects and governance professionals, 25389 fills the gap between abstract quality principles (ISO 8000) and implementation-specific quality plans.
The standard identifies 15 data quality dimensions organized into four categories. Intrinsic dimensions (accuracy, consistency, objectivity, believability) address the data’s inherent quality independent of context. Contextual dimensions (relevance, timeliness, completeness, appropriate amount) assess fitness for purpose. Representational dimensions (interpretability, ease of understanding, concise representation, consistent representation) focus on format and clarity. Accessibility dimensions (accessibility, access security, availability) address the ability to retrieve and use data.
An important engineering insight from this standard is that not all dimensions are equally important for all use cases. The standard explicitly recommends a prioritization exercise at the start of any data quality program. For example, in a real-time fraud detection system, timeliness and accuracy rank highest; in a regulatory reporting context, completeness and consistency are paramount. This contextual weighting prevents wasted effort on measuring dimensions that have no material impact on the business outcome.
| Category | Dimension | Measurement Approach | Typical Threshold |
|---|---|---|---|
| Intrinsic | Accuracy | Record-level comparison to authoritative source | ≥ 99.5% for critical fields |
| Intrinsic | Consistency | Cross-record constraint validation | ≥ 99.0% |
| Contextual | Timeliness | Data age vs. service-level agreement (SLA) | ≤ 24 h for operational systems |
| Contextual | Completeness | Non-null ratio for mandatory fields | ≥ 99.9% for key identifiers |
| Representational | Interpretability | Metadata coverage and data dictionary adherence | 100% for published datasets |
| Accessibility | Availability | Uptime percentage of data access endpoints | ≥ 99.9% (three nines) |
The standard defines a seven-step continuous improvement cycle: (1) define quality requirements based on stakeholder needs, (2) establish measurement criteria and thresholds, (3) assess current quality levels through profiling and auditing, (4) analyze root causes of quality issues, (5) plan and implement improvement actions, (6) monitor quality levels over time, and (7) communicate results and adjust requirements. This cycle aligns with the Plan-Do-Check-Act (PDCA) model familiar to quality management professionals.
From an implementation perspective, Step 3 (assessment) is where most data quality programs either succeed or stall. The standard recommends automated data profiling tools as the primary assessment mechanism, supplemented by manual sampling for dimensions that cannot be algorithmically verified (e.g., believability, which requires domain expert judgment). Integration with data catalog tools is critical — the standard explicitly links quality metrics to metadata management.
The standard positions data quality management as a core function of enterprise data governance rather than an isolated technical activity. It recommends establishing a Data Quality Steering Committee with representation from business, IT, and data stewardship functions. Quality rules should be defined in a business glossary and enforced at the point of data ingestion through automated validation workflows deployed in the data pipeline (e.g., Apache Kafka schema registry validation, Great Expectations test suites in data warehouses).
A key architectural recommendation from the standard is the concept of a “quality ledger” — an immutable log of quality measurements, improvement actions, and residual quality issues. This ledger serves as the authoritative record for audit and compliance purposes. In practice, this can be implemented using a blockchain-adjacent architecture (append-only log with cryptographic verification) or simpler approaches such as a dedicated quality event store in the data lake.