Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
ISO/IEC 25642:2023 defines a reference architecture for Master Data Management (MDM) — the integrated set of processes, governance structures, and technical capabilities for managing an enterprise’s core business entities (customers, products, suppliers, locations, assets) as trusted, authoritative, and shareable assets. The standard provides a vendor-neutral architectural blueprint that organizations can use to design, evaluate, or mature their MDM capabilities. As the capstone of the ISO/IEC data management standards family (alongside 25389 on data quality, 25422 on provenance, and 25434 on reference data), 25642 integrates these concerns into a cohesive architecture.
The reference architecture is organized into five layers: (L1) Data Source Layer — the operational systems (CRM, ERP, SCM) that create and consume master data; (L2) MDM Hub Layer — the core processing engine that ingests, cleanses, matches, merges, and publishes master data; (L3) Data Consumption Layer — analytical systems (data warehouse, BI, AI/ML) and operational systems that consume mastered data; (L4) Governance and Stewardship Layer — the tools and workflows for data governance, quality monitoring, and exception handling; and (L5) Infrastructure and Security Layer — identity management, access control, encryption, and audit logging.
The MDM Hub Layer (L2) is further decomposed into seven functional components: (1) data ingestion and parsing, (2) data cleansing and standardization, (3) identity resolution (matching/merge/survivorship), (4) golden record creation and versioning, (5) relationship management (hierarchy and cross-entity links), (6) data distribution and synchronization, and (7) hub administration and monitoring.
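As a rough illustration of components (1) and (2), the sketch below parses a small CSV extract and standardizes a few attributes before they would reach matching. The field names (`name`, `email`, `phone`) and the normalization choices are assumptions made for this example; they are not prescribed by the standard.

```python
import csv
import io
import re


def ingest(raw_csv: str) -> list[dict]:
    """Component (1): parse a CSV extract into one dict per source record."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def standardize(record: dict) -> dict:
    """Component (2): normalize whitespace, case, and phone formatting.

    Missing fields are treated as empty strings (an assumption of this sketch).
    """
    name = " ".join((record.get("name") or "").split()).title()
    email = (record.get("email") or "").strip().lower()
    phone = re.sub(r"\D", "", record.get("phone") or "")  # keep digits only
    return {"name": name, "email": email, "phone": phone}


def run_pipeline(raw_csv: str) -> list[dict]:
    """Chain ingestion and standardization; later hub stages would follow."""
    return [standardize(r) for r in ingest(raw_csv)]
```

Note that `str.title()` is a crude choice for personal names (it lowercases the "D" in "McDonald", for example); a production cleansing step would likely use locale-aware rules or a reference dictionary.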
| Layer | Components | Key Engineering Considerations |
|---|---|---|
| L1 — Data Source | CRM, ERP, SCM, legacy systems | API versioning, change data capture (CDC), data quality at source |
| L2 — MDM Hub | Ingestion, cleansing, matching, merging, golden record | Scalability (horizontal), matching algorithm accuracy, latency |
| L3 — Consumption | Data warehouse, BI, operational apps | Data freshness SLAs, bidirectional sync conflicts |
| L4 — Governance | Stewardship console, quality dashboards, workflow | Role-based access, audit trail, exception handling |
| L5 — Infrastructure | IAM, encryption, logging, monitoring | GDPR compliance, data residency, encryption at rest/transit |
The standard identifies five MDM implementation patterns: (P1) Registry: a lightweight index that stores only identifiers and pointers to source records; (P2) Coexistence: the hub stores a golden record alongside the source records and publishes via API; (P3) Transaction Hub: the hub becomes the authoritative system for master data transactions, with source systems forwarding write operations through it; (P4) Composite: a hybrid approach in which different entity types use different patterns (for example, Registry for some and Transaction Hub for others); and (P5) Data Federation: no central store; master data is assembled on the fly via query routing.
For most large enterprises, the Composite pattern (P4) is the most practical. Customer master data may warrant a Transaction Hub due to compliance and privacy requirements, while supplier master data may be adequately served by a Coexistence pattern. The standard provides decision criteria (data volume, update frequency, consistency requirements, regulatory constraints) for selecting the appropriate pattern per entity type.
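A per-entity pattern assignment of this kind can be sketched as a small rule function. Everything in the sketch is hypothetical: the `EntityProfile` fields, the 1,000 updates/day threshold, and the rule order are illustrative stand-ins, not the standard's actual decision criteria, and data volume is left out for brevity.

```python
from dataclasses import dataclass
from enum import Enum


class Pattern(Enum):
    REGISTRY = "P1"
    COEXISTENCE = "P2"
    TRANSACTION_HUB = "P3"
    FEDERATION = "P5"


@dataclass(frozen=True)
class EntityProfile:
    """Hypothetical summary of one entity type's requirements."""
    name: str
    updates_per_day: int
    needs_strong_consistency: bool
    regulated: bool


def suggest_pattern(profile: EntityProfile) -> Pattern:
    """Toy heuristic; the threshold and rule order are assumptions."""
    if profile.regulated or profile.needs_strong_consistency:
        return Pattern.TRANSACTION_HUB
    if profile.updates_per_day >= 1000:
        return Pattern.COEXISTENCE
    return Pattern.REGISTRY


def composite_plan(profiles: list[EntityProfile]) -> dict[str, Pattern]:
    """A Composite (P4) deployment is then just one pattern per entity type."""
    return {p.name: suggest_pattern(p) for p in profiles}
```

With these assumed rules, a regulated customer entity maps to the Transaction Hub and a frequently updated but unregulated supplier entity maps to Coexistence, mirroring the example in the text.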
Identity resolution — the process of determining whether two records refer to the same real-world entity — is the most technically challenging component of any MDM system. The standard recommends a probabilistic matching approach (using the Fellegi-Sunter model or machine learning-based classifiers) over deterministic matching for all but the simplest use cases. The matching engine should consider multiple attributes with weightings, handle missing values gracefully, and produce a match confidence score.
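The following is a minimal sketch of Fellegi-Sunter scoring, assuming exact comparison on normalized strings and conditionally independent fields. The m/u probabilities, the prior, and the two thresholds are made-up illustrative values; in practice they would be estimated from labeled pairs or with an EM procedure.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldModel:
    m: float  # P(field agrees | records refer to the same entity)
    u: float  # P(field agrees | records refer to different entities)


# Hypothetical per-attribute parameters (the "weightings").
FIELDS = {
    "name": FieldModel(m=0.95, u=0.01),
    "email": FieldModel(m=0.90, u=0.001),
    "postcode": FieldModel(m=0.85, u=0.05),
}


def match_weight(a: dict, b: dict, fields: dict = FIELDS) -> float:
    """Sum of log2 likelihood ratios over the compared fields."""
    total = 0.0
    for field, model in fields.items():
        va, vb = a.get(field), b.get(field)
        if va is None or vb is None:
            continue  # missing value: contributes no evidence either way
        if va.strip().lower() == vb.strip().lower():
            total += math.log2(model.m / model.u)  # agreement weight
        else:
            total += math.log2((1 - model.m) / (1 - model.u))  # disagreement
    return total


def match_probability(weight: float, prior: float = 0.01) -> float:
    """Convert a weight into a confidence score via Bayes' rule on the odds."""
    posterior_odds = (prior / (1 - prior)) * 2.0 ** weight
    return posterior_odds / (1 + posterior_odds)


def classify(weight: float, upper: float = 8.0, lower: float = -2.0) -> str:
    """Two-threshold decision; the middle band goes to a data steward."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "review"
```

The "review" band is where this component would hand off to the stewardship workflows in the Governance layer (L4). Replacing the exact-equality test with a string similarity measure, or the whole scorer with a trained classifier, leaves the surrounding interface unchanged.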
The golden record construction process (survivorship) defines how conflicting attribute values from multiple source records are reconciled into a single authoritative value. The standard defines five survivorship rules: (1) most recent update wins, (2) most trusted source wins, (3) longest value wins (for string attributes), (4) specific source priority (e.g., CRM over ERP for customer name), and (5) manual stewardship override. These rules should be configurable per attribute and per source system.
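The five rules can be sketched as interchangeable functions selected per attribute. The `SourceValue` type, the trust scores, and the attribute-to-rule mapping below are assumptions made for this example, not configuration defined by the standard.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Optional


@dataclass(frozen=True)
class SourceValue:
    source: str
    value: Optional[str]
    updated_at: datetime


# Hypothetical trust scores per source system.
SOURCE_TRUST = {"CRM": 0.9, "ERP": 0.7, "LEGACY": 0.3}

Rule = Callable[[list[SourceValue]], SourceValue]


def most_recent(cands: list[SourceValue]) -> SourceValue:  # rule (1)
    return max(cands, key=lambda c: c.updated_at)


def most_trusted(cands: list[SourceValue]) -> SourceValue:  # rule (2)
    return max(cands, key=lambda c: SOURCE_TRUST.get(c.source, 0.0))


def longest(cands: list[SourceValue]) -> SourceValue:  # rule (3)
    return max(cands, key=lambda c: len(c.value))


def source_priority(order: list[str]) -> Rule:  # rule (4)
    rank = {src: i for i, src in enumerate(order)}
    return lambda cands: min(cands, key=lambda c: rank.get(c.source, len(order)))


# Per-attribute configuration (illustrative).
RULES: dict[str, Rule] = {
    "name": source_priority(["CRM", "ERP"]),
    "email": most_trusted,
    "address": longest,
    "phone": most_recent,
}


def build_golden_record(records: list[dict], overrides: Optional[dict] = None) -> dict:
    """Apply one rule per attribute; steward overrides (rule 5) always win."""
    overrides = overrides or {}
    golden = {}
    for attr, rule in RULES.items():
        if attr in overrides:
            golden[attr] = overrides[attr]
            continue
        # Ignore sources that lack the attribute or hold an empty value.
        cands = [r[attr] for r in records if attr in r and r[attr].value]
        if cands:
            golden[attr] = rule(cands).value
    return golden
```

Here each source record is a dict mapping attribute names to `SourceValue` objects. Keeping the rules as plain functions means the per-attribute and per-source configuration reduces to editing `RULES` and `SOURCE_TRUST`.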