Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
ISO/IEC 27559 establishes a structured framework for de-identification of personally identifiable information (PII), giving organizations a systematic methodology to reduce privacy risk while preserving the utility of data for analysis, research, and business operations. The standard treats de-identification not as a binary state but as a continuum of risk reduction, in which the degree of privacy protection must be balanced against the analytical value of the resulting dataset. The framework applies across the major de-identification techniques, including generalization, suppression, perturbation, and synthetic data generation.
The standard categorizes de-identification techniques into several families, each with distinct characteristics regarding privacy protection strength, data utility preservation, and computational complexity. Selecting the appropriate technique depends on the specific use case, data type, and acceptable residual risk level.
| Technique | Privacy Mechanism | Data Utility Impact | Best For | Re-identification Risk |
|---|---|---|---|---|
| Suppression | Remove identifiers entirely | Minimal for analysis | Direct identifiers (names, SSNs) | Low when comprehensive |
| Generalization | Replace with broader categories | Moderate — reduces granularity | Quasi-identifiers (age, ZIP codes) | Medium |
| Perturbation | Add statistical noise | Moderate-high for aggregates | Numerical data, medical measurements | Low (with sufficient noise) |
| k-anonymity | Each record indistinguishable from at least k-1 others on its quasi-identifiers | Moderate | Structured tabular data | Low (but vulnerable to homogeneity attacks) |
| l-diversity | Ensures at least l distinct sensitive values within each equivalence class | Moderate-high | Sensitive attributes in groups | Very low (skewed or semantically similar values can still leak) |
| t-closeness | Attribute distribution matches global distribution | High | Skewed sensitive attributes | Minimal |
| Differential privacy | Mathematical guarantee via calibrated noise | High (epsilon-dependent) | Statistical queries, ML training | Provably minimal |
| Synthetic data | Generate artificial records from model | Variable (model-dependent) | Testing, development, sharing | Low (if properly generated) |
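To make the k-anonymity and l-diversity rows concrete, the following minimal Python sketch generalizes two quasi-identifiers and then measures both properties. The record layout, the 10-year age bands, the 3-digit ZIP prefix, and all function names are assumptions chosen for illustration; they are not taken from ISO/IEC 27559.

```python
from collections import defaultdict


def generalize(record):
    """Coarsen quasi-identifiers: age to a 10-year band, ZIP to its first 3 digits."""
    low = (record["age"] // 10) * 10
    return (f"{low}-{low + 9}", record["zip"][:3] + "**")


def equivalence_classes(records):
    """Group sensitive values by their generalized quasi-identifier tuple."""
    classes = defaultdict(list)
    for record in records:
        classes[generalize(record)].append(record["diagnosis"])
    return classes


def k_anonymity(records):
    """Size of the smallest equivalence class (the k the dataset achieves)."""
    return min(len(values) for values in equivalence_classes(records).values())


def l_diversity(records):
    """Fewest distinct sensitive values found in any equivalence class."""
    return min(len(set(values)) for values in equivalence_classes(records).values())


# Hypothetical toy data, for illustration only.
records = [
    {"age": 34, "zip": "02124", "diagnosis": "asthma"},
    {"age": 37, "zip": "02125", "diagnosis": "asthma"},
    {"age": 31, "zip": "02121", "diagnosis": "asthma"},
    {"age": 52, "zip": "02139", "diagnosis": "diabetes"},
    {"age": 58, "zip": "02138", "diagnosis": "flu"},
]

print(k_anonymity(records))  # 2
print(l_diversity(records))  # 1
```

The toy dataset is 2-anonymous, yet every record in the 30-39 class carries the same diagnosis, so anyone who can place a person in that class learns the sensitive value. This is the homogeneity weakness that l-diversity is meant to address.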
ISO/IEC 27559 prescribes a risk-based approach consisting of several stages. First, organizations must perform a re-identification risk assessment that identifies the plausible threat sources (motivated adversaries, curious insiders, and people who recognize a record by accident), their capabilities (access to auxiliary data, computational resources), and the sensitivity of the data being protected. The outcome of the assessment then determines how strong the de-identification needs to be.
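One way to picture how such an assessment can feed a decision is sketched below. It uses a formulation common in the de-identification literature, where overall risk is the probability that an adversary attempts re-identification multiplied by the worst-case probability of success (one over the smallest equivalence class). The formula, the `Adversary` class, the attempt probabilities, and the threshold are illustrative assumptions, not values or methods quoted from the standard.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Adversary:
    name: str
    attempt_probability: float  # assumed estimate in [0, 1], not from the standard


def max_record_risk(qi_rows):
    """Worst-case chance of singling out a record: 1 / smallest equivalence class."""
    class_sizes = Counter(qi_rows)
    return 1.0 / min(class_sizes.values())


def overall_risk(adversary, qi_rows):
    """Assumed model: P(attempt) * P(re-identification | attempt)."""
    return adversary.attempt_probability * max_record_risk(qi_rows)


def needs_stronger_deidentification(adversaries, qi_rows, threshold):
    """True if any modeled adversary pushes the risk above the accepted threshold."""
    return any(overall_risk(a, qi_rows) > threshold for a in adversaries)


# Generalized quasi-identifier tuples for a hypothetical five-record release.
qi_rows = [("30-39", "021")] * 3 + [("50-59", "021")] * 2

adversaries = [
    Adversary("motivated adversary", 1.0),
    Adversary("curious insider", 0.1),
]
THRESHOLD = 0.09  # hypothetical acceptable residual risk

for adversary in adversaries:
    print(adversary.name, round(overall_risk(adversary, qi_rows), 3))
print(needs_stronger_deidentification(adversaries, qi_rows, THRESHOLD))
```

In this sketch the curious insider alone would stay under the threshold, but the motivated adversary does not, so the release would need stronger generalization or tighter access controls before sharing.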
An essential concept introduced by the standard is the de-identification governance board — a cross-functional team comprising privacy officers, data scientists, legal counsel, and business stakeholders that oversees de-identification policies, approves technique selections, reviews residual risk acceptance, and handles re-identification incidents. This governance structure ensures that de-identification decisions are made with appropriate organizational oversight rather than left solely to technical teams.
The standard emphasizes that de-identification is not a permanent state. Advances in auxiliary data availability, linkage techniques, and computational power can increase re-identification risks over time. Therefore, ISO/IEC 27559 requires periodic re-assessment of published de-identified datasets. It provides guidance on monitoring the re-identification landscape, tracking published re-identification attacks, and determining when a dataset needs re-processing with stronger techniques. Organizations are advised to maintain a de-identified data inventory with risk ratings, re-assessment schedules, and sunset policies for datasets that can no longer be adequately protected.
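A de-identified data inventory of the kind described above could be modeled roughly as follows. The field names, the three-level risk scale, and the status labels are hypothetical choices for this sketch; the standard does not define a schema.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class InventoryEntry:
    dataset: str
    risk_rating: str             # hypothetical scale: "low", "medium", "high"
    last_assessed: date
    review_interval_days: int    # re-assessment schedule
    sunset: Optional[date] = None  # date after which the dataset should be withdrawn

    def next_review(self) -> date:
        return self.last_assessed + timedelta(days=self.review_interval_days)

    def status(self, today: date) -> str:
        """Return 'retire', 'reassess', or 'ok' for the given date."""
        if self.sunset is not None and today >= self.sunset:
            return "retire"
        if today >= self.next_review():
            return "reassess"
        return "ok"


# Hypothetical inventory entries, for illustration only.
inventory = [
    InventoryEntry("claims_2021_public", "medium", date(2024, 1, 1), 365),
    InventoryEntry("survey_2019_public", "high", date(2024, 6, 1), 180,
                   sunset=date(2025, 1, 1)),
    InventoryEntry("synthetic_test_set", "low", date(2024, 12, 1), 365),
]

today = date(2025, 1, 15)
for entry in inventory:
    print(entry.dataset, entry.status(today))
```

Checking the sunset date before the review date means a dataset that has passed its sunset is flagged for withdrawal even if a review is also overdue.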