Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
ISO/IEC 27565 provides comprehensive guidelines for protecting personally identifiable information (PII) throughout the lifecycle of artificial intelligence (AI) systems, from data collection and model training through deployment, inference, and retirement. Because AI systems process large quantities of personal data for both training and operation, they introduce privacy risks that traditional data protection approaches do not adequately address. These risks include model inversion attacks that reconstruct training data from model outputs, membership inference attacks that determine whether a specific individual’s data was used in training, and unintended memorization of rare or unique records in training datasets.
The standard categorizes AI-specific privacy risks into three distinct groups: training data privacy risks (occurring during data collection and model training), model privacy risks (embedded in the trained model’s parameters and behavior), and inference privacy risks (arising when the model processes new data or generates outputs). Each category requires different mitigation strategies and engineering controls.
| Risk Category | Specific Attack/Threat | Affected Lifecycle Stage | Primary Mitigation | Effectiveness |
|---|---|---|---|---|
| Training data privacy | Data breach during collection or labeling | Data collection, preparation | Data minimization, access controls, encryption | High |
| Training data privacy | Unauthorized data inference via model | Model training | Differential privacy during training | High (provable) |
| Model privacy | Model inversion attack | Deployment, inference | Output perturbation, model pruning | Medium |
| Model privacy | Membership inference attack | Deployment, inference | Regularization, DP training, output restriction | Medium-high |
| Model privacy | Model extraction via API queries | Deployment, inference | Query rate limiting, output perturbation | Medium |
| Inference privacy | Attribute inference from model outputs | Inference | Output filtering, confidence score masking | Medium |
| Inference privacy | Unintended memorization | Training, inference | Deduplication, differential privacy, record suppression | High |
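
The deduplication and record suppression mitigations listed for unintended memorization can be illustrated with a minimal sketch. It assumes records are flat dictionaries with hashable values, and it flags any record whose quasi-identifier combination occurs fewer than `k` times. The function name `screen_records`, the field names, and the threshold `k` are hypothetical choices made for illustration, not values taken from the standard.

```python
from collections import Counter


def screen_records(records, quasi_identifiers, k=2):
    """Deduplicate records, then split them into kept and flagged lists.

    A record is flagged when its combination of quasi-identifier values
    appears fewer than k times among the deduplicated records
    (k is a hypothetical threshold chosen for illustration).
    """
    # Exact deduplication, preserving first-seen order.
    seen = set()
    unique = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)

    def qi_key(rec):
        return tuple(rec[name] for name in quasi_identifiers)

    # Count each quasi-identifier combination after deduplication,
    # so repeated copies of one record do not make it look common.
    counts = Counter(qi_key(rec) for rec in unique)

    kept = [rec for rec in unique if counts[qi_key(rec)] >= k]
    flagged = [rec for rec in unique if counts[qi_key(rec)] < k]
    return kept, flagged


records = [
    {"region": "north", "age_band": "30-39", "note": "flu"},
    {"region": "north", "age_band": "30-39", "note": "flu"},      # exact duplicate
    {"region": "north", "age_band": "30-39", "note": "asthma"},
    {"region": "south", "age_band": "90-99", "note": "rare case"},
]
kept, flagged = screen_records(records, ["region", "age_band"], k=2)
```

Flagged records are candidates for suppression, generalization, or manual review before training; which action is appropriate depends on the application.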
ISO/IEC 27565 provides detailed engineering guidance for implementing privacy-preserving AI systems across the entire lifecycle. For training data, it recommends data minimization through active learning techniques that select only the most informative samples for labeling, and systematic screening for rare or unique records that are at highest risk of memorization. For the training process itself, differentially private stochastic gradient descent (DP-SGD) is presented as the primary technical control, with guidance on privacy budget (epsilon) allocation based on the sensitivity of the application domain.
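To make the DP-SGD control concrete, the following is a minimal sketch of a single update step for logistic regression, written with the Python standard library only. It shows the two operations that distinguish DP-SGD from ordinary SGD: clipping each example's gradient to a fixed L2 norm, and adding Gaussian noise scaled to that norm before averaging. The names `dp_sgd_step`, `clip_norm`, and `noise_multiplier` are assumptions for illustration, and the sketch does not compute the resulting epsilon; that requires a separate privacy accountant, which production libraries typically provide.

```python
import math
import random


def dp_sgd_step(w, batch, lr, clip_norm, noise_multiplier, rng):
    """One DP-SGD update for logistic regression (illustrative sketch).

    w:     list of weights
    batch: list of (features, label) pairs, label in {0, 1}
    rng:   a random.Random instance used to draw the Gaussian noise
    """
    summed = [0.0] * len(w)
    for x, y in batch:
        # Per-example gradient of the logistic loss: (sigmoid(w.x) - y) * x
        z = sum(wi * xi for wi, xi in zip(w, x))
        pred = 1.0 / (1.0 + math.exp(-z))
        grad = [(pred - y) * xi for xi in x]

        # Clip this example's gradient so its L2 norm is at most clip_norm.
        norm = math.sqrt(sum(g * g for g in grad))
        scale = min(1.0, clip_norm / max(norm, 1e-12))
        summed = [s + g * scale for s, g in zip(summed, grad)]

    # Add Gaussian noise with standard deviation noise_multiplier * clip_norm
    # to the summed clipped gradients, then average over the batch.
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [wi - lr * n / len(batch) for wi, n in zip(w, noisy)]


rng = random.Random(0)
batch = [([1.0, 2.0], 1), ([0.5, -1.0], 0), ([3.0, 0.1], 1)]
w = dp_sgd_step([0.0, 0.0], batch, lr=0.1, clip_norm=1.0,
                noise_multiplier=1.1, rng=rng)
```

A larger `noise_multiplier` gives a smaller epsilon for the same number of steps at the cost of accuracy, which is why the budget allocation has to reflect how sensitive the application domain is.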
At deployment time, the standard recommends inference privacy controls including output perturbation for API endpoints, query rate limiting to prevent extraction attacks, and confidence score masking to reduce information leakage. For models deployed in regulated domains such as healthcare and finance, it encourages on-device inference where feasible, so that raw PII does not have to be transmitted to cloud-based inference endpoints.
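Two of these deployment controls, confidence score masking and query rate limiting, can be sketched as follows. This is an illustrative sketch rather than a prescribed design: the function `mask_confidences`, the class `SlidingWindowRateLimiter`, the rounding precision, and the query limits are all hypothetical, and a real service would enforce limits in shared storage rather than in process memory.

```python
import time
from collections import defaultdict, deque


def mask_confidences(probabilities, labels, decimals=1):
    """Return only the top label with a coarsely rounded score,
    instead of the full probability vector."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return {"label": labels[best],
            "confidence": round(probabilities[best], decimals)}


class SlidingWindowRateLimiter:
    """Allow at most max_queries per client within window_seconds."""

    def __init__(self, max_queries, window_seconds, clock=time.monotonic):
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._history = defaultdict(deque)

    def allow(self, client_id):
        now = self.clock()
        history = self._history[client_id]
        # Drop timestamps that have fallen out of the window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_queries:
            return False
        history.append(now)
        return True


limiter = SlidingWindowRateLimiter(max_queries=100, window_seconds=60)
if limiter.allow("client-42"):
    response = mask_confidences([0.07, 0.81, 0.12], ["cat", "dog", "fox"])
```

Returning a single rounded score gives a caller less signal for membership or attribute inference than the full probability vector, and the per-client limit raises the cost of the many queries an extraction attack needs.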
The standard addresses the organizational and governance dimensions of AI privacy, emphasizing that technical controls alone are insufficient without proper governance structures. It recommends establishing an AI privacy review board that evaluates new AI use cases before deployment, conducts privacy impact assessments specific to AI characteristics (model reversibility, data retention in parameters, inference leakage potential), and maintains an inventory of AI systems with PII processing classifications. The standard also provides guidance on transparency obligations, including model cards and dataset documentation that disclose privacy-relevant characteristics such as training data sources, de-identification methods applied, privacy budget consumed, and known limitations regarding re-identification risk.
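An inventory entry that carries the privacy-relevant disclosures described above could be modeled as a small record type. The sketch below is one possible shape, not a schema defined by the standard: the class names `PIIClass` and `AISystemRecord`, the classification levels, and every field name are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class PIIClass(Enum):
    """Hypothetical PII processing classifications."""
    NONE = "none"
    INDIRECT = "indirect"
    DIRECT = "direct"
    SPECIAL_CATEGORY = "special_category"


@dataclass
class AISystemRecord:
    """One inventory entry with model-card style privacy disclosures."""
    name: str
    pii_class: PIIClass
    training_data_sources: List[str]
    deidentification_methods: List[str]
    epsilon_consumed: Optional[float] = None  # None if no DP training was used
    known_limitations: List[str] = field(default_factory=list)
    review_approved: bool = False

    def needs_review(self) -> bool:
        # Any system that processes PII must pass the review board
        # before deployment (an assumed policy for this sketch).
        return self.pii_class is not PIIClass.NONE and not self.review_approved


record = AISystemRecord(
    name="triage-classifier",
    pii_class=PIIClass.SPECIAL_CATEGORY,
    training_data_sources=["hospital intake forms (hypothetical)"],
    deidentification_methods=["direct identifiers removed"],
    epsilon_consumed=3.0,
    known_limitations=["re-identification risk not assessed for rare diagnoses"],
)
```

Keeping the review status and the disclosures in the same record lets a deployment pipeline refuse to ship any system for which `needs_review()` is still true.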