ISO/IEC 25059 — Quality Model for AI Systems

Software engineering — SQuaRE — Quality model for AI systems (ISO/IEC 25059:2023)

Introduction to ISO/IEC 25059

ISO/IEC 25059:2023 is an application-specific extension to the SQuaRE (Systems and software Quality Requirements and Evaluation) series, providing a dedicated quality model for AI systems. As artificial intelligence and machine learning systems become embedded in critical infrastructure, healthcare diagnostics, autonomous vehicles, and financial decision-making, organizations need a systematic framework for evaluating their quality. Unlike conventional software, AI systems can behave probabilistically, may adapt during operation, and, where continuous learning is used, can return different outputs for the same input over time. These characteristics call for an extended quality model that addresses properties such as transparency, robustness, functional adaptability, and societal risk mitigation.

ISO/IEC 25059 extends ISO/IEC 25010 by adding AI-specific sub-characteristics while keeping the structure of the existing SQuaRE framework. This allows organizations to integrate AI quality evaluation into the software quality management processes they already use.

AI System Product Quality Model

The product quality model defined in ISO/IEC 25059 builds upon the eight primary characteristics of ISO/IEC 25010 — functional suitability, performance efficiency, compatibility, usability, reliability, security, maintainability, and portability — while introducing new and modified sub-characteristics specifically tailored for AI systems.

Characteristic | Sub-Characteristic | Type | Description
Functional Suitability | Functional Adaptability | New | Ability to accurately acquire information from data or previous actions and use it in future predictions
Functional Suitability | Functional Correctness | Modified | Provides correct results with needed precision; AI systems typically do not guarantee 100% correctness
Usability | User Controllability | New | Degree to which a user can appropriately intervene in AI system functioning in a timely manner
Usability | Transparency | New | Degree to which appropriate information about the AI system is communicated to stakeholders
Reliability | Robustness | New | Ability to maintain functional correctness under any circumstances, including adversarial inputs
Security | Intervenability | New | Degree to which an operator can intervene to prevent harm or hazard

Functional Adaptability is particularly noteworthy because it captures the ability of AI systems to learn and adapt. Whereas conventional software applies a fixed function and produces deterministic outputs, an AI system can change its behavior as it sees new data. This brings both opportunity and risk: higher adaptability can improve outcomes, but it can also reinforce negative human cognitive biases when uncertain decision paths are strengthened simply because they were chosen before.
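To make the idea of adaptation concrete, here is a minimal sketch of a model whose answer for the same input changes after it has seen more data. The class `OnlineThresholdClassifier`, its `margin` parameter, and the sample values are hypothetical and chosen only for illustration; nothing here is prescribed by ISO/IEC 25059.

```python
from dataclasses import dataclass


@dataclass
class OnlineThresholdClassifier:
    """Toy adaptive model (hypothetical): flags a value as anomalous when it
    exceeds the running mean of all observed values by more than `margin`."""
    margin: float = 5.0
    count: int = 0
    mean: float = 0.0

    def predict(self, x: float) -> bool:
        # Before any observation there is no baseline, so nothing is flagged.
        if self.count == 0:
            return False
        return x > self.mean + self.margin

    def update(self, x: float) -> None:
        # Incremental mean: every observation shifts the decision boundary.
        self.count += 1
        self.mean += (x - self.mean) / self.count


model = OnlineThresholdClassifier()

# First batch of observations, centred on 10.
for value in [10.0, 11.0, 9.0, 10.0]:
    model.update(value)
before = model.predict(18.0)

# Second batch, centred on 20: the baseline drifts upward.
for value in [20.0, 21.0, 19.0, 20.0]:
    model.update(value)
after = model.predict(18.0)

print(before, after)
```

In this sketch the input 18.0 is flagged after the first batch and no longer flagged after the second, because the learned baseline has moved. The same mechanism that lets the model track a changing environment also lets it quietly normalize values it once treated as unusual, which is the kind of risk the paragraph above describes.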

Robustness addresses the critical concern of maintaining performance under unseen, biased, adversarial, or invalid data inputs. This is essential for safety-critical applications where system failure could have severe consequences. The standard specifically links robustness to functional safety requirements, referencing ISO/IEC TR 5469 for AI-specific functional safety guidance.

A key engineering insight: robustness and functional correctness often trade off against each other. Research cited in the standard (Zhang et al., 2019) characterizes this trade-off theoretically, showing that improving robustness can reduce accuracy, and vice versa. Engineers must balance these competing objectives according to the specific application context.
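One way to see this trade-off in practice is to report accuracy on clean inputs alongside accuracy under bounded input perturbation. The sketch below does this for a toy one-dimensional threshold model. The data, the two thresholds, and the perturbation size `eps` are made up for illustration, and the code is not the method of Zhang et al. or anything defined by the standard.

```python
def make_classifier(threshold: float):
    """Hypothetical stand-in model: predicts 1 when x exceeds the threshold."""
    def predict(x: float) -> int:
        return 1 if x > threshold else 0
    return predict


def accuracy(samples, predict) -> float:
    """Share of clean samples classified correctly (functional correctness)."""
    return sum(predict(x) == y for x, y in samples) / len(samples)


def robust_accuracy(samples, predict, eps: float) -> float:
    """Share of samples still classified correctly after shifting the input
    by eps in either direction. For a threshold model these two shifts are
    the worst cases within [x - eps, x + eps]."""
    return sum(
        predict(x - eps) == y and predict(x + eps) == y for x, y in samples
    ) / len(samples)


# Made-up labelled data: (feature, label).
samples = [
    (-2.0, 0), (-1.2, 0), (-0.6, 0), (-0.2, 0), (0.1, 0),
    (0.3, 1), (0.7, 1), (1.1, 1), (1.8, 1), (2.4, 1),
]

results = {}
for threshold in (0.0, 0.2):
    predict = make_classifier(threshold)
    results[threshold] = (
        accuracy(samples, predict),
        robust_accuracy(samples, predict, eps=0.5),
    )
    print(threshold, results[threshold])
```

On this toy data, the threshold 0.2 scores 1.0 on clean inputs but 0.6 under perturbation, while the threshold 0.0 scores 0.9 on clean inputs and 0.7 under perturbation. Neither choice is better on both measures, so the decision depends on how much weight the application places on each.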

Quality in Use Model for AI Systems

Beyond the product quality perspective, ISO/IEC 25059 extends the quality in use model to address how AI systems interact with their environment and stakeholders. The most significant addition is Societal and Ethical Risk Mitigation, a new sub-characteristic under freedom from risk. This encompasses accountability, fairness and non-discrimination, transparency and explainability, professional responsibility, promotion of human values, privacy, human control of technology, and environmental sustainability.

The standard recognizes that AI systems can have far-reaching societal impacts that go beyond traditional software quality concerns. For example, a biased hiring algorithm could perpetuate systemic discrimination, while an opaque credit-scoring system could deny services without explanation. The quality in use model therefore considers not just whether the system meets its technical specifications, but whether it operates in a manner consistent with societal values and ethical principles.

Practical approach: ISO/IEC 25059 recommends combining quality-based and risk-based approaches (see Annex B). The risk-based approach, aligned with ISO 31000 and ISO/IEC 23894, allows organizations to address quality characteristics for which direct measures are not yet established — which is common for emerging AI technologies where measurement methodologies are still evolving.

Engineering Design Insights

From an engineering perspective, implementing the ISO/IEC 25059 quality model raises several practical considerations:

1. Transparency by Design: AI systems should be architected with built-in logging and introspection capabilities. Every data transformation step, model inference, and decision pathway should be traceable. The standard recommends documenting system decomposition, ML models used, training and validation data, performance benchmarks, and management practices. This level of transparency directly enables debugging, auditing, and regulatory compliance.

2. User Controllability and Intervenability: Systems must provide mechanisms for human operators to monitor, interrupt, and override AI decisions. This goes beyond simple kill-switches — it requires meaningful state observation and the ability to transition from unsafe to safe states. For example, an autonomous vehicle should not only allow the driver to take control but should also clearly communicate its current state and intended actions.

3. Measuring Correctness in Probabilistic Systems: Traditional software can be verified against binary pass/fail criteria, but AI systems require statistical performance evaluation. The standard references ISO/IEC TS 4213 for ML classification performance assessment methodologies, including precision, recall, F1-score, and confusion matrix analysis.
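The classification measures named in the third consideration can be computed directly from confusion matrix counts. The following is a minimal sketch for the binary case using only the standard library. The function names and the example labels are hypothetical, and ISO/IEC TS 4213 should be consulted for the authoritative definitions and reporting guidance.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1) -> Counter:
    """Count true/false positives and negatives for a binary task."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def precision_recall_f1(counts: Counter):
    """Derive precision, recall, and F1 from confusion counts.
    Each measure falls back to 0.0 when its denominator is zero."""
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Made-up evaluation data: 4 positive and 6 negative examples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

counts = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(counts)
print(dict(counts), precision, recall, f1)
```

Here the counts are 3 true positives, 1 false positive, 1 false negative, and 5 true negatives, giving precision, recall, and F1 of 0.75 each. Reporting the raw counts next to the derived scores keeps the evaluation auditable, which also supports the transparency consideration.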

Quality Aspect | Traditional Software | AI System
Behavior | Deterministic, repeatable | Probabilistic, adaptive
Correctness | Binary (pass/fail) | Statistical (confidence intervals)
Failure Mode | Bugs, crashes | Bias, drift, adversarial vulnerability
Verification | Formal methods, testing | Validation datasets, continuous monitoring
Quality Evolution | Stable after release | May degrade or improve post-deployment
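The "Statistical (confidence intervals)" entry in the comparison above can be illustrated with a short calculation. The sketch below computes a Wilson score interval for an accuracy measured on a validation set. The choice of the Wilson interval, the function name, and the example counts are assumptions made for illustration; the standard does not mandate a particular interval method.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion.
    z = 1.96 corresponds to roughly 95% confidence."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials))
        / denom
    )
    return centre - half_width, centre + half_width


# Hypothetical result: 90 correct predictions out of 100 validation samples.
low, high = wilson_interval(successes=90, trials=100)
print(f"accuracy 0.90, interval [{low:.3f}, {high:.3f}]")

# The same observed accuracy on a larger validation set is a stronger claim.
low_big, high_big = wilson_interval(successes=900, trials=1000)
print(f"accuracy 0.90, interval [{low_big:.3f}, {high_big:.3f}]")
```

With 100 samples the interval spans roughly 0.83 to 0.94, so a requirement such as "accuracy of at least 0.90" cannot be confirmed from the point estimate alone. Stating the interval, or the sample size behind it, makes a correctness claim about a probabilistic system testable.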

Frequently Asked Questions

Q1: How does ISO/IEC 25059 relate to ISO/IEC 25010?
ISO/IEC 25059 is an application-specific extension of ISO/IEC 25010 for AI systems. It inherits all characteristics and sub-characteristics from ISO/IEC 25010 while adding new AI-specific sub-characteristics (functional adaptability, user controllability, transparency, robustness, intervenability) and modifying existing ones (functional correctness) to account for the unique properties of AI systems.

Q2: What is the difference between transparency and explainability in the standard?
Transparency is defined as the degree to which appropriate information about the AI system is communicated to stakeholders; it is a property of the system and its documentation. Explainability, while related, focuses on the ability to provide understandable reasons for specific decisions. Transparency enables explainability by providing the necessary information about system internals.

Q3: Can ISO/IEC 25059 be applied to all types of AI systems?
Yes, the quality model is designed to be technology-neutral and applicable across various AI approaches including machine learning, symbolic reasoning, and hybrid systems. However, the specific measures and evaluation methods may need to be tailored to the particular AI technology and application domain.

Q4: How should engineers handle trade-offs between competing quality characteristics?
The standard acknowledges trade-offs (e.g., between robustness and functional correctness) and recommends a risk-based approach (Annex B) to balance competing objectives. Engineers should use the quality model to identify relevant characteristics, then apply risk management techniques from ISO/IEC 23894 to prioritize and make informed decisions based on the specific application context and risk tolerance.
