ISO/IEC 25066 — Common Industry Format (CIF) for Usability Evaluation Reports

ISO/IEC 25066:2016 defines the Common Industry Format (CIF) for usability evaluation reports as part of the SQuaRE (Systems and software Quality Requirements and Evaluation) framework. It provides a standardized structure for documenting usability evaluation results, making it possible to compare findings across studies, products, and organizations. For UX engineers and quality assurance teams, adopting the CIF means fewer ambiguities in handoffs and more reproducible evaluation workflows.

The CIF is designed to complement ISO 9241-11 (usability definitions) and ISO/IEC 25062 (usability test reports). ISO/IEC 25066 specifically targets evaluation reports that include both quantitative metrics and qualitative observations.

1. Core Structure of the CIF Evaluation Report

The CIF mandates a report structure with seven major sections: (1) executive summary, (2) product description, (3) evaluation context, (4) evaluation methodology, (5) data analysis and results, (6) findings and recommendations, and (7) appendices. Each section serves a distinct role and collectively ensures that a third party can reproduce the evaluation without referring to external documents.

From a design engineering perspective, the most critical section is “evaluation methodology.” It must specify the participant profile (sample size, inclusion criteria, domain experience), the tasks selected for testing, the test environment (laboratory, remote, field), and the metrics collected. Without this detail, the numerical results are essentially meaningless — a 90% task-completion rate in a lab with expert users is not comparable to the same number collected from novice users in the field.
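The methodology elements listed above can be captured as a small structured record, so that gaps are caught before results are compared. This is a minimal sketch, not a schema taken from the standard: the class names, field names, and the `comparable` rule are all assumptions made for illustration.

```python
from dataclasses import dataclass

ENVIRONMENTS = {"laboratory", "remote", "field"}


@dataclass
class ParticipantProfile:
    sample_size: int
    inclusion_criteria: list[str]
    domain_experience: str  # e.g. "novice" or "expert" (hypothetical labels)


@dataclass
class Methodology:
    participants: ParticipantProfile
    tasks: list[str]
    environment: str
    metrics: list[str]

    def missing_details(self) -> list[str]:
        """Return the names of methodology elements that are empty or invalid."""
        gaps = []
        if self.participants.sample_size <= 0:
            gaps.append("sample_size")
        if not self.participants.inclusion_criteria:
            gaps.append("inclusion_criteria")
        if not self.participants.domain_experience:
            gaps.append("domain_experience")
        if not self.tasks:
            gaps.append("tasks")
        if self.environment not in ENVIRONMENTS:
            gaps.append("environment")
        if not self.metrics:
            gaps.append("metrics")
        return gaps


def comparable(a: Methodology, b: Methodology) -> bool:
    """Treat two studies as comparable only if user experience and environment match."""
    return (
        a.participants.domain_experience == b.participants.domain_experience
        and a.environment == b.environment
    )


lab = Methodology(
    ParticipantProfile(12, ["uses the product weekly"], "expert"),
    ["complete checkout"], "laboratory", ["task completion"],
)
field_study = Methodology(
    ParticipantProfile(12, [], "novice"),
    ["complete checkout"], "field", ["task completion"],
)
print(field_study.missing_details())
print(comparable(lab, field_study))
```

With this check in place, a 90% completion rate from the lab study would not be set side by side with the field study without an explicit caveat.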

Section	Required Elements	Engineering Value
Executive Summary	Goals, key findings, severity ratings	Quick stakeholder alignment
Product Description	Target user profile, key functions, hardware/software context	Establishes scope boundaries
Evaluation Context	Use scenarios, environmental conditions, constraints	Enables reproducibility
Methodology	Participant criteria, task list, metrics, data collection tools	Core of scientific validity
Results	Effectiveness, efficiency, satisfaction data	Quantitative evidence base
Findings & Recommendations	Root cause analysis, prioritized fixes	Actionable engineering output
Appendices	Raw data, consent forms, task scripts	Audit trail
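A report draft can be checked against the section list in the table before it goes out for review. The sketch below assumes the draft is held as a dictionary mapping section titles to text. That representation and the function name `missing_sections` are illustrative choices, not something the CIF defines.

```python
# Section titles as they appear in the table above.
REQUIRED_SECTIONS = [
    "Executive Summary",
    "Product Description",
    "Evaluation Context",
    "Methodology",
    "Results",
    "Findings & Recommendations",
    "Appendices",
]


def missing_sections(report: dict[str, str]) -> list[str]:
    """List required sections that are absent or contain only whitespace."""
    return [name for name in REQUIRED_SECTIONS if not report.get(name, "").strip()]


draft = {
    "Executive Summary": "Goals and key findings.",
    "Product Description": "Infusion pump touchscreen UI.",
    "Methodology": "   ",
    "Results": "Completion rate by task.",
}
print(missing_sections(draft))
```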

A common pitfall: omitting the participant screening criteria. Without documenting how participants were screened (e.g., domain familiarity, technical literacy), the report cannot be used for regulatory submission or cross-study meta-analysis.

2. Three Pillars of Usability Metrics

The CIF organizes usability measurement around three pillars defined in ISO 9241-11: effectiveness, efficiency, and satisfaction. Effectiveness is captured through task completion rates, error counts, and help-system invocations. Efficiency is typically measured as time-on-task or clicks-per-task. Satisfaction is gathered via standardized questionnaires such as SUS (System Usability Scale), QUIS, or custom Likert scales.
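As a rough illustration, one metric per pillar can be computed as follows. The SUS function follows the standard published scoring rule: odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5. The function names and input shapes are assumptions for this sketch.

```python
from statistics import mean


def completion_rate(outcomes: list[bool]) -> float:
    """Effectiveness: share of task attempts that were completed."""
    return sum(outcomes) / len(outcomes)


def mean_time_on_task(seconds: list[float]) -> float:
    """Efficiency: average time-on-task in seconds."""
    return mean(seconds)


def sus_score(responses: list[int]) -> float:
    """Satisfaction: SUS score (0-100) from ten responses on a 1-5 scale."""
    if len(responses) != 10 or any(r < 1 or r > 5 for r in responses):
        raise ValueError("SUS needs exactly ten responses, each from 1 to 5")
    odd_items = sum(r - 1 for r in responses[0::2])   # items 1, 3, 5, 7, 9
    even_items = sum(5 - r for r in responses[1::2])  # items 2, 4, 6, 8, 10
    return (odd_items + even_items) * 2.5


print(completion_rate([True, True, False, True]))  # 0.75
print(mean_time_on_task([30.0, 50.0]))             # 40.0
print(sus_score([3] * 10))                         # 50.0
```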

An engineering insight worth emphasizing: the CIF does not prescribe a single metric for any pillar. Instead, it encourages evaluators to select metrics that are most sensitive to the product’s risk profile. For a medical device UI, error severity may outweigh time-on-task; for an e-commerce checkout flow, task-completion rate is king. Defining these priorities before data collection avoids post-hoc cherry-picking.

Recommended Metrics by Context

Use Context	Primary Metric	Secondary Metric	Minimum Sample
Medical device UI	Error severity (count × harm level)	Task completion	15–20 per segment
Consumer mobile app	Time-on-task	SUS score	12 per segment
Enterprise dashboard	Task completion rate	Clicks-per-task	8–10 per segment
Safety-critical HMI	Response time deviation	Error rate	20+ per segment
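The "count × harm level" measure in the first row can be sketched as a weighted sum over error types. The harm scale (here 1 to 4) and the example error names are hypothetical. A real study would take the scale from its own risk analysis.

```python
def error_severity_score(errors: dict[str, tuple[int, int]]) -> int:
    """Sum count x harm level over all error types.

    `errors` maps an error name to (count, harm_level).
    """
    return sum(count * harm for count, harm in errors.values())


def worst_first(errors: dict[str, tuple[int, int]]) -> list[str]:
    """Order error types by their individual count x harm contribution."""
    return sorted(errors, key=lambda name: errors[name][0] * errors[name][1], reverse=True)


observed = {
    "wrong dose unit selected": (2, 4),   # rare but high harm
    "missed confirmation step": (5, 1),   # frequent but low harm
}
print(error_severity_score(observed))  # 13
print(worst_first(observed))
```

In this example the rare high-harm error outranks the frequent low-harm one, which is the behaviour a raw error count would hide.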

In a controlled trial at a major medical device manufacturer, a usability team that adopted the CIF structure cut report creation time by 40% and raised cross-team readability scores by 2.1 points on a 5-point scale (Nielsen Norman Group case study, 2020).

3. Engineering Design Insights and Practical Application

Integrating the CIF into a CI/CD pipeline is an emerging practice. By instrumenting formative usability tests with automated logging (clickstreams, session recording, task-timing middleware), teams can generate CIF-compliant report drafts programmatically. The structured nature of the CIF makes it an excellent target for template-driven report generators.
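A template-driven generator of this kind might look like the sketch below, which turns logged task events into a draft results paragraph using Python's standard `string.Template`. The event fields (`participant`, `task`, `completed`, `seconds`) and the template wording are assumptions. An actual pipeline would map them from whatever its logging middleware emits.

```python
from statistics import mean
from string import Template

# Hypothetical draft layout for one task in the results section.
RESULTS_TEMPLATE = Template(
    "Task: $task\n"
    "Participants: $n\n"
    "Completion rate: $rate%\n"
    "Mean time-on-task: $time s\n"
)


def draft_results(task: str, events: list[dict]) -> str:
    """Render a draft results block for one task from logged events."""
    rows = [e for e in events if e["task"] == task]
    if not rows:
        raise ValueError(f"no logged events for task {task!r}")
    completed = sum(1 for e in rows if e["completed"])
    return RESULTS_TEMPLATE.substitute(
        task=task,
        n=len(rows),
        rate=round(100 * completed / len(rows)),
        time=round(mean(e["seconds"] for e in rows), 1),
    )


log = [
    {"participant": "P1", "task": "checkout", "completed": True, "seconds": 40.0},
    {"participant": "P2", "task": "checkout", "completed": False, "seconds": 60.0},
    {"participant": "P3", "task": "search", "completed": True, "seconds": 10.0},
]
print(draft_results("checkout", log))
```

The output is only a draft: the qualitative observations and the interpretation of the numbers still have to be written by the evaluator.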

Another key insight: the CIF’s recommendation section is the primary vehicle for driving design changes. Each recommendation should be tagged with a severity level (critical, major, minor) and linked to specific raw data points. This transforms the evaluation report from a mere record into a traceable requirements document that product owners and developers can act upon.
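One way to enforce this traceability is to refuse any recommendation that lacks a valid severity or a link to evidence. The record layout and the evidence ID format below are hypothetical, and the three severity labels are the ones named above.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"critical": 0, "major": 1, "minor": 2}


@dataclass
class Recommendation:
    summary: str
    severity: str
    evidence_ids: list[str]  # references to raw data points, e.g. session clips

    def __post_init__(self) -> None:
        if self.severity not in SEVERITY_ORDER:
            raise ValueError(f"unknown severity: {self.severity!r}")
        if not self.evidence_ids:
            raise ValueError("a recommendation must cite at least one data point")


def prioritize(recs: list[Recommendation]) -> list[Recommendation]:
    """Order recommendations from critical to minor, keeping input order within a level."""
    return sorted(recs, key=lambda r: SEVERITY_ORDER[r.severity])


backlog = prioritize([
    Recommendation("Enlarge the confirm button", "minor", ["P4-task2-clip"]),
    Recommendation("Show units next to the dose field", "critical", ["P1-task3-err", "P6-task3-err"]),
    Recommendation("Shorten the onboarding flow", "major", ["P2-task1-time"]),
])
print([r.severity for r in backlog])
```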

Never allow the executive summary to be written before the full analysis. The CIF warns that writing the summary first introduces confirmation bias — evaluators tend to seek data that supports the pre-written conclusion. Always analyze data, then summarize.

Frequently Asked Questions

Q: What is the difference between ISO/IEC 25066 and ISO/IEC 25062?
ISO/IEC 25062 specifically addresses usability test reports for measuring effectiveness, efficiency, and satisfaction in a controlled lab setting. ISO/IEC 25066 covers a broader range of evaluation methods — including field studies, expert reviews, and remote testing — and provides a generalized CIF that applies to any usability evaluation method.

Q: Can the CIF be used for agile UX sprints?
Yes. While the full CIF is designed for summative evaluations, teams can adopt a lightweight subset of sections (methodology, results, findings) for sprint-level formative testing. The key is maintaining consistency in metric definitions across sprints so that trend analysis remains valid.

Q: How many participants are required for a CIF-compliant report?
The CIF does not mandate a specific number. It requires you to justify the sample size based on the evaluation goals and the expected effect size. For formative studies, 5–8 participants per segment may suffice; for summative validation, 15–20 per segment is typical. Always cite your power analysis or the established heuristic (e.g., Nielsen’s 5-user rule for formative tests).
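For formative studies, the 5-user heuristic is usually justified with the cumulative problem-discovery model, in which the share of problems found by n participants is 1 - (1 - p)^n. The sketch below assumes p = 0.31, the average per-participant discovery probability commonly cited from Nielsen and Landauer. Your own product may have a very different p, and this model is not a substitute for a power analysis in summative work.

```python
import math


def discovery_rate(p: float, n: int) -> float:
    """Expected share of problems found by n participants, each finding a problem with probability p."""
    return 1 - (1 - p) ** n


def participants_needed(p: float, target: float) -> int:
    """Smallest n for which the expected discovery rate reaches `target`."""
    if not (0 < p < 1 and 0 < target < 1):
        raise ValueError("p and target must lie strictly between 0 and 1")
    return math.ceil(math.log(1 - target) / math.log(1 - p))


print(round(discovery_rate(0.31, 5), 3))   # 0.844
print(participants_needed(0.31, 0.80))     # 5
print(participants_needed(0.31, 0.85))     # 6
```

Reporting the assumed p alongside the chosen n gives readers the sample size justification the answer above asks for.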

Q: Is the CIF applicable to hardware usability evaluations?
Absolutely. The CIF is method-agnostic and has been applied to medical devices, industrial control panels, automotive HMIs, and consumer electronics. The report structure remains the same; only the data collection instruments differ (e.g., video coding of physical interactions instead of clickstream logging).
