IEC 62731-2018: Text-to-Speech for Television General Requirements

Making TV Accessible for Visually Impaired Users Through Standardized TTS Functionality

1. Introduction and Scope

IEC 62731-2018, prepared by IEC TC 100 (Audio, video and multimedia systems and equipment), specifies the text-to-speech (TTS) functionality for digital television receivers. This second edition supersedes the 2013 edition and adds significant improvements, including network-based updates to TTS pronunciation dictionaries and enhanced announcement quality levels. The standard applies to stationary and semi-stationary digital TV receivers such as set-top boxes, integrated digital TVs, and recorders whose primary function is TV reception. It does NOT apply to products where TV is a secondary function, such as PCs or game consoles with TV tuners.

Tip: The standard recognizes two possible system architectures: a single device with integrated TTS generation, or a two-device configuration where the TV interfaces with an external TTS device. This flexibility allows manufacturers to implement cost-effective solutions using existing external accessibility aids.

2. User Requirements for Visually Impaired Viewers

2.1 Core User Needs

The standard identifies five key areas of user requirements based on research with visually impaired television viewers:

  • Navigating channels — announcing channel names, numbers, and current program information
  • Navigating TV inputs — identifying active HDMI inputs, AV sources, and connected devices
  • Additional data services — reading teletext, subtitles, program guides (EPG), and interactive content
  • Operating the TV — providing audible feedback for menu navigation, volume control, and settings adjustments
  • TV use — managing recordings, pause/live TV functions, and catch-up services

2.2 Context-Aware Announcements

A key innovation in the standard is the concept of context-aware TTS. The TV system recognizes which context the user is in (watching TV, browsing EPG, adjusting settings) and provides appropriate audio feedback. For example, when changing channels, the TTS should announce the channel name, program title, and start/end time, while in the EPG context, it should announce program descriptions and scheduling information.
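The context-dependent selection described above can be sketched as a lookup from context to the fields that get spoken. This is a minimal illustration, not the standard's normative mapping: the `Context` values, the `ProgramInfo` fields, and the `FIELDS_BY_CONTEXT` table are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Context(Enum):
    LIVE_TV = auto()
    EPG = auto()
    SETTINGS = auto()


@dataclass
class ProgramInfo:
    channel_name: str
    title: str
    start: str
    end: str
    description: str = ""


# Hypothetical field selection per context (an assumption for illustration;
# the standard defines the actual required information elements).
FIELDS_BY_CONTEXT = {
    Context.LIVE_TV: ("channel_name", "title", "start", "end"),
    Context.EPG: ("title", "start", "end", "description"),
}


def build_announcement(context: Context, info: ProgramInfo) -> str:
    """Join the non-empty fields configured for this context into one string."""
    fields = FIELDS_BY_CONTEXT.get(context, ())
    parts = [getattr(info, name) for name in fields if getattr(info, name)]
    return ", ".join(parts)
```

With this shape, the same `ProgramInfo` record yields a short announcement on channel change and a longer one in the EPG, and adding a new context only requires a new table entry.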

Important: The standard specifies that priority audio information (e.g., emergency alerts) must interrupt normal TTS output immediately. This ensures that visually impaired users receive critical safety information without delay.
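One possible way to model this interruption behavior is a small speech queue in which a priority utterance replaces whatever is currently being spoken. The `Utterance` and `SpeechQueue` names are assumptions made for this sketch, and the policy for what happens to the interrupted utterance (here it is discarded and returned to the caller) is a design choice, not something taken from the standard.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Utterance:
    text: str
    priority: bool = False


class SpeechQueue:
    """Hypothetical queue: priority utterances pre-empt normal TTS output."""

    def __init__(self) -> None:
        self.current: Optional[Utterance] = None
        self.pending: deque = deque()

    def submit(self, utterance: Utterance) -> Optional[Utterance]:
        """Queue an utterance; return the one it interrupted, if any."""
        if utterance.priority and (self.current is None or not self.current.priority):
            interrupted = self.current
            self.current = utterance  # takes over immediately
            return interrupted
        if self.current is None:
            self.current = utterance
        elif utterance.priority:
            # Another priority message is speaking: wait behind queued priority items.
            index = sum(1 for u in self.pending if u.priority)
            self.pending.insert(index, utterance)
        else:
            self.pending.append(utterance)
        return None

    def finish_current(self) -> Optional[Utterance]:
        """Call when playback ends; promotes the next pending utterance."""
        self.current = self.pending.popleft() if self.pending else None
        return self.current
```

A real receiver would also need to stop audio playback when `submit` returns an interrupted utterance; that part depends on the audio pipeline and is left out here.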

3. TTS Profiles and Functional Requirements

IEC 62731 defines three distinct TTS profiles with increasing levels of functionality:

| Profile | Level | Features | Target Use Case |
|---|---|---|---|
| Basic | Entry | Channel change announcement, volume indication, simple menu navigation | Low-cost receivers, basic accessibility |
| Main | Standard | Full EPG reading, program information, subtitle rendering, input switching | Mainstream TVs with accessibility focus |
| Enhanced | Advanced | All Main features plus interactive services, smart TV apps, advanced navigation | Premium smart TVs, comprehensive accessibility |
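As a rough sketch of how firmware might gate features by profile, the table above can be encoded as a set of features added at each level. The feature identifiers are hypothetical, and the code assumes the profiles are cumulative (each level includes everything below it), which the table states for Enhanced and which is inferred here for Main.

```python
from enum import IntEnum


class Profile(IntEnum):
    BASIC = 1
    MAIN = 2
    ENHANCED = 3


# Features first introduced at each level, named after the table entries.
# The identifiers themselves are illustrative, not defined by the standard.
FEATURES_ADDED = {
    Profile.BASIC: {"channel_change", "volume_indication", "simple_menu_navigation"},
    Profile.MAIN: {"epg_reading", "program_information", "subtitle_rendering", "input_switching"},
    Profile.ENHANCED: {"interactive_services", "smart_tv_apps", "advanced_navigation"},
}


def features_for(profile: Profile) -> set:
    """Return every feature available at `profile`, assuming cumulative levels."""
    result = set()
    for level in Profile:
        if level <= profile:
            result |= FEATURES_ADDED[level]
    return result


def supports(profile: Profile, feature: str) -> bool:
    return feature in features_for(profile)
```

For example, `supports(Profile.BASIC, "epg_reading")` is false under these assumptions, while `supports(Profile.ENHANCED, "volume_indication")` is true.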

3.1 Functional Requirements for TTS Device/Engine

The standard specifies that the TTS engine must:

  • Support the language of the user interface and broadcast content
  • Provide adjustable speech rate (typically 50-200% of normal speed)
  • Support voice selection (male/female where available)
  • Handle numbers, abbreviations, and acronyms correctly
  • Provide word-by-word or character-by-character reading for text input fields

Engineering Insight: The 2018 edition introduces a mechanism for the TV to receive updated pronunciation dictionaries and conversion rules via network connection. This is a significant improvement because it allows correction of incorrectly pronounced proper names, brand names, and foreign-language terms without firmware updates. Engineers should design the TTS system with a network-updatable pronunciation database and version management.
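A versioned pronunciation database along these lines might look like the sketch below. It covers only the local side: the update payload (a version number plus a word-to-respelling dictionary) is an assumed format, and the network transport, authentication, and persistence are deliberately omitted. The class and method names are hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class PronunciationDB:
    """Hypothetical store of word respellings applied before synthesis."""

    version: int = 0
    entries: dict = field(default_factory=dict)

    def apply_update(self, version: int, entries: dict) -> bool:
        """Merge an update only if it is newer than the installed version."""
        if version <= self.version:
            return False  # stale or duplicate update: ignore it
        self.entries.update({word.lower(): spoken for word, spoken in entries.items()})
        self.version = version
        return True

    def normalize(self, text: str) -> str:
        """Replace known words (case-insensitively) with their respelling."""

        def swap(match: re.Match) -> str:
            word = match.group(0)
            return self.entries.get(word.lower(), word)

        return re.sub(r"[A-Za-z0-9']+", swap, text)
```

After `apply_update(1, {"HDMI": "H D M I"})`, calling `normalize("Switching to HDMI 2")` returns `"Switching to H D M I 2"`. Rejecting updates whose version is not strictly newer keeps a delayed or replayed download from rolling the dictionary back.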

4. TV Events and TTS Data Mapping

The standard defines a comprehensive mapping between TV events and the corresponding TTS announcement data. The events covered include channel change, program start/end, EPG navigation, menu selection, pop-up messages, input source switching, and context switches. For each event, the standard specifies which information elements must be included in the TTS output.

Critical Implementation Note: One of the most challenging aspects of TTS integration is handling the context switch — when the user moves between different TV functions (e.g., from live TV to the EPG). The standard requires that the TTS announce the new context clearly before providing detailed information. Engineers must implement a state machine that tracks the current context and filters TTS data accordingly to avoid information overload.
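The context-tracking behavior described above can be sketched as a small tracker that names the new context before the detail and drops events from inactive contexts. The `Event` fields, the context labels, and the rule that out-of-context events are silenced are assumptions for this example; a real implementation would follow the event definitions in the standard.

```python
from dataclasses import dataclass


@dataclass
class Event:
    context: str             # context the event belongs to, e.g. "Live TV"
    detail: str              # text to be spoken for this event
    is_switch: bool = False  # True when the user enters a new context


class ContextTracker:
    """Hypothetical tracker that orders and filters TTS output by context."""

    def __init__(self, initial: str = "Live TV") -> None:
        self.context = initial

    def handle(self, event: Event) -> list:
        """Return the utterances to speak for this event, in order."""
        if event.is_switch:
            self.context = event.context
            # Announce the new context first, then the detailed information.
            return [event.context, event.detail]
        if event.context != self.context:
            # Event from an inactive context: stay silent to avoid overload.
            return []
        return [event.detail]
```

For instance, after a switch event into a "Programme guide" context, a later non-switch event tagged "Live TV" produces no speech until the user returns to live TV.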

5. Frequently Asked Questions

Q1: Does IEC 62731 apply to streaming-only devices or just broadcast TV?

A: The standard applies to receivers whose primary function is to receive TV content. While the focus is on broadcast digital TV, the TTS principles and profiles are applicable to streaming and hybrid devices. However, PCs and game consoles with secondary TV capability are explicitly excluded.

Q2: What languages must the TTS system support?

A: The standard requires support for the language of the user interface and broadcast content but does not mandate specific languages. In practice, manufacturers implement TTS for the languages used in their target markets. The network-updatable pronunciation feature introduced in the 2018 edition helps address multilingual content.

Q3: How is subtitle/closed caption TTS handled?

A: Subtitles (closed captions) are treated as TTS data that must be read when the subtitle display is active. The TTS can either read subtitles as they appear or on demand. The standard includes provisions for handling subtitle format transitions, multiple subtitle tracks, and synchronization with program audio.

Q4: What is the difference between TTS audio and priority audio information?

A: TTS audio is the normal speech output generated from on-screen text. Priority audio information relates to emergency alerts and critical system messages that must interrupt normal TTS output. The standard mandates that priority audio takes precedence over all other TTS output and cannot be suppressed by user settings.
