Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
IEC TS 62592, Edition 2.0 (2012-07), is a Technical Specification that provides encoding guidelines for portable multimedia consumer electronics (CE) products using the MP4 file format with AVC (H.264) video coding and AAC audio coding. Built upon the foundation of ISO/IEC 14496-12, ISO/IEC 14496-14, and ISO/IEC 14496-15, this specification addresses the critical engineering challenge of achieving global interoperability across portable devices with limited resources — constrained processing power, storage capacity, and battery life.
IEC TS 62592 specifies operational rules and extensions for the MP4 file format in portable applications. Its requirements fall into four groups: operational rules for the MP4 file format (including box/field settings and box ordering); extensions to the MP4 file format (improved file identification and metadata handling); operational rules for media data and track structure (permitted combinations of audio and video encoding); and other operational rules for interoperability (decoder capabilities and recommended recording modes).
The file structure is based on the ISO Base Media File Format (ISOBMFF), but IEC 62592 constrains and extends it. The specification defines the precise usage of brand identifiers in the file type box (ftyp) — portable players shall recognize and correctly respond to brands ‘mp42’, ‘isom’, and ‘avc1’. Files must contain a moov box (storing metadata) and one or more mdat boxes (holding actual audio/video sample data). For streaming applications, the specification also defines moof/mfra structures for random access information.
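The box layout described above can be inspected with a few lines of Python. This is a minimal sketch, not a conformance checker: it walks only the top-level boxes, and the names `iter_boxes`, `parse_ftyp`, `check_structure`, and `make_box` are hypothetical helpers introduced for illustration. The box header layout (32-bit size, four-character type, optional 64-bit size) follows ISO/IEC 14496-12.

```python
import struct

# Brands named in the text above.
KNOWN_BRANDS = {"mp42", "isom", "avc1"}

def iter_boxes(data, offset=0, end=None):
    """Yield (type, payload_start, payload_end) for each box in data[offset:end]."""
    end = len(data) if end is None else end
    while offset + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:    # a 64-bit size follows the type field
            size = struct.unpack_from(">Q", data, offset + 8)[0]
            header = 16
        elif size == 0:  # box extends to the end of the data
            size = end - offset
        if size < header or offset + size > end:
            raise ValueError(f"malformed box at offset {offset}")
        yield box_type.decode("latin-1"), offset + header, offset + size
        offset += size

def parse_ftyp(payload):
    """Return (major_brand, minor_version, compatible_brands) from an ftyp payload."""
    major = payload[0:4].decode("latin-1")
    minor_version = struct.unpack_from(">I", payload, 4)[0]
    compatible = [payload[i:i + 4].decode("latin-1")
                  for i in range(8, len(payload) - 3, 4)]
    return major, minor_version, compatible

def check_structure(data):
    """Summarize the top-level structure of an MP4 byte string."""
    types, brands = [], set()
    for box_type, start, stop in iter_boxes(data):
        types.append(box_type)
        if box_type == "ftyp":
            major, _minor, compatible = parse_ftyp(data[start:stop])
            brands = {major, *compatible}
    return {
        "has_ftyp": "ftyp" in types,
        "has_moov": "moov" in types,
        "mdat_count": types.count("mdat"),
        "known_brands": sorted(brands & KNOWN_BRANDS),
    }

def make_box(box_type, payload=b""):
    """Build a box with a 32-bit size field (demo helper only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

sample = (make_box(b"ftyp", b"mp42" + struct.pack(">I", 0) + b"mp42isomavc1")
          + make_box(b"moov") + make_box(b"mdat", b"\x00" * 4))
print(check_structure(sample))
```

For a real file, `check_structure(open(path, "rb").read())` would work for small inputs; a production tool would seek past `mdat` rather than load the whole file into memory.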
The AVC (H.264) video layer is tightly constrained in IEC 62592 to match portable device capabilities. The specification limits the Level (typically up to 3.0 or 3.1, depending on target resolution) and the Profile (Baseline, Constrained Baseline, or Main). Supported resolutions are QVGA (320×240), VGA (640×480), WVGA (800×480), SVGA (800×600), and 720p (1280×720). Frame rates are capped at 30 fps, with bitrate limits defined per Level and target resolution.
| Parameter | Value / Range | Constraint Rationale |
|---|---|---|
| Video Codec | AVC (H.264) | Widespread hardware decoding support |
| Profile | Constrained Baseline / Main | Reduced decoding complexity |
| Maximum Level | 3.0 (VGA) / 3.1 (720p) | Limits macroblock processing rate |
| Max Resolution | 1280 × 720 (720p) | Typical portable screen upper bound |
| Max Frame Rate | 30 fps | Balance smoothness and complexity |
| Video Bitrate | 500 kbps to 5 Mbps | Storage and bandwidth optimization |
| GOP Structure | Closed GOP, IDR ≤ 2 sec interval | Enable random access and trick play |
| Reference Frames | Up to 4 frames | Limit decoder buffer requirements |
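
The "limits macroblock processing rate" rationale in the table can be made concrete with a short calculation. The sketch below uses the frame-size (MaxFS) and macroblock-rate (MaxMBPS) limits for Levels 3.0 and 3.1 from Table A-1 of the H.264 specification; the function names are hypothetical, and only the two levels mentioned above are considered.

```python
import math

# MaxFS: macroblocks per frame; MaxMBPS: macroblocks per second (H.264 Table A-1).
LEVEL_LIMITS = {
    "3.0": {"max_fs": 1620, "max_mbps": 40500},
    "3.1": {"max_fs": 3600, "max_mbps": 108000},
}

def macroblocks(width, height):
    """Number of 16x16 macroblocks needed to cover a frame."""
    return math.ceil(width / 16) * math.ceil(height / 16)

def minimum_level(width, height, fps):
    """Return the lowest of the two levels that fits, or None if neither does."""
    mbs = macroblocks(width, height)
    for level in ("3.0", "3.1"):
        limits = LEVEL_LIMITS[level]
        if mbs <= limits["max_fs"] and mbs * fps <= limits["max_mbps"]:
            return level
    return None

for name, (w, h) in {"VGA": (640, 480), "WVGA": (800, 480), "720p": (1280, 720)}.items():
    print(name, macroblocks(w, h), minimum_level(w, h, 30))
```

VGA at 30 fps needs 1200 macroblocks per frame and 36,000 per second, which fits Level 3.0. 720p at 30 fps needs 3600 per frame and 108,000 per second, exactly the Level 3.1 ceiling. WVGA at 30 fps fits the Level 3.0 frame size but exceeds its macroblock rate, so it also needs Level 3.1.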
The AAC audio layer defines three supported coding formats: AAC-LC (Low Complexity), HE-AAC (High Efficiency AAC, i.e., AAC LC + SBR), and HE-AAC v2 (AAC LC + SBR + PS). Sampling rates range from 16 kHz to 48 kHz, with channel configurations supporting mono (1.0) and stereo (2.0). The specification restricts audio bitrates between 48 kbps and 256 kbps, depending on the target audio quality level and encoding format.
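A pre-encode sanity check for these audio limits might look like the following sketch. The ranges are taken from the paragraph above. The function name `check_audio` is hypothetical, and the stereo requirement for HE-AAC v2 is an assumption based on how Parametric Stereo works in general (it reconstructs stereo from a mono core), not a rule quoted from IEC 62592.

```python
SUPPORTED_FORMATS = {"AAC-LC", "HE-AAC", "HE-AAC v2"}

def check_audio(fmt, sample_rate_hz, channels, bitrate_kbps):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if fmt not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {fmt}")
    if not 16000 <= sample_rate_hz <= 48000:
        problems.append(f"sample rate {sample_rate_hz} Hz outside 16-48 kHz")
    if channels not in (1, 2):
        problems.append(f"{channels} channels; only mono and stereo are supported")
    if not 48 <= bitrate_kbps <= 256:
        problems.append(f"bitrate {bitrate_kbps} kbps outside 48-256 kbps")
    # Assumption: Parametric Stereo output is stereo, so HE-AAC v2 implies 2 channels.
    if fmt == "HE-AAC v2" and channels != 2:
        problems.append("HE-AAC v2 (Parametric Stereo) expects stereo output")
    return problems

print(check_audio("AAC-LC", 44100, 2, 128))
print(check_audio("HE-AAC v2", 8000, 1, 32))
```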
Synchronization between audio and video is handled through the timestamp mechanism within the MP4 container. Each sample is associated with a decoding timestamp (DTS) and composition timestamp (CTS), with the timescale and sample_duration fields defining the precise time axis. IEC 62592 requires that audio and video tracks start at the same time (aligned start) with no more than 10 frames of audio pre-roll, ensuring no perceptible lip-sync errors exist at playback initiation.
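To illustrate how the time axis is built, the sketch below derives DTS and CTS values from the run-length tables that ISO/IEC 14496-12 stores in the `stts` (sample deltas) and `ctts` (composition offsets) boxes. The function name `sample_times` and the four-frame example are illustrative assumptions; real files would supply these tables from the parsed `moov` box.

```python
from fractions import Fraction

def sample_times(stts, ctts, timescale):
    """Return a list of (dts, cts) in seconds for each sample.

    stts: list of (sample_count, sample_delta) entries
    ctts: list of (sample_count, sample_offset) entries, or [] if absent
    """
    deltas = [delta for count, delta in stts for _ in range(count)]
    offsets = [off for count, off in ctts for _ in range(count)] or [0] * len(deltas)
    if len(offsets) != len(deltas):
        raise ValueError("stts and ctts describe different sample counts")
    times, dts = [], 0
    for delta, offset in zip(deltas, offsets):
        # CTS = DTS + composition offset, both in timescale units.
        times.append((Fraction(dts, timescale), Fraction(dts + offset, timescale)))
        dts += delta
    return times

# Hypothetical 30 fps clip in decode order I, P, B, B (displayed as I, B, B, P).
for dts, cts in sample_times([(4, 1)], [(1, 1), (1, 3), (2, 0)], timescale=30):
    print(f"dts={float(dts):.3f}s cts={float(cts):.3f}s")
```

Using `Fraction` keeps the arithmetic exact, which matters when comparing audio and video start times across tracks with different timescales.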
The specification defines extended metadata handling, including metadata such as title, artist, album, and track number embedded through standard box structures. File naming for portable CE products follows specific conventions to ensure devices correctly identify and support file contents. The specification also defines metadata fields for date, language (via ISO 639-2 codes), and copyright information.
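One place where ISO 639-2 codes appear in an ISO Base Media file is the media header box (`mdhd`), which packs the three-letter code into 15 bits, each letter stored as its ASCII value minus 0x60. The sketch below shows that packing; whether IEC 62592 uses this field or a separate metadata box for its language entry is not stated here, so treat the connection as an assumption. The function names are hypothetical.

```python
def pack_language(code):
    """Pack a three-letter lowercase ISO 639-2 code into a 15-bit integer."""
    if len(code) != 3 or not all("a" <= c <= "z" for c in code):
        raise ValueError("expected three lowercase ASCII letters")
    value = 0
    for c in code:
        value = (value << 5) | (ord(c) - 0x60)  # 5 bits per letter
    return value

def unpack_language(value):
    """Recover the three-letter code from the packed 15-bit value."""
    return "".join(chr(((value >> shift) & 0x1F) + 0x60) for shift in (10, 5, 0))

print(hex(pack_language("und")), unpack_language(0x15C7))
```

The code `und` (undetermined) packs to 0x55C4 and `eng` to 0x15C7, values that are commonly seen when inspecting MP4 files in a hex editor.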
Implementing an IEC 62592-compliant encoder requires careful attention to several ISO file format details. Track timing must be set correctly: if B-frames are present, the video track's edit list (elst) must provide the correct time mapping, because frame reordering delays the first displayed frame; otherwise, players may show the wrong frames during seek operations. Audio tracks must have correct channel layout and sample format settings to ensure proper mono or stereo rendering.
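The edit list mapping for a B-frame stream can be sketched as follows. With frame reordering, the earliest composition time in the track is greater than zero, and a single `elst` entry whose `media_time` equals that earliest CTS shifts the presentation back to start at zero. The function names and the dictionary layout are hypothetical; the field names `segment_duration`, `media_time`, and `media_rate` follow ISO/IEC 14496-12.

```python
from fractions import Fraction

def build_edit(cts, durations, media_timescale, movie_timescale):
    """Build one edit list entry covering the whole track.

    cts and durations are per-sample values in media timescale units.
    segment_duration is expressed in movie timescale units.
    """
    start = min(cts)
    end = max(c + d for c, d in zip(cts, durations))
    segment_duration = round((end - start) * movie_timescale / media_timescale)
    return {"segment_duration": segment_duration, "media_time": start, "media_rate": 1}

def presentation_time(cts_value, edit, media_timescale):
    """Presentation time in seconds of a sample after applying the edit."""
    return Fraction(cts_value - edit["media_time"], media_timescale)

# Hypothetical 30 fps clip, decode order I, P, B, B with CTS values 1, 4, 2, 3.
edit = build_edit([1, 4, 2, 3], [1, 1, 1, 1], media_timescale=30, movie_timescale=600)
print(edit, presentation_time(1, edit, 30))
```

In this example the first displayed frame has CTS 1, so `media_time` is 1 and that frame is presented at time zero.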
Bitstream conformance is critical for compatibility. The Sequence Parameter Set (SPS) and Picture Parameter Set (PPS) for AVC video must be correctly placed in the avcC box and must be consistent with the actual encoded bitstream. The specification requires that certain fields in the SPS (such as pic_order_cnt_type and max_num_ref_frames) strictly adhere to the constrained values.
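A partial consistency check between the `avcC` header and the SPS it carries can be sketched as below. It compares only the fixed-position profile and level bytes and assumes every SPS in the box shares the same profile and level. Fields such as `pic_order_cnt_type` and `max_num_ref_frames` are Exp-Golomb coded and need a full SPS bit parser, which is omitted. The function name `check_avcc` is hypothetical; the byte layout follows the AVCDecoderConfigurationRecord in ISO/IEC 14496-15.

```python
import struct

def check_avcc(avcc):
    """Return a list of problems found in an avcC payload (empty if none)."""
    if len(avcc) < 6:
        raise ValueError("avcC payload too short")
    version, profile, _compat, level = avcc[0], avcc[1], avcc[2], avcc[3]
    problems = []
    if version != 1:
        problems.append(f"configurationVersion is {version}, expected 1")
    num_sps = avcc[5] & 0x1F  # low 5 bits hold the SPS count
    pos = 6
    for index in range(num_sps):
        (length,) = struct.unpack_from(">H", avcc, pos)
        sps = avcc[pos + 2:pos + 2 + length]
        pos += 2 + length
        # Byte 0 is the NAL header; nal_unit_type 7 identifies an SPS.
        if len(sps) < 4 or sps[0] & 0x1F != 7:
            problems.append(f"SPS {index} is missing or not an SPS NAL unit")
            continue
        if sps[1] != profile:
            problems.append(f"SPS {index} profile_idc {sps[1]} != avcC {profile}")
        if sps[3] != level:
            problems.append(f"SPS {index} level_idc {sps[3]} != avcC {level}")
    return problems

# Hypothetical record: Main profile (77), Level 3.1 (31), one truncated SPS.
sps = bytes([0x67, 77, 0x00, 31])
good = bytes([1, 77, 0x00, 31, 0xFF, 0xE1]) + struct.pack(">H", len(sps)) + sps
bad = bytes([1, 77, 0x00, 30, 0xFF, 0xE1]) + struct.pack(">H", len(sps)) + sps
print(check_avcc(good), check_avcc(bad))
```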
Testing and verification is an area of significant focus in IEC 62592. To simplify interoperability testing, the specification provides a file conformance checklist. Testing encoded output with the IEC 62592 reference decoder (or an equivalent commercial product) should be a mandatory step in the product development workflow. Even for compliant encoders, it is recommended to perform actual playback verification on at least two different target devices from different manufacturers.
IEC 62592 directly references ISO/IEC 14496-10 (AVC video), ISO/IEC 14496-3 (AAC audio), ISO/IEC 14496-12 (ISO Base Media File Format), ISO/IEC 14496-14 (MP4 File Format), and ISO/IEC 14496-15 (AVC File Format). Together, these standards form the technical foundation for portable multimedia encoding.
Not entirely. Despite significant increases in processing power, portable devices face new constraints including power/thermal limitations, thinner form factors, and cost optimization. The IEC 62592 parameter sets were carefully chosen to balance file size, quality, decoding complexity, and battery life — all core engineering considerations that remain relevant for any portable product.
IEC 62592 Edition 2.0 was published in 2012, predating the widespread adoption of mainstream HDR video standards. Recent AVC specifications support HDR extensions, but IEC 62592 itself does not address HDR metadata handling. For HDR portable playback, refer to later editions or supplementary industry specifications.
The specification defines upper bounds and constraints, but actual quality depends on the rate control implementation. Two-pass variable bitrate (VBR) encoding is recommended: the first pass analyzes content complexity, and the second allocates bits accordingly within the IEC 62592-specified bitrate caps. Where the encoder offers a single-pass constant rate factor (CRF) mode instead, as x264 does, a CRF between 23 and 28 provides a good quality-to-size balance for most portable media scenarios, as long as the bitrate caps are still enforced.
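As one possible way to run such a two-pass encode, the sketch below builds command lines for FFmpeg with the libx264 encoder and the native AAC encoder. The tool choice is an assumption for illustration, since IEC 62592 does not prescribe an encoder. The target bitrate, the buffer size (twice the cap), and the function name `two_pass_commands` are also assumptions. The 60-frame GOP corresponds to a 2-second keyframe interval at 30 fps.

```python
def two_pass_commands(src, dst, video_kbps=1500, max_kbps=5000, audio_kbps=128):
    """Return (first_pass, second_pass) argument lists for a two-pass x264 encode."""
    common = [
        "ffmpeg", "-y", "-i", src,
        "-c:v", "libx264", "-profile:v", "main", "-level:v", "3.1",
        "-b:v", f"{video_kbps}k",         # average target bitrate
        "-maxrate", f"{max_kbps}k",       # cap on the peak bitrate
        "-bufsize", f"{2 * max_kbps}k",   # assumed rate-control buffer size
        "-g", "60", "-refs", "4", "-pix_fmt", "yuv420p",
    ]
    # Pass 1 only gathers statistics: no audio, output discarded.
    first = common + ["-pass", "1", "-an", "-f", "null", "/dev/null"]
    second = common + ["-pass", "2", "-c:a", "aac", "-b:a", f"{audio_kbps}k",
                       "-ar", "48000", "-ac", "2", dst]
    return first, second

first, second = two_pass_commands("input.mov", "output.mp4")
print(" ".join(first))
print(" ".join(second))
```

The lists can be passed to `subprocess.run(cmd, check=True)` in order. On Windows, `NUL` would replace `/dev/null` in the first pass.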