ISO/IEC 14496-15:2016 – Carriage of NAL Unit Structured Video in ISO Base Media File Format

A comprehensive technical guide to storing AVC, HEVC, and scalable video bitstreams within MP4 containers

ISO/IEC 14496-15:2016 (commonly referred to as MPEG‑4 Part 15) is the third edition of Part 15 of the MPEG‑4 suite. It specifies how video bitstreams built on the network abstraction layer (NAL) unit concept, notably H.264/AVC, H.265/HEVC, and their scalable and multiview extensions, are stored in the ISO base media file format (ISOBMFF). This article looks at the standard’s scope, technical requirements, implementation highlights, and compliance notes.

Scope and Purpose

ISO/IEC 14496-15:2016 defines the storage syntax and semantics for NAL‑unit‑based compressed video within the family of ISO media files (e.g., MP4, M4V, 3GP). It addresses the need for a container that preserves the logical structure of the video stream while enabling efficient random access, streaming, and editing. The standard covers:

  • Sample entries for AVC, SVC, MVC, HEVC, SHVC, and MV‑HEVC.
  • Configuration records that carry video, sequence, and picture parameter sets (VPS/SPS/PPS).
  • Mapping of NAL units to samples and sample groups for temporal, spatial, and quality scalability.
  • Support for multiple coded video sequences in a single track.

The 2016 edition extends the prior version by formally integrating HEVC and its scalable extensions, cleaning up previously ambiguous mapping rules, and adding provisions for high‑dynamic‑range (HDR) video metadata.

Technical Requirements

Every video track conforming to this standard must contain a sample entry that unambiguously identifies the codec and its configuration. The central data structure is the decoder configuration record (AVCDecoderConfigurationRecord for AVC, HEVCDecoderConfigurationRecord for HEVC), which stores the initialization information needed to decode the video stream.

Codec | Sample Entry (4CC) | Configuration Record | Required NAL Unit Types
H.264/AVC | avc1 | AVCDecoderConfigurationRecord | SPS, PPS, (IDR) slices
Scalable AVC (SVC) | avc1 or svc1 | AVCDecoderConfigurationRecord (with SVC extension) | SPS, PPS, sub‑bitstream SPS
Multiview AVC (MVC) | mvc1 | AVCDecoderConfigurationRecord (with MVC extension) | SPS, PPS, view‑level parameter sets
HEVC (H.265) | hev1 / hvc1 | HEVCDecoderConfigurationRecord | VPS, SPS, PPS, slices
Scalable HEVC (SHVC) | hev1 / hvc1 (or shv1) | HEVCDecoderConfigurationRecord (with scalability info) | VPS, SPS, PPS, layer‑specific parameter sets
Multiview HEVC (MV‑HEVC) | mhv1 | HEVCDecoderConfigurationRecord (with view info) | VPS, SPS, PPS, view parameter sets

The table above summarises the principal sample entries and their configuration records. Each sample entry sits in the stsd box (inside stbl) of the media track, and the configuration record is carried in the avcC or hvcC box within that sample entry, as appropriate. For avc1 and hvc1 tracks, every parameter set used by the bitstream must be included in the configuration record; for hev1 tracks, parameter sets may instead arrive in band. A parameter set that is available from neither place causes decoding failure.
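
As an illustration of what a reader does with a configuration record, here is a minimal Python sketch that parses the body of an avcC box. It assumes the box header has already been stripped and that the record follows the widely implemented AVCDecoderConfigurationRecord layout (version, profile, compatibility, level, length size, then counted SPS and PPS entries). The names AvcConfig and parse_avcc are hypothetical, and the optional High‑profile extension fields that may follow the PPS list are deliberately ignored.

```python
import struct
from dataclasses import dataclass, field
from typing import List


@dataclass
class AvcConfig:
    profile: int
    compatibility: int
    level: int
    nal_length_size: int  # bytes used by each NAL length prefix in samples
    sps: List[bytes] = field(default_factory=list)
    pps: List[bytes] = field(default_factory=list)


def parse_avcc(payload: bytes) -> AvcConfig:
    """Parse the body of an avcC box (box header already removed)."""
    if len(payload) < 7:
        raise ValueError("avcC payload too short")
    version, profile, compat, level, b4, b5 = struct.unpack_from(">6B", payload, 0)
    if version != 1:
        raise ValueError(f"unsupported configurationVersion {version}")
    # Low 2 bits of byte 4 hold lengthSizeMinusOne.
    cfg = AvcConfig(profile, compat, level, (b4 & 0x03) + 1)

    def read_sets(count: int, pos: int, out: List[bytes]) -> int:
        # Each parameter set is stored as a 16-bit length followed by the NAL unit.
        for _ in range(count):
            if pos + 2 > len(payload):
                raise ValueError("truncated parameter set length")
            (size,) = struct.unpack_from(">H", payload, pos)
            pos += 2
            if pos + size > len(payload):
                raise ValueError("parameter set runs past end of avcC")
            out.append(payload[pos:pos + size])
            pos += size
        return pos

    pos = read_sets(b5 & 0x1F, 6, cfg.sps)  # low 5 bits of byte 5: SPS count
    if pos >= len(payload):
        raise ValueError("missing PPS count")
    read_sets(payload[pos], pos + 1, cfg.pps)  # full byte: PPS count
    return cfg
```

A muxer can run the same parser over its own output as a sanity check: if the SPS or PPS list comes back empty for an avc1 track, the file is likely to fail in players that rely on the configuration record alone.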

Implementation Highlights

Tip: When writing an ISOBMFF muxer for HEVC, use the hvc1 sample entry rather than hev1 when all parameter sets are stored in the configuration record; hev1 signals that parameter sets may also appear in band. For broad compatibility, note that many players expect hvc1 for self‑contained files and hev1 for streaming scenarios.
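
One way to sanity‑check the choice between hvc1 and hev1 is to inspect the parameter set arrays in the hvcC body. The Python sketch below assumes the commonly implemented record layout, in which a 22‑byte fixed part is followed by numOfArrays and each array begins with an array_completeness bit and a 6‑bit NAL unit type. The function names hvcc_arrays and parameter_sets_complete are hypothetical, and the fixed part is skipped rather than decoded.

```python
import struct
from typing import Dict, List, Tuple

VPS, SPS, PPS = 32, 33, 34  # HEVC nal_unit_type values for parameter sets


def hvcc_arrays(payload: bytes) -> Dict[int, Tuple[bool, List[bytes]]]:
    """Return {nal_unit_type: (array_completeness, [nal, ...])} from an hvcC body."""
    if len(payload) < 23:
        raise ValueError("hvcC payload too short")
    pos = 22  # assumed size of the fixed part preceding numOfArrays
    num_arrays = payload[pos]
    pos += 1
    arrays: Dict[int, Tuple[bool, List[bytes]]] = {}
    for _ in range(num_arrays):
        if pos + 3 > len(payload):
            raise ValueError("truncated array header")
        head, count = struct.unpack_from(">BH", payload, pos)
        pos += 3
        nals: List[bytes] = []
        for _ in range(count):
            if pos + 2 > len(payload):
                raise ValueError("truncated NAL length")
            (size,) = struct.unpack_from(">H", payload, pos)
            pos += 2
            if pos + size > len(payload):
                raise ValueError("NAL unit runs past end of hvcC")
            nals.append(payload[pos:pos + size])
            pos += size
        # Top bit: array_completeness; low 6 bits: NAL unit type.
        arrays[head & 0x3F] = (bool(head & 0x80), nals)
    return arrays


def parameter_sets_complete(payload: bytes) -> bool:
    """True if VPS, SPS and PPS arrays are all present, non-empty and marked complete."""
    arrays = hvcc_arrays(payload)
    return all(t in arrays and arrays[t][0] and arrays[t][1] for t in (VPS, SPS, PPS))
```

If parameter_sets_complete returns False for a file labelled hvc1, the muxer has probably left some parameter sets in band only, which is the situation hev1 is meant to signal.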

Key implementation considerations include:

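  • NAL unit length field: Each NAL unit in a sample is preceded by a length field instead of an Annex B start code. The size of that field is given by lengthSizeMinusOne in the configuration record (1, 2, or 4 bytes; typically 4).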
  • Sample grouping: Use the tscl group type to label temporal layers; this helps a player select a decodable subset of frames for trick modes.
  • Parameter set redundancy: The standard allows in‑band parameter sets (inside samples) for sample entries such as hev1. Readers should be prepared to update the decoder configuration on the fly when a sample carries new parameter sets.
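  • Track relationships: For scalable codecs stored in several tracks, base and enhancement layers are linked through track references in the tref box (for example sbas), not through the order of trak boxes in moov, which carries no meaning.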
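
The length‑prefixed sample layout from the first item can be sketched in a few lines of Python. The helper names iter_nal_units and to_annex_b are hypothetical; length_size is the length field size in bytes taken from the configuration record, and the conversion to Annex B start codes is included only because many decoders expect that form.

```python
from typing import Iterator


def iter_nal_units(sample: bytes, length_size: int) -> Iterator[bytes]:
    """Yield each NAL unit from a length-prefixed ISOBMFF sample."""
    if length_size not in (1, 2, 4):
        raise ValueError("length_size must be 1, 2 or 4")
    pos = 0
    while pos < len(sample):
        if pos + length_size > len(sample):
            raise ValueError("truncated NAL length prefix")
        size = int.from_bytes(sample[pos:pos + length_size], "big")
        pos += length_size
        if pos + size > len(sample):
            raise ValueError("NAL unit runs past end of sample")
        yield sample[pos:pos + size]
        pos += size


def to_annex_b(sample: bytes, length_size: int) -> bytes:
    """Replace each length prefix with a 4-byte Annex B start code."""
    return b"".join(b"\x00\x00\x00\x01" + nal
                    for nal in iter_nal_units(sample, length_size))
```

Validating every length against the sample size, as done here, also catches the common mistake of reading a file with the wrong length field size.
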
Caution: Interleaving samples from multiple layers incorrectly can cause buffer underflow or decoder reset. Always store samples in decoding order, and use the ctts box to signal the composition time offsets from which presentation timestamps are derived.
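
To make the relationship between decoding order and presentation time concrete, the following Python sketch expands run‑length stts and ctts entries into per‑sample timestamps. It assumes both tables have already been read into lists of (sample_count, value) pairs; the function name sample_times and the example values are illustrative only.

```python
from typing import List, Tuple


def sample_times(stts: List[Tuple[int, int]],
                 ctts: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return (decode_time, composition_time) per sample, in decoding order.

    stts: (sample_count, sample_delta) runs
    ctts: (sample_count, sample_offset) runs; missing entries count as offset 0
    """
    deltas = [delta for count, delta in stts for _ in range(count)]
    offsets = [offset for count, offset in ctts for _ in range(count)]
    if len(offsets) > len(deltas):
        raise ValueError("ctts describes more samples than stts")
    offsets += [0] * (len(deltas) - len(offsets))
    times = []
    dts = 0
    for delta, offset in zip(deltas, offsets):
        times.append((dts, dts + offset))  # composition time = decode time + offset
        dts += delta
    return times


# Four samples stored in decoding order I, P, B, B with a duration of 1000 each.
times = sample_times([(4, 1000)], [(1, 1000), (1, 3000), (2, 0)])
display_order = sorted(range(len(times)), key=lambda i: times[i][1])
```

In this example the samples stay in decoding order in the file, while sorting by composition time yields the display order I, B, B, P.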

Compliance Notes

To claim conformance with ISO/IEC 14496-15:2016, a file or reader must satisfy a set of interoperability conditions:

  • File type and brands: The ftyp box must appear as early as possible in the file and must declare a compatible brand (e.g., mp42, isom, iso5, or iso6).
  • Configuration record correctness: The AVCDecoderConfigurationRecord or HEVCDecoderConfigurationRecord must contain the exact SPS/PPS/VPS used by the video samples. Corrupted or outdated parameter sets are a leading cause of playback failure.
  • Sample contents: Each sample holds one access unit, so all NAL units that belong to a given video frame in a track must be placed in a single sample.
  • Codec brand in ftyp: For HEVC tracks, the recommended brand is hevc, while for AVC it is avc1 or mp42.
Conformance testing: The conformance software available from MPEG (e.g., the ISO/IEC 14496-5 reference software) can validate the structure of an ISOBMFF file against Part 15 requirements. Many commercial media analyzers also check these rules.
Common pitfall: Some encoders produce avcC boxes with spurious “extra” bytes. The syntax defined in ISO/IEC 14496-15:2016 fixes the parameter set counts and the length field in front of each parameter set; extra bytes, wrong counts, or SPS and PPS written in the wrong order may cause strict parsers to reject the file.
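
The brand check from the list above is easy to automate. The Python sketch below reads an ftyp box from the start of a file and returns its major brand, minor version, and compatible brands. It assumes the ordinary 32‑bit box size form (the 64‑bit and "extends to end of file" forms are not handled), and the name read_ftyp is hypothetical.

```python
import struct
from typing import List, Tuple


def read_ftyp(data: bytes) -> Tuple[str, int, List[str]]:
    """Parse an ftyp box located at the start of data."""
    if len(data) < 16:
        raise ValueError("not enough data for an ftyp box")
    size, box_type = struct.unpack_from(">I4s", data, 0)
    if box_type != b"ftyp":
        raise ValueError("file does not start with an ftyp box")
    # 8-byte box header + major brand + minor version = 16 bytes minimum.
    if size < 16 or size > len(data) or (size - 16) % 4:
        raise ValueError("malformed ftyp box")
    major, minor = struct.unpack_from(">4sI", data, 8)
    brands = [data[i:i + 4].decode("latin-1") for i in range(16, size, 4)]
    return major.decode("latin-1"), minor, brands
```

A validator could then test whether any of the returned brands appears in the set it accepts, for example {"mp42", "isom", "iso5", "iso6"}.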

Frequently Asked Questions

Q: Does ISO/IEC 14496-15:2016 also support AV1 or VP9?
A: No, this standard is exclusively for NAL‑unit‑based codec families (AVC/HEVC and their extensions). AV1 and VP9 are handled under separate specifications (e.g., AV1 in ISOBMFF is defined by the Alliance for Open Media).
Q: What is the difference between hev1 and hvc1 sample entries?
A: In hvc1, all parameter sets needed for decoding are expected to be present in the hvcC configuration record. In hev1, some parameter sets may be only sent in‑band (within the sample data). For offline files, hvc1 is recommended; for live streaming, hev1 is more common.
Q: Can a single MP4 file contain both AVC and HEVC tracks?
A: Yes, ISOBMFF supports multiple tracks. However, each track must comply with the sample entry rules for its respective codec, and the brands declared in ftyp should cover the requirements of every track in the file.
Q: Is the 2016 edition backwards compatible with files created under the 2014 edition?
A: Generally yes, because the new edition mainly adds features (e.g., clarified handling of SHVC and MV‑HEVC). Exceptions involve deprecated fields in the HEVCDecoderConfigurationRecord (e.g., the parallelismType field was removed). Older files do not exercise the new rules, but an updated parser must handle both old and new files.
