Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
ISO/IEC 29341-17-12 specifies the AVTransport service, which is the central playback control component in the UPnP AV architecture. While ConnectionManager handles the establishment of media connections, AVTransport is responsible for controlling the actual playback of media content — managing the transport state machine, URI-based content selection, seeking, speed control, and track-based playback navigation. This service is the primary interface through which control points implement the familiar play-pause-stop-seek user experience.
The AVTransport service maintains a formal transport state machine that governs the device’s playback behavior. The core playback states are STOPPED, PLAYING, PAUSED_PLAYBACK, TRANSITIONING, and NO_MEDIA_PRESENT; devices that support recording add RECORDING and PAUSED_RECORDING. Each state has well-defined legal transitions triggered by specific actions. For example, from STOPPED, Play moves the transport to PLAYING, and removing the media moves it to NO_MEDIA_PRESENT. From PLAYING, the transport can move to PAUSED_PLAYBACK (via Pause), STOPPED (via Stop), or TRANSITIONING (when changing tracks with Next or Previous). Control points must respect these state machine semantics to ensure predictable device behavior.
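The transitions described above can be captured as a lookup table. The following is a minimal Python sketch, not the standard's normative state chart: the names `ALLOWED_TRANSITIONS`, `is_allowed`, and `apply_action` are hypothetical, and the table covers only the transitions mentioned in this section.

```python
from enum import Enum


class TransportState(Enum):
    STOPPED = "STOPPED"
    PLAYING = "PLAYING"
    PAUSED_PLAYBACK = "PAUSED_PLAYBACK"
    TRANSITIONING = "TRANSITIONING"
    NO_MEDIA_PRESENT = "NO_MEDIA_PRESENT"


# Hypothetical table: (current state, action) -> next state.
# Only the transitions named in the text are listed; a real device has more.
ALLOWED_TRANSITIONS = {
    (TransportState.STOPPED, "Play"): TransportState.PLAYING,
    (TransportState.PAUSED_PLAYBACK, "Play"): TransportState.PLAYING,
    (TransportState.PLAYING, "Pause"): TransportState.PAUSED_PLAYBACK,
    (TransportState.PLAYING, "Stop"): TransportState.STOPPED,
    (TransportState.PAUSED_PLAYBACK, "Stop"): TransportState.STOPPED,
    (TransportState.PLAYING, "Next"): TransportState.TRANSITIONING,
    (TransportState.PLAYING, "Previous"): TransportState.TRANSITIONING,
}


def is_allowed(state: TransportState, action: str) -> bool:
    """Return True if the action is listed for the given state."""
    return (state, action) in ALLOWED_TRANSITIONS


def apply_action(state: TransportState, action: str) -> TransportState:
    """Return the next state, or raise ValueError for an unlisted transition."""
    if not is_allowed(state, action):
        raise ValueError(f"{action} is not available in state {state.value}")
    return ALLOWED_TRANSITIONS[(state, action)]
```

A control point could call `is_allowed` before sending an action to avoid a round trip that the device would reject anyway.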
The AVTransport service uses a URI-based content model. The control point selects the content to be played by calling SetAVTransportURI with two arguments: CurrentURI, which points to the media resource, and CurrentURIMetaData, a DIDL-Lite string describing the content. The optional SetNextAVTransportURI action lets the control point queue a following URI, which enables gapless playback between consecutive tracks: the device can buffer the next track while the current one is still playing, eliminating the silence between songs or video chapters.
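As an illustration of what a control point sends, the sketch below builds the SOAP request for SetAVTransportURI. It assumes version 1 of the service type (`urn:schemas-upnp-org:service:AVTransport:1`); the function name `build_set_uri_request` and the example URL are hypothetical, and the code only constructs the headers and body without sending them.

```python
from xml.sax.saxutils import escape

# Assumed service type; a device may advertise a later version.
SERVICE_TYPE = "urn:schemas-upnp-org:service:AVTransport:1"


def build_set_uri_request(uri: str, didl_metadata: str = "", instance_id: int = 0):
    """Return (headers, body) for a SetAVTransportURI SOAP call."""
    body = (
        '<?xml version="1.0" encoding="utf-8"?>'
        '<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" '
        's:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">'
        "<s:Body>"
        f'<u:SetAVTransportURI xmlns:u="{SERVICE_TYPE}">'
        f"<InstanceID>{instance_id}</InstanceID>"
        # escape() protects '&', '<' and '>' in the URI and the DIDL-Lite string.
        f"<CurrentURI>{escape(uri)}</CurrentURI>"
        f"<CurrentURIMetaData>{escape(didl_metadata)}</CurrentURIMetaData>"
        "</u:SetAVTransportURI>"
        "</s:Body></s:Envelope>"
    )
    headers = {
        "Content-Type": 'text/xml; charset="utf-8"',
        "SOAPACTION": f'"{SERVICE_TYPE}#SetAVTransportURI"',
    }
    return headers, body


headers, body = build_set_uri_request("http://example.com/a.mp3?x=1&y=2")
```

The DIDL-Lite metadata travels as escaped text inside the argument element, which is why it passes through `escape` rather than being embedded as raw XML.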
Playback navigation actions include Next and Previous for track skipping, Seek for positional navigation, and Play with a Speed argument for variable-speed playback. The Seek action takes a Unit argument that selects the seek mode: TRACK_NR for a specific track in a multi-track resource, ABS_TIME for a time position measured from the start of the whole media, REL_TIME for a time position measured from the start of the current track, ABS_COUNT for a position expressed as a dimensionless counter value, and X_DLNA_REL_BYTE, a DLNA extension, for byte-offset seeking. Frame-based seeking uses the separate FRAME unit. The GetPositionInfo action returns the current playback position in several of these formats at once (RelTime, AbsTime, RelCount, and AbsCount), allowing control points to display position information without additional conversion.
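Time-based seek targets and the time values returned by GetPositionInfo are strings rather than numbers. The helpers below are a small sketch that assumes the simple `H:MM:SS[.fraction]` form and treats the placeholder value `NOT_IMPLEMENTED` as "no position available"; the function names are hypothetical.

```python
from typing import Optional


def format_seek_target(seconds: int) -> str:
    """Format whole seconds as an H:MM:SS string for a REL_TIME or ABS_TIME seek."""
    if seconds < 0:
        raise ValueError("seek target must not be negative")
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def parse_position(value: str) -> Optional[float]:
    """Convert an H:MM:SS[.fraction] string to seconds, or None if unavailable."""
    if value == "NOT_IMPLEMENTED":
        return None
    hours, minutes, secs = value.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(secs)
```

For example, `format_seek_target(3725)` yields `"1:02:05"`, and `parse_position("0:01:30.500")` yields `90.5`.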
| Action | State Transition | Description | Common Error Codes |
|---|---|---|---|
| SetAVTransportURI | NO_MEDIA_PRESENT -> STOPPED; other states normally unchanged | Set the URI of the media to be played | 714 (Illegal MIME-type), 716 (Resource not found) |
| Play | STOPPED/PAUSED_PLAYBACK -> PLAYING | Start or resume playback at specified speed | 701 (Transition not available), 702 (No contents), 717 (Play speed not supported) |
| Pause | PLAYING -> PAUSED_PLAYBACK | Temporarily suspend playback | 701 (Transition not available) if not PLAYING |
| Stop | PLAYING/PAUSED_PLAYBACK -> STOPPED | Stop playback and reset position | 705 (Transport is locked) |
| Seek | PLAYING/PAUSED_PLAYBACK -> TRANSITIONING | Seek to specified position in the media | 710 (Seek mode not supported), 711 (Illegal seek target) |
| Next | PLAYING/PAUSED_PLAYBACK/STOPPED -> TRANSITIONING | Skip to the next track | 701 (Transition not available) if there is no next track |
| Previous | PLAYING/PAUSED_PLAYBACK/STOPPED -> TRANSITIONING | Go to previous track or restart current | 701 (Transition not available) at first track |
| GetPositionInfo | Any (no state change) | Retrieve current playback position | 718 (Invalid InstanceID) |
| GetTransportInfo | Any (no state change) | Retrieve transport state and status | 718 (Invalid InstanceID) |
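When an action fails, the device returns a SOAP fault whose detail carries a UPnPError element with the numeric code. The sketch below extracts that code with the standard library; the helper name `parse_upnp_error` and the embedded sample response are illustrative, and the sample uses the generic UPnP code 402 (Invalid Args) rather than an AVTransport-specific one.

```python
import xml.etree.ElementTree as ET

UPNP_CONTROL_NS = "urn:schemas-upnp-org:control-1-0"

# Illustrative fault document, not captured from a real device.
SAMPLE_FAULT = """<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
  <s:Body>
    <s:Fault>
      <faultcode>s:Client</faultcode>
      <faultstring>UPnPError</faultstring>
      <detail>
        <UPnPError xmlns="urn:schemas-upnp-org:control-1-0">
          <errorCode>402</errorCode>
          <errorDescription>Invalid Args</errorDescription>
        </UPnPError>
      </detail>
    </s:Fault>
  </s:Body>
</s:Envelope>"""


def parse_upnp_error(response_xml: str):
    """Return (code, description) from a SOAP fault, or None if there is none."""
    root = ET.fromstring(response_xml)
    error = root.find(f".//{{{UPNP_CONTROL_NS}}}UPnPError")
    if error is None:
        return None
    code = error.findtext(f"{{{UPNP_CONTROL_NS}}}errorCode")
    if code is None:
        return None
    description = error.findtext(f"{{{UPNP_CONTROL_NS}}}errorDescription", default="")
    return int(code.strip()), description.strip()
```

A control point can map the returned code against the table above to decide whether to retry, refresh its view of the transport state, or surface the failure to the user.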
Implementing a robust AVTransport service requires handling several asynchronous complexities. The most significant is keeping the transport state machine consistent with the underlying media decoder pipeline. Decoder initialization, buffer pre-rolling, and audio-video synchronization all take real time, and the device must reflect these phases accurately through its state variables. For example, after SetAVTransportURI the device should report TransportState as STOPPED; when Play is invoked it can report TRANSITIONING while the decoder pre-rolls, and PLAYING once sufficient data has been buffered.
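One way to keep the reported state honest during pre-roll is to drive state changes from the decoder's own progress. The following `asyncio` sketch is a simplified model: the `Renderer` class is hypothetical, the 10 ms sleep stands in for real decoder initialization and buffering, and reporting TRANSITIONING during pre-roll is a design choice of this sketch.

```python
import asyncio


class Renderer:
    """Toy renderer that records every transport state it reports."""

    def __init__(self):
        self.uri = None
        self.state = "NO_MEDIA_PRESENT"
        self.history = [self.state]

    def _set_state(self, new_state: str) -> None:
        self.state = new_state
        self.history.append(new_state)  # a real device would also send an event

    def set_uri(self, uri: str) -> None:
        self.uri = uri
        self._set_state("STOPPED")

    async def _preroll(self) -> None:
        # Stand-in for decoder initialization and buffer pre-rolling.
        await asyncio.sleep(0.01)

    async def play(self) -> None:
        if self.uri is None:
            raise RuntimeError("no media has been set")
        self._set_state("TRANSITIONING")  # visible to control points while buffering
        await self._preroll()
        self._set_state("PLAYING")
```

Running `set_uri` followed by `asyncio.run(renderer.play())` leaves the history as NO_MEDIA_PRESENT, STOPPED, TRANSITIONING, PLAYING, which is the sequence a polling or subscribed control point would observe.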
Multiple AVTransport instances, each identified by an InstanceID, allow a single device to manage several independent playback sessions simultaneously. Each instance has its own state machine, URI, position, and transport settings. This is essential for multi-room audio systems, picture-in-picture video, or recording devices that need to monitor playback while recording. The ConnectionManager’s PrepareForConnection action associates a connection with a specific AVTransport instance and returns its identifier as AVTransportID; the control point then passes that value as the InstanceID argument in all subsequent AVTransport action invocations.
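Per-instance isolation can be modeled as a registry keyed by the instance identifier. This is a minimal sketch with hypothetical names (`TransportInstance`, `InstanceRegistry`); it assumes instance 0 always exists as a default and that further instances are allocated when a connection is prepared.

```python
from dataclasses import dataclass


@dataclass
class TransportInstance:
    """State held separately for each playback session."""
    state: str = "NO_MEDIA_PRESENT"
    uri: str = ""
    rel_time: str = "0:00:00"


class InstanceRegistry:
    def __init__(self):
        # Assumption of this sketch: a default instance with ID 0 is always present.
        self._instances = {0: TransportInstance()}
        self._next_id = 1

    def allocate(self) -> int:
        """Create a new instance and return its ID."""
        instance_id = self._next_id
        self._next_id += 1
        self._instances[instance_id] = TransportInstance()
        return instance_id

    def release(self, instance_id: int) -> None:
        """Drop an allocated instance; the default instance is never removed."""
        if instance_id != 0:
            self._instances.pop(instance_id, None)

    def has(self, instance_id: int) -> bool:
        return instance_id in self._instances

    def get(self, instance_id: int) -> TransportInstance:
        """Look up an instance; an unknown ID raises KeyError."""
        return self._instances[instance_id]
```

Every action handler would start with `registry.get(instance_id)` so that a change to one session can never leak into another.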
The Play action’s Speed argument deserves careful engineering attention. Speed is passed as a string: "1" denotes normal playback, values greater than 1 indicate fast forward (e.g., 2, 4, 8, 16, 32), and values between 0 and 1 (exclusive), typically written as fractions such as 1/2, indicate slow motion. Negative values indicate reverse playback. The device advertises the speeds it supports through the allowed values of the TransportPlaySpeed state variable in its service description, and control points should check that list before attempting non-normal-speed playback. Engineers should implement speed transitions gracefully, maintaining audio-video synchronization at non-standard speeds whenever the decoder pipeline supports it.
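Because speeds may be written as fractions, comparing them as strings is fragile. The sketch below uses `fractions.Fraction` to classify a Speed value and to check it against an advertised list; the function names and category labels are hypothetical.

```python
from fractions import Fraction


def classify_speed(speed: str) -> str:
    """Classify a Speed string such as "1", "1/2", "8" or "-2"."""
    value = Fraction(speed)
    if value == 0:
        raise ValueError("a play speed of 0 is not meaningful")
    if value < 0:
        return "reverse"
    if value == 1:
        return "normal"
    return "fast_forward" if value > 1 else "slow_motion"


def is_supported(speed: str, allowed_speeds) -> bool:
    """Compare numerically, so that "1/2" matches an advertised "2/4"."""
    return Fraction(speed) in {Fraction(item) for item in allowed_speeds}
```

A control point could call `is_supported` with the list it read from the device before offering a fast-forward button in its user interface.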
Audio synchronization (lip sync) is a subtle but critical implementation detail. When the AVTransport service manages both audio and video streams, it must ensure that the audio output remains synchronized with the video frames. The standard defines an AVSyncOffset state variable that allows control points to adjust the synchronization offset (in milliseconds) to compensate for varying processing delays in the audio and video paths. A positive offset delays audio relative to video; a negative offset advances audio.