Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
ISO/IEC 26300-2 is Part 2 of the Open Document Format for Office Applications (ODF) v1.3 standard. It specifies the package format that encapsulates ODF content, metadata, and associated resources in a single file. The schema part of the standard (Part 3 in the OASIS ODF 1.3 numbering) defines the XML content, that is, the document structure expressed as XML files inside the package. Part 2 defines how those XML files are assembled into a ZIP-based container with manifest entries, metadata streams, and digital signature support. Understanding the package layer is essential for any engineer working on document processing, archiving, or interoperability.
The ODF package format is built on the ZIP archive specification, with additional constraints that ensure predictable processing across implementations:
| Package Element | Required | Location | Purpose |
|---|---|---|---|
| mimetype entry | Yes | First entry in ZIP, no compression | Media type identification (e.g., application/vnd.oasis.opendocument.text) |
| META-INF/manifest.xml | Yes | Standard XML file in META-INF directory | File listing, media types, encryption info, and version metadata |
| META-INF signature files (file names containing "signatures", e.g., META-INF/documentsignatures.xml) | No | Optional files in META-INF directory | XML digital signatures for document integrity and authentication |
| Content files | Yes | Root or subdirectories | content.xml, styles.xml, meta.xml, settings.xml, and embedded resources |
| Encryption data | No | Per-file entries inside META-INF/manifest.xml | Algorithm, key derivation parameters, and initialization vector for password-protected documents |
The mimetype entry is the most constrained element in the package. It must be the very first entry in the ZIP archive, stored without compression and without an extra field in its ZIP header, and it must contain only the ASCII media type string, with no trailing newline or padding. Under these constraints the layout is fixed: the bytes "PK" sit at offset 0, the file name "mimetype" at offset 30, and the media type itself at offset 38. File-type detection tools (e.g., the Unix file command) can therefore identify an ODF document by reading roughly the first 50-80 bytes of the file, without parsing the full ZIP structure, which keeps detection cheap on large document archives.
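The fixed-offset layout can be illustrated with a minimal sketch using Python's standard zipfile module. The helper names `write_minimal_package` and `sniff_mimetype` are hypothetical, and the sketch assumes the mimetype entry is written stored, first, and without an extra field; it is not a full package writer.

```python
import io
import zipfile
from typing import Optional

ODT_MIME = "application/vnd.oasis.opendocument.text"


def write_minimal_package(mimetype: str, files: dict) -> bytes:
    """Build a ZIP whose first entry is an uncompressed 'mimetype' file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # A bare ZipInfo has no extra field; ZIP_STORED means no compression.
        zf.writestr(zipfile.ZipInfo("mimetype"), mimetype.encode("ascii"),
                    compress_type=zipfile.ZIP_STORED)
        for name, data in files.items():
            zf.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def sniff_mimetype(package: bytes) -> Optional[str]:
    """Read the media type from the first local file header, without
    parsing the central directory. Returns None if the layout differs."""
    if package[:4] != b"PK\x03\x04":
        return None
    method = int.from_bytes(package[8:10], "little")      # 0 = stored
    size = int.from_bytes(package[18:22], "little")       # compressed size
    name_len = int.from_bytes(package[26:28], "little")
    extra_len = int.from_bytes(package[28:30], "little")
    name = package[30:30 + name_len]
    if method != 0 or extra_len != 0 or name != b"mimetype":
        return None
    start = 30 + name_len                                  # 38 for 'mimetype'
    return package[start:start + size].decode("ascii")
```

A call such as `sniff_mimetype(write_minimal_package(ODT_MIME, {"content.xml": b"<x/>"}))` reads only the leading bytes of the archive, which is the same idea that magic-number based detectors rely on.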
The manifest.xml file serves as the package’s table of contents. Every file in the package must be listed in the manifest along with its media type and, optionally, its encryption information. The exceptions are the mimetype entry and the manifest itself; signature files under META-INF are normally left out as well. The root element of the manifest also carries the ODF version of the package.
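As a rough illustration of the manifest's role, the sketch below parses META-INF/manifest.xml and reports ZIP entries that have no manifest entry. The function names are hypothetical, and treating everything under META-INF/ as exempt from listing is an assumption made for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


def manifest_entries(package: bytes) -> dict:
    """Map each manifest full-path to its declared media type."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("META-INF/manifest.xml"))
    entries = {}
    for entry in root.iter(f"{{{MANIFEST_NS}}}file-entry"):
        path = entry.get(f"{{{MANIFEST_NS}}}full-path")
        entries[path] = entry.get(f"{{{MANIFEST_NS}}}media-type")
    return entries


def unlisted_files(package: bytes) -> list:
    """Return ZIP entries that are missing from the manifest."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        names = zf.namelist()
    listed = set(manifest_entries(package))
    return sorted(
        name for name in names
        if name not in listed
        and name != "mimetype"
        and not name.startswith("META-INF/")   # assumption: exempt from listing
        and not name.endswith("/")             # skip directory entries
    )
```

A validator could flag any non-empty result from `unlisted_files` as a packaging error, or merely warn, depending on how strict it needs to be.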
ISO/IEC 26300-2 specifies both envelope-based and manifest-based digital signature mechanisms. Envelope signatures cover the entire package as a binary blob, while manifest signatures cover individual files within the package, identified by their manifest entry paths.
Envelope signatures are simpler but less flexible: because one signature covers the whole package, any modification to any file invalidates it. This is appropriate for archival use cases where document integrity is paramount. The drawback is that an application cannot add metadata (e.g., a print timestamp) to the package without invalidating the original signature.
Manifest-based signatures solve this flexibility problem by signing individual files independently. An application can add a new file to the package (e.g., an extended metadata stream) and sign only that new file, without affecting the signatures on existing content files. The manifest tracks which files are signed and which signature covers each file.
The digital signature format uses XML Signature Syntax and Processing (XML-DSig, W3C Recommendation), with support for X.509 certificates, HMAC-based keys, and referenced signature formats. ODF v1.3 added support for long-term validation (LTV) profiles, enabling signatures that remain verifiable decades after the signing certificate has expired. This is a critical requirement for electronic records management.
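To see which package files a given signature covers, a tool can list the `ds:Reference` URIs inside each XML-DSig `ds:Signature` element. The sketch below does only that; it does not verify digests or certificates, and the function name `signed_references` is hypothetical.

```python
import xml.etree.ElementTree as ET

DSIG_NS = "http://www.w3.org/2000/09/xmldsig#"


def signed_references(signature_xml: bytes) -> list:
    """Return one list of referenced URIs per ds:Signature element."""
    root = ET.fromstring(signature_xml)
    result = []
    for signature in root.iter(f"{{{DSIG_NS}}}Signature"):
        signed_info = signature.find(f"{{{DSIG_NS}}}SignedInfo")
        if signed_info is None:
            continue  # malformed signature: nothing is actually signed
        uris = [
            ref.get("URI", "")
            for ref in signed_info.findall(f"{{{DSIG_NS}}}Reference")
        ]
        result.append(uris)
    return result
```

Comparing the returned URIs with the package's file list shows which files were added or left unsigned after signing. Actual verification would need a dedicated XML-DSig library.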
Building robust ODF processing tools requires careful attention to package-level details that are often overlooked by developers focused on the XML content:
Streaming vs. random-access processing: For large ODF packages (e.g., presentations with embedded video, or spreadsheets with thousands of images), random-access ZIP reading is essential. The byte offset of each entry is recorded in the ZIP central directory at the end of the archive, not in manifest.xml, so a reader can seek directly to a specific file without decompressing the entire package. The manifest then tells the reader what each entry is (its media type and any encryption data). When performance matters, use a ZIP implementation that supports entry-level random access rather than decompressing sequentially.
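A small sketch of entry-level random access with Python's standard zipfile module follows. zipfile reads the ZIP central directory when the archive is opened and seeks to the requested entry, so only that entry is decompressed. The helper names are hypothetical.

```python
import io
import zipfile


def read_entry(package_file, name: str) -> bytes:
    """Read a single entry; other entries are never decompressed."""
    with zipfile.ZipFile(package_file) as zf:
        info = zf.getinfo(name)          # lookup in the central directory
        with zf.open(info) as stream:
            return stream.read()


def entry_offsets(package_file) -> dict:
    """Map entry names to the offset of their local file header."""
    with zipfile.ZipFile(package_file) as zf:
        return {info.filename: info.header_offset for info in zf.infolist()}
```

For very large embedded resources, the stream returned by `zf.open` can be read in chunks rather than with a single `read()` call, which keeps memory use bounded.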
Encryption handling: ODF v1.3 supports both package-level encryption (the entire content is encrypted) and file-level encryption (individual files within the package are encrypted). File-level encryption is more flexible but requires careful key management. The manifest.xml stores the encryption algorithm, key derivation method (PBKDF2 with configurable iteration count), and initialization vector for each encrypted file. Implementations must support at least AES-128 in CBC mode, with SHA-256 as the hash algorithm for key derivation.
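The key derivation step can be sketched with Python's hashlib alone. This example assumes the password is first hashed with SHA-256 and the digest is then passed to PBKDF2 with HMAC-SHA-1. Both choices, the parameter names, and the default key size are assumptions for illustration: a real reader should take the algorithms, salt, iteration count, and key size from the encryption data of the specific manifest entry.

```python
import hashlib


def derive_key(password: str, salt: bytes, iterations: int,
               key_size: int = 32, start_hash: str = "sha256",
               prf_hash: str = "sha1") -> bytes:
    """Derive a per-file key from a password (illustrative only).

    start_hash: hash applied to the UTF-8 password before PBKDF2 (assumed).
    prf_hash:   hash used inside PBKDF2's HMAC (assumed).
    """
    start_key = hashlib.new(start_hash, password.encode("utf-8")).digest()
    return hashlib.pbkdf2_hmac(prf_hash, start_key, salt, iterations,
                               dklen=key_size)
```

The derived key would then be passed, together with the initialization vector from the manifest, to an AES-CBC implementation from a cryptography library. That decryption step is left out here because the standard library does not provide AES.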
Backward compatibility: ODF v1.3 packages are designed to be readable by ODF v1.2 implementations, provided the v1.2 implementation ignores unknown elements in the manifest and content files. However, v1.3-specific features (new metadata elements, enhanced digital signatures, additional media types) are silently ignored by v1.2 processors. Applications should always declare the minimum ODF version in the manifest that covers the features they use.
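A reader can check the declared version before deciding how to handle a package. The sketch below reads the `manifest:version` attribute from the manifest root and compares it with the highest version the reader supports. The function names and the "1.2" default are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional

MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


def manifest_version(manifest_xml: bytes) -> Optional[str]:
    """Return the manifest:version attribute of the root element, if any."""
    root = ET.fromstring(manifest_xml)
    return root.get(f"{{{MANIFEST_NS}}}version")


def version_tuple(version: str) -> tuple:
    """Turn '1.3' into (1, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def is_newer_than_supported(manifest_xml: bytes, supported: str = "1.2") -> bool:
    """True if the package declares a version above what the reader supports."""
    declared = manifest_version(manifest_xml)
    if declared is None:
        return False  # no declaration: treat as an older package
    return version_tuple(declared) > version_tuple(supported)
```

A v1.2 reader could use such a check to warn the user that some features may be ignored, rather than dropping them silently.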