Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
ISO/IEC 29500-4 defines the Transitional Migration Features of the Office Open XML format family — a critical bridge between the legacy binary Office formats (DOC, XLS, PPT) and the modern OOXML standard. While Parts 1-3 define the pure OOXML architecture, Part 4 specifies how elements and attributes from the binary format world map onto OOXML constructs, enabling lossless conversion of billions of existing documents. For engineers building document converters, migration tools, or compatibility layers, Part 4 is the indispensable reference that explains why certain OOXML constructs exist and how to handle the corner cases that arise during format translation.
Part 4 provides detailed mapping rules for every significant binary format feature. These mappings are not merely syntactic — they define semantic equivalences that preserve the visual appearance, layout behavior, and application-level semantics of the original document. The following table illustrates key mapping categories.
| Binary Feature | OOXML Transitional Mapping | Strict Equivalent? |
|---|---|---|
| Auto-numbering fields (LISTNUM) | w:numPr with abstract numbering definitions | Yes, identical in Strict |
| Word 97-2003 document protection | w:documentProtection with legacy hash algorithm | No, replaced by modern cryptographic protection |
| Drawing objects (VML) | w:pict and v: (VML) namespace elements | No, replaced by DrawingML |
| OLE objects & ActiveX controls | r:oleObject and r:control with clsid attributes | Partial: clsid preserved; runtime behavior deprecated |
| Excel multi-sheet selections | x:sheetWindows in workbook.xml | Yes, identical |
| PowerPoint binary placeholders | p:ph with legacy index attributes | No, replaced by placeholder type enumeration |
| Embedded fonts (EOT) | w:embedFont with r:id to font part | Yes, identical |
The mapping rules in Part 4 are normative — compliant converters MUST produce the specified OOXML output for each recognized binary feature. This normative status is what distinguishes a correct converter from a heuristic one: the standard defines the ground truth for format translation.
A central design goal of Part 4 is round-trip fidelity: saving a document in OOXML and re-opening it in a legacy binary application should produce a result that is functionally equivalent (if not pixel-identical) to the original. Achieving this requires preserving legacy-specific data alongside native OOXML content, a strategy known as “parallel markup.”
Parallel markup is most visible in the handling of AutoShapes, text boxes, and WordArt. In Transitional documents, these features are represented twice: once as DrawingML (for OOXML-native applications) and once as VML (for binary-format applications). The mc:AlternateContent mechanism (Part 3) selects the appropriate representation based on the processor’s capabilities.
| Feature | Transitional Representation | Round-Trip Strategy |
|---|---|---|
| WordArt (text effects) | DrawingML + VML parallel markup | mc:AlternateContent: DrawingML primary, VML fallback |
| Text boxes (legacy) | v:textbox in VML + a:xfrm in DrawingML | Dual emission: processor selects based on capability |
| Chart formatting | c:chart with DrawingML styling | Single representation; binary chart styles remapped |
| Equation objects | m:oMath (OMML) + legacy OLE equation | mc:AlternateContent: OMML primary, OLE fallback |
| Form controls | w:fldData + w:ffData with binary field codes | Legacy field codes preserved for backward compatibility |
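The selection step behind parallel markup can be sketched in a few lines of Python. This is a simplified illustration, not a complete Markup Compatibility processor: the function name `select_branch`, the sample markup, and the flat prefix map (which ignores namespace scoping) are all assumptions made for the example. It returns the content of the first mc:Choice whose required namespaces the consumer understands, and otherwise the content of mc:Fallback.

```python
import io
import xml.etree.ElementTree as ET

MC_NS = "http://schemas.openxmlformats.org/markup-compatibility/2006"
WPS_NS = "http://schemas.microsoft.com/office/word/2010/wordprocessingShape"

# Hypothetical fragment: a DrawingML shape with a VML fallback.
SAMPLE = """<w:r xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
     xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
     xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape"
     xmlns:v="urn:schemas-microsoft-com:vml">
  <mc:AlternateContent>
    <mc:Choice Requires="wps"><w:drawing/></mc:Choice>
    <mc:Fallback><w:pict><v:shape/></w:pict></mc:Fallback>
  </mc:AlternateContent>
</w:r>"""


def select_branch(xml_text, understood_namespaces):
    """Return the child elements of the branch a consumer should process."""
    prefixes = {}  # simplification: one flat prefix -> URI map for the document
    root = None
    stream = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(stream, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            prefixes[prefix] = uri
        elif root is None:
            root = payload  # first start event is the document element

    alt_tag = f"{{{MC_NS}}}AlternateContent"
    alt = root if root.tag == alt_tag else root.find(f".//{alt_tag}")
    if alt is None:
        return []

    # Requires lists namespace prefixes; every one must be understood.
    for choice in alt.findall(f"{{{MC_NS}}}Choice"):
        required = choice.get("Requires", "").split()
        if required and all(prefixes.get(p) in understood_namespaces for p in required):
            return list(choice)

    fallback = alt.find(f"{{{MC_NS}}}Fallback")
    return list(fallback) if fallback is not None else []


def local_names(elements):
    """Strip the namespace part of each tag for readable output."""
    return [el.tag.split("}", 1)[-1] for el in elements]


if __name__ == "__main__":
    print(local_names(select_branch(SAMPLE, {WPS_NS})))  # modern consumer
    print(local_names(select_branch(SAMPLE, set())))     # legacy consumer
```

A consumer that declares the `wps` namespace as understood receives the DrawingML branch; one that declares nothing receives the VML fallback.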
Part 4 explicitly marks certain features as “transitional only” — they are retained for backward compatibility but deprecated for new development. These include VML graphics, legacy field codes (e.g., w:fldCode with PRIVATE), binary document protection algorithms (MD2, MD4, SHA1-42), and the w:subDoc element for master/subdocument relationships.
For engineering teams, the practical implication is that Transitional reading code must support a superset of OOXML elements, while Transitional writing code should prefer the modern equivalent whenever possible. The standard provides deprecation annotations that guide implementors toward future-proof choices without breaking existing content.
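One way a writer might audit its own output for transitional-only constructs is to count elements that fall in legacy namespaces or match a short deny list. The sketch below is illustrative only: `find_legacy_markup` and the sample fragment are hypothetical, and the deny list covers just the VML namespaces and the w:subDoc element named above, not the full set of transitional-only features.

```python
from collections import Counter
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Namespaces whose elements are treated as legacy in this sketch.
LEGACY_NAMESPACES = {
    "urn:schemas-microsoft-com:vml": "VML",
    "urn:schemas-microsoft-com:office:office": "VML (Office extensions)",
}

# Individual elements flagged by name (assumed list, deliberately short).
LEGACY_ELEMENTS = {(W_NS, "subDoc"): "w:subDoc"}

# Hypothetical fragment containing a VML text box and a subdocument link.
SAMPLE = (
    '<w:body xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" '
    'xmlns:v="urn:schemas-microsoft-com:vml">'
    "<w:p><w:r><w:pict><v:shape><v:textbox/></v:shape></w:pict></w:r></w:p>"
    "<w:subDoc/>"
    "</w:body>"
)


def find_legacy_markup(xml_text):
    """Count elements that this sketch classifies as transitional-only."""
    counts = Counter()
    for element in ET.fromstring(xml_text).iter():
        if not element.tag.startswith("{"):
            continue  # element without a namespace
        namespace, local = element.tag[1:].split("}", 1)
        if namespace in LEGACY_NAMESPACES:
            counts[LEGACY_NAMESPACES[namespace]] += 1
        elif (namespace, local) in LEGACY_ELEMENTS:
            counts[LEGACY_ELEMENTS[(namespace, local)]] += 1
    return counts


if __name__ == "__main__":
    for label, count in sorted(find_legacy_markup(SAMPLE).items()):
        print(f"{label}: {count}")
```

A report like this could feed a writer-side policy such as "warn on any legacy markup in newly authored documents" while the reader continues to accept it.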
Building a robust binary-to-OOXML converter requires careful navigation of Part 4’s mapping tables. The standard organizes mappings by feature category and provides conformance criteria for each. A practical engineering approach is to implement converters in three tiers:
Tier 1 (Core): Paragraphs, runs, tables, lists, sections, headers/footers, images, hyperlinks — the features that cover 95% of real-world documents. These mappings are well-defined and relatively stable across binary format versions.
Tier 2 (Extended): Tracked changes, comments, bookmarks, fields (with field code preservation), mail merge, embedded objects, charts. These require deeper parsing of the binary format’s complex record structures.
Tier 3 (Legacy): VML drawings, OLE objects, ActiveX controls, legacy forms, macro-enabled documents (DOCM). These are the primary source of conversion failures and require fallback strategies when direct mapping is not feasible.
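The three tiers can be sketched as a handler registry in which only Tier 3 features are allowed to degrade to a placeholder. Everything here is hypothetical scaffolding for illustration: the record dictionaries stand in for parsed binary records, and the feature names, `register` decorator, and placeholder format are assumptions rather than anything defined by the standard.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple
from xml.sax.saxutils import escape


@dataclass
class Result:
    feature: str
    tier: int
    output: str
    used_fallback: bool


Handler = Callable[[dict], str]
REGISTRY: Dict[str, Tuple[int, Handler]] = {}


def register(feature: str, tier: int):
    """Decorator that files a handler under a feature name and tier."""
    def decorator(fn: Handler) -> Handler:
        REGISTRY[feature] = (tier, fn)
        return fn
    return decorator


@register("paragraph", tier=1)
def convert_paragraph(record: dict) -> str:
    return f"<w:p><w:r><w:t>{escape(record['text'])}</w:t></w:r></w:p>"


@register("activex", tier=3)
def convert_activex(record: dict) -> str:
    # No direct mapping in this sketch; signals that a fallback is needed.
    raise NotImplementedError("activex")


def convert(records: List[dict]) -> List[Result]:
    results = []
    for record in records:
        feature = record["feature"]
        # Unknown features are treated as Tier 3 so they degrade gracefully.
        tier, handler = REGISTRY.get(feature, (3, None))
        try:
            if handler is None:
                raise NotImplementedError(feature)
            results.append(Result(feature, tier, handler(record), False))
        except NotImplementedError:
            if tier < 3:
                raise  # Tier 1 and 2 failures are bugs, not fallbacks
            placeholder = f"<!-- unconverted legacy feature: {escape(feature)} -->"
            results.append(Result(feature, tier, placeholder, True))
    return results


if __name__ == "__main__":
    for result in convert([{"feature": "paragraph", "text": "a < b"},
                           {"feature": "activex"}]):
        print(result)
```

Keeping the fallback decision in one place makes it straightforward to report how many Tier 3 features in a batch were converted directly and how many were replaced by placeholders.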
A: No. Word 2003 does not support OOXML natively. However, a Transitional-conformance document can be opened by Word 2003 with the Microsoft Office Compatibility Pack installed. Strict documents need an application that implements the Strict conformance class; in Microsoft Office, full read and write support arrived with Office 2013.
A: No — Strict conformance explicitly prohibits Transitional-only elements and attributes. Validators that check for Strict conformance (such as the OOXML Conformance Test Suite) will flag any Transitional-only markup as a conformance failure.
A: VBA macros are stored in a separate project part within the OOXML package (vbaProject.bin). Part 4 does not define the macro format itself; it only specifies how the macro project is packaged and referenced via relationships. The actual macro binary format remains unchanged from the legacy Office format, which is why converting a legacy DOC file to DOCM can preserve macros without recompilation.
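Because the macro project is an ordinary part inside the ZIP-based package, a migration tool can check for it without parsing any macro code. The following is a minimal sketch that assumes the part is named vbaProject.bin, as described above; a stricter check would resolve the part through the package relationships instead of matching on the file name. The helper names `find_vba_parts` and `make_package` are hypothetical.

```python
import io
import zipfile


def find_vba_parts(package_bytes):
    """Return the names of parts that look like a VBA project."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return [
            name
            for name in package.namelist()
            if name.rsplit("/", 1)[-1].lower() == "vbaproject.bin"
        ]


def make_package(part_names):
    """Build a tiny in-memory ZIP with empty parts, for demonstration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name in part_names:
            package.writestr(name, b"")
    return buffer.getvalue()


if __name__ == "__main__":
    with_macros = make_package(["word/document.xml", "word/vbaProject.bin"])
    without_macros = make_package(["word/document.xml"])
    print(find_vba_parts(with_macros))
    print(find_vba_parts(without_macros))
```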
A: The standard recommends a phased approach: (1) batch-convert documents to Transitional OOXML with mc:AlternateContent parallel markup for critical features, (2) validate conversion fidelity using automated comparison tools, (3) transition authoring workflows to Strict OOXML for new documents, and (4) archive transitional documents with conversion manifests for auditability. Phase 2 is the most resource-intensive and should be prioritized for high-value document collections.