Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
The standard CAN CSA ISO/IEC 13250-04 is the Canadian adoption of ISO/IEC 13250-4, part of the multi-part standard Information technology — Topic Maps. This part defines Topic Maps canonicalization, known as Canonical XTM (CXTM): a deterministic transformation that converts a topic map, typically serialized in XML Topic Maps (XTM), into a canonical form. The canonical form enables reliable digital signatures, hash verification, and lossless interchange between disparate systems by eliminating syntactic variation while preserving semantics.
| Part | Standard | Focus |
|---|---|---|
| 1 | ISO/IEC 13250-1 | Overview and basic concepts |
| 2 | ISO/IEC 13250-2 | Data model |
| 3 | ISO/IEC 13250-3 | XML syntax (XTM) |
| 4 | ISO/IEC 13250-4 | Canonicalization (this standard) |
| 5 | ISO/IEC 13250-5 | Reference model |
| 6 | ISO/IEC 13250-6 | Compact syntax (CTM) |
The scope of Part 4 specifically addresses the creation of a single, repeatable serialization for any valid Topic Maps dataset, regardless of the original syntax (XTM, CTM, Canonical XML, or others). This is achieved by applying a set of normalization rules that standardize whitespace, attribute ordering, namespace prefixes, entity references, and optional elements. The output is a unique byte stream that can be fed into a hash function (e.g., SHA-256) to produce a signature or fingerprint that is independent of authoring tool or encoding variations.
CAN CSA ISO/IEC 13250-04 prescribes a series of mandatory transformations that any conforming processor must apply to input Topic Maps data. The key requirements are organized into the following categories:
All XML content must be well-formed. Namespace declarations are serialized in a canonical order, and only the default prefix (tm or xtm) and known well-defined prefixes are retained. Redundant declarations are removed. The standard mandates that the XML declaration <?xml version="1.0" encoding="UTF-8"?> be present, and the document element must use the canonical namespace URI.
Attributes of each element are sorted lexicographically (by namespace URI + local name). CDATA sections are converted to normal character data with proper escaping. Whitespace inside attributes is normalized according to the xml:space rules, and empty elements are expanded to start/end tags. Optional elements such as <subjectIndicatorRef> are always written in a consistent form.
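The attribute ordering and empty-element expansion described above can be sketched in a few lines. This is a minimal illustration using Python's standard `xml.etree.ElementTree`, not the normative algorithm: the helper names `split_clark` and `canonicalize_attributes` are hypothetical, and the sort uses plain code-point comparison on the (namespace URI, local name) pair, which is an assumption about how "lexicographically" should be read.

```python
import xml.etree.ElementTree as ET


def split_clark(name):
    """Split an ElementTree name such as '{uri}local' into (uri, local)."""
    if name.startswith("{"):
        uri, local = name[1:].split("}", 1)
        return uri, local
    return "", name  # attribute without a namespace


def canonicalize_attributes(root):
    """Reorder every element's attributes by (namespace URI, local name), in place."""
    for node in root.iter():
        ordered = sorted(node.attrib.items(), key=lambda kv: split_clark(kv[0]))
        node.attrib.clear()
        node.attrib.update(ordered)  # dict insertion order drives serialization


root = ET.fromstring('<topic id="t1" b="2" a="1"><instanceOf/></topic>')
canonicalize_attributes(root)

# short_empty_elements=False writes <instanceOf></instanceOf> instead of <instanceOf/>
canonical = ET.tostring(root, encoding="unicode", short_empty_elements=False)
print(canonical)
```

Attributes without a namespace sort before namespaced ones here because the empty string compares lowest; a conforming processor should follow whatever ordering the standard itself specifies.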
Internal identifiers (e.g., id attributes) are not changed, but references via @href or @source must be resolved to absolute URIs if relative. The standard requires that subject identifiers be expressed as absolute IRIs. Fragment identifiers are preserved as-is.
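As a rough illustration of the reference-resolution step, the sketch below resolves relative references against a document base IRI with Python's standard `urllib.parse.urljoin`, which keeps the fragment identifier intact. The function names and the `example.org` addresses are hypothetical, and the base IRI is assumed to be known to the processor.

```python
from urllib.parse import urljoin, urlsplit


def resolve_reference(base_iri, reference):
    """Resolve a possibly relative reference against base_iri; fragments are kept."""
    return urljoin(base_iri, reference)


def is_absolute(iri):
    """An IRI is treated as absolute here if it carries a scheme such as 'http'."""
    return bool(urlsplit(iri).scheme)


base = "http://example.org/maps/music.xtm"  # hypothetical document address

same_doc = resolve_reference(base, "#puccini")
other_doc = resolve_reference(base, "opera.xtm#tosca")
already_abs = resolve_reference(base, "http://psi.example.org/composer")

print(same_doc)
print(other_doc)
print(already_abs)
```

References that are already absolute pass through unchanged, so the function can be applied uniformly to every `href` value.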
The canonical output orders topics alphabetically by their subject identifier (or subject locator), then by internal ID as a tie-breaker. Associations are sorted first by type, then by role player identifiers. Occurrences follow a comparable deterministic order. This ensures that two identical topic maps always produce the same byte sequence, regardless of the original document order.
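The topic ordering described above amounts to sorting on a composite key. The following sketch assumes a simplified, hypothetical `Topic` record and uses Python's default string comparison (by code point) as a stand-in for "alphabetically"; the handling of topics with several identifiers, or with none, is an assumption made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    # Hypothetical in-memory record, not a structure defined by the standard.
    item_id: str
    subject_identifiers: list = field(default_factory=list)
    subject_locators: list = field(default_factory=list)


def topic_sort_key(topic):
    """Key: smallest subject identifier (else locator), then internal ID as tie-breaker."""
    identity = sorted(topic.subject_identifiers) or sorted(topic.subject_locators)
    primary = identity[0] if identity else ""  # assumption: unidentified topics sort first
    return (primary, topic.item_id)


topics = [
    Topic("t3", ["http://psi.example.org/verdi"]),
    Topic("t2", ["http://psi.example.org/puccini"]),
    Topic("t1", ["http://psi.example.org/puccini"]),
]

ordered_ids = [t.item_id for t in sorted(topics, key=topic_sort_key)]
print(ordered_ids)
```

Because the key depends only on the topic's own data, any input order of the same topics yields the same output order, which is the property the canonical form relies on.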
| Normalization Step | Input Example | Canonical Output (simplified) |
|---|---|---|
| Attribute order | <xtm:topic id="t1" xmlns:xtm="..."> | <xtm:topic xmlns:xtm="..." id="t1"> |
| Namespace prefixes | <my:topic xmlns:my="..."> | <xtm:topic xmlns:xtm="..."> |
| Whitespace trimming | <baseName> Hello </baseName> | <baseName>Hello</baseName> |
To implement CAN CSA ISO/IEC 13250-04 in software, the following architectural approach is recommended:
Because the canonical form is used primarily for digital signatures and data integrity checks, performance may be critical. Implementations should stream output where possible, avoid building large strings in memory, and use efficient I/O for large topic maps (millions of topics). Memory-mapped files and incremental hashing (e.g., Java's MessageDigest.update()) are recommended.
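Incremental hashing can be sketched as follows. The example uses Python's `hashlib`, whose `update()` method plays the same role as the Java call mentioned above; the function name `digest_stream` and the chunk size are illustrative choices, not requirements of the standard.

```python
import hashlib
import io


def digest_stream(stream, chunk_size=64 * 1024):
    """Compute a SHA-256 hex digest by feeding the stream to the hasher in chunks."""
    hasher = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes object signals end of stream
            break
        hasher.update(chunk)
    return hasher.hexdigest()


# Stand-in for canonical output; a real processor would pass a file opened in binary mode.
canonical_bytes = b'<topic id="t1"></topic>' * 1000
fingerprint = digest_stream(io.BytesIO(canonical_bytes), chunk_size=4096)
print(fingerprint)
```

Only one chunk is held in memory at a time, so the same function works for canonical output that is far larger than available RAM.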
To claim compliance with CAN CSA ISO/IEC 13250-04, an implementation must pass the conformance test suite defined in the normative annexes. The key checks include:
Validation approach: Use a reference implementation (e.g., the Java-based canonicalizer provided in the ISO/IEC 13250-4 test suite) to generate a baseline hash. Compare your implementation’s output digest against this baseline. Many standards bodies offer free downloadable test vectors.
© 2026 Technical Standards Digest. All rights reserved. For informational purposes only; always refer to the official CAN CSA and ISO/IEC documents for complete specifications.