Understanding CAN CSA Z243.302-91 (R1999) and CAN CGSB-200.9-91: Character Code Structure and Extension Techniques for Data Interchange

A comprehensive guide to the Canadian standard for character code extension, escape sequences, and coded character set designation

Scope and Purpose

CAN CSA Z243.302-91 (R1999) and CAN CGSB-200.9-91 together define the Canadian national standard for character code structure and code extension techniques used in data interchange. This dual-standard document, reaffirmed in 1999, is technically equivalent to ISO 2022:1994 and establishes a framework for representing and switching among multiple coded character sets within a single data stream. It is widely adopted in Canadian government, telecommunications, and information processing environments to ensure consistent handling of text in English, French, and other languages requiring accented characters and special symbols.

The standard specifies:

  • The structure of 7-bit and 8-bit character codes built on the ISO 646 base (aligned with ITU-T International Alphabet No. 5).
  • Escape sequences for designating and invoking character sets into four graphic registers (G0 through G3).
  • Control functions for shifting among registers (LS0, LS1, LS2, SS2, etc.).
  • Rules for code extension to accommodate multiple character sets, including multibyte sets such as Kanji.
  • Mandatory escape sequence formats and their registration with Canadian registries.

Key Point: This standard is the backbone of legacy character encoding infrastructure in Canada. Even though Unicode adoption has grown, many Canadian government systems still rely on these code extension mechanisms for backward compatibility.

Technical Requirements and Code Extension Mechanisms

Structure of the 8-Bit Code

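The standard lays out the 8-bit code as a table of 16 columns and 16 rows, with each position written in column/row notation (for example, ESC is 1/11). The table is divided into four areas: columns 0–1 hold the C0 control functions, columns 2–7 form the GL graphic area, columns 8–9 hold the C1 control functions, and columns 10–15 form the GR graphic area. Code extension allows a sender to change dynamically which graphic characters occupy the GL and GR areas.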
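
The ISO 2022 table layout (16 columns by 16 rows, with C0 in columns 0–1, GL in columns 2–7, C1 in columns 8–9, and GR in columns 10–15) can be illustrated with a small sketch. The function names `column_row` and `code_area` are hypothetical helpers chosen for this example, not names taken from the standard.

```python
def column_row(byte: int) -> str:
    """Return column/row notation for a byte, e.g. 0x1B -> '1/11'."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single 8-bit value")
    return f"{byte >> 4}/{byte & 0x0F}"


def code_area(byte: int) -> str:
    """Name the area of the 8-bit code table that a byte falls in."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single 8-bit value")
    column = byte >> 4
    if column <= 1:
        return "C0"   # control functions such as ESC, SO, SI
    if column <= 7:
        return "GL"   # left-hand graphic area (includes SPACE at 2/0 and DELETE at 7/15)
    if column <= 9:
        return "C1"   # 8-bit control functions such as SS2 (0x8E) and SS3 (0x8F)
    return "GR"       # right-hand graphic area


for b in (0x1B, 0x41, 0x8E, 0xE9):
    print(f"0x{b:02X} = {column_row(b)} in {code_area(b)}")
```

Column/row notation is simply the high and low nibbles of the byte written in decimal, which is why the sketch needs nothing more than a shift and a mask.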

Escape Sequences

Escape sequences begin with the ESC control character (1/11), followed by zero or more intermediate bytes (2/0 to 2/15) and a final byte (3/0 to 7/14). In a designation sequence, the intermediate bytes identify the type of set (94-character set, 96-character set, multibyte, etc.) and the graphic register it is designated to (G0, G1, G2, G3), while the final byte identifies the character set itself.
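
As a minimal sketch of that byte layout, the following parser splits one escape sequence into its intermediate bytes and final byte. It assumes the ISO 2022 ranges (intermediates 2/0 to 2/15, finals 3/0 to 7/14); the function name `parse_escape` and its return shape are illustrative choices, not part of the standard.

```python
ESC = 0x1B


def parse_escape(data: bytes, start: int = 0):
    """Parse one escape sequence beginning at data[start].

    Returns (intermediates, final, next_index).
    Raises ValueError if the sequence is malformed or truncated.
    """
    if start >= len(data) or data[start] != ESC:
        raise ValueError("no ESC at the start position")
    i = start + 1
    # Intermediate bytes occupy column 2 (0x20..0x2F); there may be none.
    while i < len(data) and 0x20 <= data[i] <= 0x2F:
        i += 1
    if i >= len(data):
        raise ValueError("truncated escape sequence")
    final = data[i]
    # Final bytes occupy 3/0..7/14 (0x30..0x7E).
    if not 0x30 <= final <= 0x7E:
        raise ValueError(f"invalid final byte 0x{final:02X}")
    return data[start + 1:i], final, i + 1


print(parse_escape(b"\x1b(B"))   # one intermediate byte, final 'B'
print(parse_escape(b"\x1bN"))    # no intermediate bytes, final 'N'
```

Returning the index of the next unread byte lets a caller resume scanning the stream immediately after the sequence.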

Table 1 — Common Escape Sequences for Designation and Invocation
| Function | Coded representation (column/row notation, hex in parentheses) | Effect |
|---|---|---|
| Designate G0 with ISO 646 IRV | ESC 2/8 4/2 (0x1B 0x28 0x42) | Selects the ASCII graphic set for G0. |
| Designate G1 with a supplementary Latin 94-character set | ESC 2/9 F (0x1B 0x29 F), where F is the final byte registered for the set | Selects a Latin supplement (e.g., Canadian French characters) for G1. |
| Designate G2 with a 96-character set | ESC 2/14 F (0x1B 0x2E F), where F is the final byte registered for the set | Selects a 96-character set (e.g., Greek) for G2. |
| Shift to G1 (LS1) | SO (0x0E) | Invokes the graphic set designated to G1. |
| Shift to G0 (LS0) | SI (0x0F) | Returns to the graphic set designated to G0. |
| Single shift to G2 (SS2) | ESC 4/14 (0x1B 0x4E) in 7-bit mode, or 0x8E in 8-bit mode | Temporarily invokes the G2 set for the next character only. |
| Single shift to G3 (SS3) | ESC 4/15 (0x1B 0x4F) in 7-bit mode, or 0x8F in 8-bit mode | Temporarily invokes the G3 set for the next character only. |
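
The designation sequences follow a regular pattern in ISO 2022: one intermediate byte selects both the set size and the target register, and a leading 2/4 marks a multibyte set. The sketch below builds such sequences. The helper name `designate` and its parameters are assumptions made for illustration, and the final byte must be looked up in the applicable register of coded character sets rather than guessed.

```python
ESC = b"\x1b"

# Intermediate byte per target register, by set size (ISO 2022 assignments).
_INTERMEDIATE_94 = {0: b"(", 1: b")", 2: b"*", 3: b"+"}   # 2/8 .. 2/11
_INTERMEDIATE_96 = {1: b"-", 2: b".", 3: b"/"}            # 2/13 .. 2/15 (no G0 form)


def designate(register: int, final: bytes, *, size: int = 94,
              multibyte: bool = False) -> bytes:
    """Build the escape sequence that designates a set to G0..G3."""
    if size == 94:
        table = _INTERMEDIATE_94
    elif size == 96:
        table = _INTERMEDIATE_96
    else:
        raise ValueError("size must be 94 or 96")
    if register not in table:
        raise ValueError(f"G{register} is not a valid target for a {size}-character set")
    if len(final) != 1 or not 0x30 <= final[0] <= 0x7E:
        raise ValueError("final must be a single byte in 3/0..7/14")
    # Multibyte sets add the intermediate byte 2/4 ('$') after ESC.
    # (Three historic multibyte G0 sequences omit the 2/8; not modelled here.)
    prefix = b"$" if multibyte else b""
    return ESC + prefix + table[register] + final


print(designate(0, b"B"))             # ASCII to G0
print(designate(2, b"F", size=96))    # a 96-character set with final 'F' to G2
```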

Designation vs. Invocation

Designation binds a character set to one of the four graphic registers using an escape sequence. Invocation (shift functions) tells the receiving system which register to use for the next character(s). This separation allows a system to switch among many character sets without re-sending long escape sequences.
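
The separation between designation and invocation can be sketched as a small 7-bit decoder that keeps two pieces of state: which set is bound to each register, and which register is currently invoked. This is an illustration under stated assumptions, not the state machine of the standard: the two-entry `CHARSETS` registry is hypothetical, G0 is assumed to start as ASCII, and only SO, SI, SS2, and SS3 are handled.

```python
ESC, SO, SI = 0x1B, 0x0E, 0x0F

# Hypothetical mini-registry: (set size, final byte) -> GL byte to character.
CHARSETS = {
    (94, ord("B")): lambda b: chr(b),          # ASCII
    (96, ord("A")): lambda b: chr(b | 0x80),   # right half of ISO 8859-1
}

# Intermediate byte -> (target register, set size).
DESIGNATORS = {
    ord("("): (0, 94), ord(")"): (1, 94), ord("*"): (2, 94), ord("+"): (3, 94),
    ord("-"): (1, 96), ord("."): (2, 96), ord("/"): (3, 96),
}


def decode_7bit(data: bytes) -> str:
    g = [CHARSETS[(94, ord("B"))], None, None, None]  # assumption: G0 starts as ASCII
    gl = 0          # register currently invoked (locking shift state)
    single = None   # register selected by a pending single shift, if any
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b == ESC:
            if i + 1 >= len(data):
                raise ValueError("truncated escape sequence")
            nxt = data[i + 1]
            if nxt in DESIGNATORS:                 # designation: bind a set to a register
                if i + 2 >= len(data):
                    raise ValueError("truncated designation")
                reg, size = DESIGNATORS[nxt]
                key = (size, data[i + 2])
                if key not in CHARSETS:
                    raise ValueError("unknown character set")
                g[reg] = CHARSETS[key]
                i += 3
            elif nxt == ord("N"):                  # SS2: next character from G2
                single, i = 2, i + 2
            elif nxt == ord("O"):                  # SS3: next character from G3
                single, i = 3, i + 2
            else:
                raise ValueError("unsupported escape sequence")
            continue
        if b == SO:                                # invocation: G1 until further notice
            gl = 1
        elif b == SI:                              # invocation: back to G0
            gl = 0
        elif b < 0x20:
            out.append(chr(b))                     # other C0 controls pass through
        elif b > 0x7F:
            raise ValueError("8-bit byte in a 7-bit stream")
        else:
            reg = gl if single is None else single
            single = None                          # a single shift covers one character
            if g[reg] is None:
                raise ValueError(f"G{reg} used before designation")
            out.append(g[reg](b))
        i += 1
    return "".join(out)


print(decode_7bit(b"\x1b-Aa\x0ei\x0fb"))   # designate once, then shift in and out
```

Note that the designation in the usage line is sent once, after which the one-byte SO and SI controls are enough to move between the two sets.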

Caution: Improper nesting of escape sequences or ambiguous shifts can cause loss of synchronization between sender and receiver. Implementations must strictly follow the state machine described in Clause 6 of CAN CSA Z243.302-91.

Implementation Considerations and Practical Applications

The standard has been widely implemented in:

  • Canadian government messaging systems (e.g., Canadian Forces message handling).
  • Terminal emulators and printer drivers for bilingual text (English/French).
  • Legacy database export/import formats that rely on code extension rather than Unicode.
  • Electronic data interchange (EDI) specifications that reference the standard for character repertoire.

Interaction with Modern Encodings

While Unicode (UTF-8) is now preferred for new systems, the standard remains relevant for data migration and interoperability with older equipment. Many Canadian data archives contain streams encoded per this standard, and decoding tools must handle the escape sequences correctly. The standard also defines a mechanism for embedding character set identification that can be mapped to ISO/IEC 10646 (UCS).
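
As an illustration of migrating escape-sequence data to Unicode, Python's standard library ships decoders for several ISO 2022 profiles. The sketch below uses the `iso2022_jp_2` codec, which is a different profile (ISO-2022-JP-2, RFC 1554) and not an implementation of the Canadian standard; it is used here only because it supports designating the ISO 8859-1 supplementary set to G2 and reaching it with SS2. Streams from Canadian systems may use other sets and would need a decoder configured for them.

```python
# ESC . A  designates the right half of ISO 8859-1 to G2;
# ESC N    (SS2) takes the next character from G2.
raw = b"caf\x1b.A\x1bNi"

# Decode the code-extension stream to a Unicode string.
text = raw.decode("iso2022_jp_2")

# Re-encode for storage in a modern system.
utf8 = text.encode("utf-8")

print(text)
print(utf8)
```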

Best Practice: When designing a new system that must interoperate with existing Canadian government systems, support for the escape sequences and shift functions defined in CAN CSA Z243.302-91 is essential. Provide a configuration option to enable or disable code extension processing.

Compliance and Certification Notes

CAN CSA Z243.302-91 (R1999) and CAN CGSB-200.9-91 are mandatory for federal government procurements that require character code interchange. Compliance is verified by testing against the escape sequence registry maintained by the Canadian General Standards Board (CGSB).

  • Conformance classes: Implementations must declare which optional features (e.g., multiple shift functions, non-ISO/IEC 646 sets) are supported.
  • Registration: Character sets used with the standard must be registered with CGSB to ensure unique escape sequences.
  • Interoperability testing: The standard recommends round-trip testing for escape sequence interpretation and state machine transitions.
  • Updating: The standard was reaffirmed in 1999 and is still referenced in several Canadian federal standards (e.g., Treasury Board directives on data encoding).

Critical: In 1999, the CGSB amended certain escape sequences to align with ISO 2022:1994. Systems claiming compliance must implement the amendment; failure to do so may result in security vulnerabilities where escape sequences are misinterpreted.

For current certification, refer to CGSB data communication standards and to the ISO International Register of Coded Character Sets to Be Used with Escape Sequences (ISO-IR), maintained under ISO/IEC JTC 1/SC 2. The Canadian adoption adds specific provisions for French-language accents and for Canadian Aboriginal Syllabics.

Q: What is the relationship between CAN CSA Z243.302-91 (R1999) and ISO 2022?
A: CAN CSA Z243.302-91 is the Canadian national adoption of ISO 2022:1994. The CGSB-200.9-91 designation adds Canadian-specific requirements, such as registration procedures and character sets for French and Aboriginal languages. The two documents are used together.
Q: Is this standard still relevant now that Unicode is dominant?
A: Yes, for legacy systems, data archives, and government applications that have not migrated to Unicode. Also, understanding the code extension mechanisms helps in converting older data to modern encodings. The standard is also referenced in some industry-specific EDI formats.
Q: Does the standard mandate a specific character set?
A: No. It defines a framework for using multiple character sets. The base set is usually the ISO 646 IRV (ASCII), but other sets can be designated. The Canadian registration includes mandatory sets for English, French, and optional sets for other languages.
Q: Are there security concerns with code extension techniques?
A: Yes. Malformed or malicious escape sequences can change the interpretation of subsequent characters. Implementations should validate escape sequences against an allowlist and use state machines that follow the standard strictly. CAN CSA Z243.302-91 includes security considerations in its appendix.
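
One way to apply the validation advice above is to scan a stream before decoding and report every escape sequence that is not explicitly permitted. The sketch below is a minimal example of that idea. The contents of `ALLOWED_SEQUENCES` and the function name `find_disallowed` are hypothetical; a real deployment would derive its list from the character sets it has agreed to support.

```python
ESC = 0x1B

# Hypothetical allowlist: ASCII to G0, ISO 8859-1 right half to G1 or G2, and SS2.
ALLOWED_SEQUENCES = frozenset({b"\x1b(B", b"\x1b-A", b"\x1b.A", b"\x1bN"})


def find_disallowed(data: bytes, allowed=ALLOWED_SEQUENCES) -> list[tuple[int, bytes]]:
    """Return (offset, sequence) for each escape sequence not in the allowlist."""
    problems = []
    i = 0
    while i < len(data):
        if data[i] != ESC:
            i += 1
            continue
        j = i + 1
        # Skip intermediate bytes (2/0..2/15).
        while j < len(data) and 0x20 <= data[j] <= 0x2F:
            j += 1
        # Consume the final byte (3/0..7/14) if one is present; a truncated or
        # malformed sequence is kept as-is and will fail the allowlist check.
        if j < len(data) and 0x30 <= data[j] <= 0x7E:
            j += 1
        seq = data[i:j]
        if seq not in allowed:
            problems.append((i, seq))
        i = j
    return problems


print(find_disallowed(b"caf\x1b.A\x1bNi"))   # every sequence is permitted
print(find_disallowed(b"ok\x1b$Bxx"))        # reports the unexpected designation
```

Rejecting the stream when the returned list is non-empty keeps unexpected designations from silently changing how later bytes are interpreted.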

This article is prepared for informational purposes and reflects the technical content of CAN CSA Z243.302-91 (R1999) and CAN CGSB-200.9-91 as understood at time of writing (2026). Always consult the latest official documents from the Canadian General Standards Board and the Canadian Standards Association for full compliance requirements.
