
An Important Step Toward the Implementation of IDN Top-Level Domains: New Versions of IDNA Protocol Revision Proposals Posted

Revised versions of the technical standards that define the implementation of IDNs (collectively called IDNA) were recently released via the IETF editors. These revisions are an important step toward the delegation of IDN top-level domains.

An informal expert panel, working as what the IETF calls a "design team," evaluated the experience gained from implementing the previous version of IDNA since its introduction in 2003 and identified several key areas for future work. These key areas were described in several documents that triggered a formal revision of the IDNA protocol. New versions of the Internet-Drafts proposing the revisions to the IDNA protocol have now been released.

The core components in the revision effort include:

  • a definition of valid IDN labels based on an inclusion model, which clearly defines which characters will be available for IDNs (the current model is exclusion-based) and recognizes the implications of Unicode's handling of various scripts for their use in IDNs;
  • elimination of confusing and non-reversible character mappings;
  • fixing a right-to-left error in Stringprep (a profile used to prepare IDNs in the IDNA protocol), and eliminating Unicode version dependencies, thereby permitting more scripts to be used in IDNs now and in the future.
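The difference between an exclusion-based and an inclusion-based model, mentioned in the first item above, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the character set, the list of Unicode general categories, and the function names are hypothetical and are not the actual IDNA tables or rules. It only shows why a character that nobody anticipated passes an exclusion check by default but fails an inclusion check.

```python
import unicodedata

# Hypothetical toy data for illustration only; NOT the real IDNA tables.
# Exclusion model: anything not explicitly listed here is accepted.
TOY_EXCLUDED_CHARS = {" ", "\u00a0", "\u200b"}
# Inclusion model: only characters in these general categories are accepted.
TOY_INCLUDED_CATEGORIES = {"Ll", "Lo", "Lm", "Mn", "Mc", "Nd"}


def valid_by_exclusion(label: str) -> bool:
    """Accept a label unless it contains an explicitly excluded character."""
    return bool(label) and all(ch not in TOY_EXCLUDED_CHARS for ch in label)


def valid_by_inclusion(label: str) -> bool:
    """Accept a label only if every character is a hyphen or explicitly allowed."""
    return bool(label) and all(
        ch == "-" or unicodedata.category(ch) in TOY_INCLUDED_CATEGORIES
        for ch in label
    )


if __name__ == "__main__":
    for label in ["example", "a\u2665b", "a b"]:
        print(repr(label), valid_by_exclusion(label), valid_by_inclusion(label))
```

In this sketch the symbol U+2665 is accepted by the exclusion check simply because no one listed it, while the inclusion check rejects it because it falls outside the allowed categories.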

The issues with the current IDN model that led to these revisions are discussed in RFC 4690. This RFC discusses character issues that arise where the same script is used in different languages, issues related to languages that can be written in more than one script, issues involving bi-directional text, and issues concerning visually confusable characters.

ICANN urges the Internet technical community to take part in the final stages of this development work by reviewing this documentation and commenting constructively on it. The review is moving forward in accordance with standard IETF processes. The intention is to finalize this work and publish the new IDNA standard within the next two months.

Internationalized Domain Name (IDN): IDNs are domain names that include characters used in the local representation of languages that are not written with the twenty-six letters of the basic Latin alphabet "a-z". An IDN can contain Latin letters with diacritical marks, as required by many European languages, or may consist of characters from non-Latin scripts such as Arabic or Chinese. Many languages also use other types of digits than the European "0-9". The basic Latin alphabet together with the European-Arabic digits are, for the purpose of domain names, termed "ASCII characters" (ASCII = American Standard Code for Information Interchange). These are also included in the broader range of "Unicode characters" that provides the basis for IDNs.

The "hostname rule" requires that all domain names of the type under consideration here are stored in the DNS using only the ASCII characters listed above, with the one further addition of the hyphen "-". The Unicode form of an IDN therefore requires special encoding before it is entered into the DNS.

The following terminology is used when distinguishing between these forms: A domain name consists of a series of "labels" (separated by "dots"). The ASCII form of an IDN label is termed an "A-label". All operations defined in the DNS protocol use A-labels exclusively. The Unicode form, which a user expects to be displayed, is termed a "U-label". The difference may be illustrated with the Hindi word for "test", परीका, appearing here as a U-label would (in the Devanagari script). A special form of "ASCII compatible encoding" (abbreviated ACE) is applied to this to produce the corresponding A-label: xn--11b5bs1di.

A domain name that only includes ASCII letters, digits, and hyphens is termed an "LDH label". Although the definitions of A-labels and LDH-labels overlap, a name consisting exclusively of LDH labels, such as "", is not an IDN.
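
The relationship between the U-label and the A-label in the glossary example above can be reproduced with a minimal Python sketch. It relies on Python's built-in "punycode" codec for the ASCII compatible encoding and adds the "xn--" prefix by hand. The helper names to_a_label and to_u_label are hypothetical, and the sketch performs only the raw encoding step: it does none of the character validation or mapping that the IDNA protocol requires, so it should not be treated as an IDNA implementation.

```python
ACE_PREFIX = "xn--"  # prefix that marks an ASCII compatible encoding (ACE) label


def to_a_label(u_label: str) -> str:
    """Encode a Unicode label with Punycode and add the ACE prefix.

    Illustrative helper only: no IDNA validation or mapping is applied.
    """
    if u_label.isascii():
        return u_label  # an all-ASCII label needs no encoding
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Decode an ACE-prefixed label back to its Unicode form."""
    if a_label.lower().startswith(ACE_PREFIX):
        return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return a_label  # no ACE prefix, so return the label unchanged


if __name__ == "__main__":
    u_label = "\u092a\u0930\u0940\u0915\u093e"  # the Devanagari example from the glossary
    a_label = to_a_label(u_label)
    print(a_label)              # prints xn--11b5bs1di
    print(to_u_label(a_label) == u_label)  # prints True
```

The round trip shows why the DNS can store and look up only the A-label while applications display the U-label: each form can be recovered from the other.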