Guidelines for the Implementation of Internationalized Domain Names | Version 1.0

Note: These Guidelines have been developed collaboratively by ICANN and leading Internationalized Domain Name (IDN) registries. Version 1.0 of these Guidelines was published on 20 June 2003, coinciding with the launch of IDN deployment under the IETF's Proposed Standard reflected in RFCs 3490, 3491, and 3492.

The implementation approach set forth in these Guidelines was endorsed by the ICANN Board on 27 March 2003. The Guidelines are supported by the .cn, .info, .jp, .org, and .tw registries, which have committed to abide by these Guidelines in their IDN operations. In June 2003, ICANN began authorizing registries having agreements with ICANN to deploy IDNs according to the provisions of the Guidelines.

As the deployment of IDNs proceeds, ICANN and the IDN registries will review these Guidelines at regular intervals, and revise them as necessary based on experience.

Guidelines

1. Top-level domain registries that implement internationalized domain name capabilities will do so in strict compliance with the technical requirements described in RFCs 3490, 3491, and 3492 (collectively, the "IDN standards").

2. In implementing the IDN standards, top-level domain registries will employ an "inclusion-based" approach (meaning that code points that are not explicitly permitted by the registry are prohibited) for identifying permissible code points from among the full Unicode repertoire.

3. In implementing the IDN standards, top-level domain registries will (a) associate each registered internationalized domain name with one language or set of languages, (b) employ language-specific registration and administration rules that are documented and publicly available, such as the reservation of all domain names with equivalent character variants in the languages associated with the registered domain name, and, (c) where the registry finds that the registration and administration rules for a given language would benefit from a character variants table, allow registrations in that language only when an appropriate table is available.

4. Registries will work collaboratively with relevant and interested stakeholders to develop language-specific registration policies (including, where the registry determines appropriate, character variant tables), with the objective of achieving consistent approaches to IDN implementation for the benefit of DNS users worldwide. Registries will work collaboratively with each other to address common issues, through, for example, ad hoc groups, regional groups, and global fora, such as the ICANN IDN Registry Implementation Committee.

5. In implementing the IDN standards, top-level domain registries should, at least initially, limit any given domain label (such as a second-level domain name) to the characters associated with one language or set of languages only.

6. Top-level domain registries (and registrars) should provide informational resources and services in all languages for which they offer internationalized domain name registrations.

Notes

Note to Guideline 1 for Registries Having Agreements with ICANN: Registries with sponsorship agreements or registry agreements with ICANN must also comply with the format requirements for Registered Names in their sponsorship or registry agreements. In one way or another, the agreements state that all Registered Names (including ACE names) will comply with the following syntax in augmented Backus-Naur Form (BNF) as described in RFC 2234:

dot = %x2E ; "."
dash = %x2D ; "-"
alpha = %x41-5A / %x61-7A ; A-Z / a-z
digit = %x30-39 ; 0-9
ldh = alpha / digit / dash
id-prefix = alpha / digit
label = id-prefix [*61ldh id-prefix]
sldn = label dot label
hostname = *(label dot) sldn

In addition, length limitations should be observed.
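The grammar above can be checked mechanically. The following Python sketch is illustrative only: the function names are hypothetical, and the 253-character ceiling is an assumption based on commonly cited DNS limits, since the note itself does not state a figure.

```python
import re

# label = id-prefix [*61ldh id-prefix]: 1 to 63 characters, starting and
# ending with a letter or digit, with letters, digits, or dashes in between.
LABEL_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

# Assumption: 253 characters is a commonly cited ceiling for a textual
# domain name; the note only says that length limits should be observed.
MAX_NAME_LENGTH = 253


def is_valid_label(label: str) -> bool:
    """Return True if the string matches the 'label' rule."""
    return LABEL_RE.fullmatch(label) is not None


def is_valid_hostname(name: str) -> bool:
    """Check hostname = *(label dot) sldn, where sldn = label dot label."""
    if len(name) > MAX_NAME_LENGTH:
        return False
    labels = name.split(".")
    # The sldn rule requires at least two labels, for example "example.org".
    return len(labels) >= 2 and all(is_valid_label(part) for part in labels)
```

Under this sketch, an ACE name such as "xn--bcher-kva.example" passes, while a name with a leading dash, an empty label, or a label longer than 63 characters does not.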

To meet these requirements, the UseSTD3ASCIIRules flag described in RFC 3490 should be set when performing ToASCII conversions to produce ACE names, and the resulting format restriction should be interpreted as above.
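As a rough illustration of this conversion, the Python sketch below relies on the standard library's "idna" codec, which implements the ToASCII operation of RFC 3490. The helper name to_ace_label is hypothetical. The sketch assumes that the codec does not itself enforce UseSTD3ASCIIRules, so the letter-digit-hyphen restriction is applied as a separate step.

```python
import re

# Same shape as the 'label' rule in the syntax above.
LDH_LABEL_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def to_ace_label(label: str) -> str:
    """Convert a single label to its ACE form and apply the LDH format check."""
    # The codec applies Nameprep (RFC 3491) and Punycode (RFC 3492) as part
    # of ToASCII; it raises UnicodeError for input it cannot convert.
    ace = label.encode("idna").decode("ascii")
    # Assumption: the codec leaves UseSTD3ASCIIRules unset, so characters
    # such as "_" and leading or trailing dashes are rejected here instead.
    if LDH_LABEL_RE.fullmatch(ace) is None:
        raise ValueError(f"label violates the LDH format rules: {ace!r}")
    return ace
```

For example, to_ace_label("bücher") yields "xn--bcher-kva", whereas to_ace_label("a_b") raises ValueError.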

Note to Guideline 2: Except where a registry determines that an exception is appropriate, permissible code points will not include: (a) line symbol-drawing characters, (b) symbols and icons that are neither alphanumeric nor ideographic language characters, such as typographical and pictographic dingbats, (c) punctuation characters, and (d) spacing characters. The Prohibited Output profile of Section 5 of RFC 3491 also prohibits certain code points, such as spacing characters. In addition, the IDN standards have additional prohibitions that are checked outside that profile. In accord with Guideline 1, a registry may not by exception permit code points that are prohibited by the IDN standards.
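The inclusion-based approach and the exclusions listed in this note might be sketched in Python as follows. The table GERMAN_ALLOWED and both function names are hypothetical, and the mapping of classes (a) through (d) onto Unicode general categories is an approximation, not a definition taken from the Guidelines.

```python
import unicodedata

# Hypothetical inclusion table for a registry accepting German registrations:
# anything not listed here is prohibited.
GERMAN_ALLOWED = frozenset("abcdefghijklmnopqrstuvwxyz0123456789-äöü")

# General categories beginning with S (symbols), P (punctuation), or
# Z (separators) roughly correspond to classes (a) through (d) in the note.
EXCLUDED_MAJOR_CATEGORIES = {"S", "P", "Z"}


def candidate_for_inclusion(ch: str) -> bool:
    """Screen a code point before it is considered for an inclusion table."""
    if ch == "-":
        # The hyphen is punctuation but is part of the LDH repertoire.
        return True
    return unicodedata.category(ch)[0] not in EXCLUDED_MAJOR_CATEGORIES


def disallowed_code_points(label: str, allowed: frozenset) -> list:
    """List the code points in a label that the table does not permit."""
    return [f"U+{ord(ch):04X}" for ch in label if ch not in allowed]
```

A label is acceptable under this sketch only when disallowed_code_points returns an empty list. The category screen is a drafting aid for building the table and does not replace the prohibitions in the IDN standards.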

Note to Guideline 3: Under Guideline 3, every internationalized-domain-name registration will be associated with a language or set of languages for the purpose of identifying a registry-established set of registration and administration rules (a “registration ruleset”) that applies to the registration. Each registration ruleset will be associated with a language or set of languages. For example, a registry might specify one registration ruleset for internationalized-domain-name registrations that have been designated as “German” and another registration ruleset for internationalized-domain-name registrations that have been designated “Chinese-Japanese-Korean”. The mapping of particular languages to particular rulesets will be specified by the registry. Registrars (and ultimately registrants) will be able to specify the language or set of languages of a registration, which will determine which of the registry-established registration rulesets will be applied.

Registries will make the language-to-ruleset mapping, as well as the details of the rulesets themselves, publicly available on their websites. Registrars will ordinarily be given thirty days' notice (which may be given by public notice) of the establishment or revision of rulesets. See also Guideline 4 concerning consultation in the establishment of rulesets.

Appropriate topics for rulesets may, but will not necessarily, include: permissible Unicode code points, character variant tables, and prohibited Unicode strings, as well as any other policies the registry operator determines are appropriate. Permissible Unicode code points for different rulesets may be overlapping or even congruent.
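One way a registry might represent rulesets and the language-to-ruleset mapping is sketched below in Python. Every name, table, and field here is hypothetical and far smaller than a real registry table would be; the single variant pair shown (the simplified and traditional forms of one Chinese character) is included only to illustrate how variant reservation could work.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass(frozen=True)
class Ruleset:
    """One registry-established registration ruleset (illustrative fields)."""
    name: str
    allowed: frozenset                             # permissible code points
    variants: dict = field(default_factory=dict)   # code point -> variants
    prohibited: frozenset = frozenset()            # prohibited strings


# Hypothetical rulesets.
GERMAN = Ruleset("German", frozenset("abcdefghijklmnopqrstuvwxyz0123456789-äöü"))
CJK = Ruleset(
    "Chinese-Japanese-Korean",
    frozenset("中国國文"),
    variants={"国": ("國",), "國": ("国",)},
)

# Hypothetical language-to-ruleset mapping published by the registry.
LANGUAGE_TO_RULESET = {
    "German": GERMAN,
    "Chinese": CJK,
    "Japanese": CJK,
    "Korean": CJK,
}


def select_ruleset(label: str, language: str) -> Ruleset:
    """Pick the ruleset for the designated language and check the label."""
    ruleset = LANGUAGE_TO_RULESET.get(language)
    if ruleset is None:
        raise ValueError(f"no ruleset published for {language!r}")
    rejected = sorted({ch for ch in label if ch not in ruleset.allowed})
    if rejected or label in ruleset.prohibited:
        raise ValueError(f"{label!r} is not permitted under {ruleset.name}")
    return ruleset


def variants_to_reserve(label: str, ruleset: Ruleset) -> set:
    """Enumerate equivalent labels that would be reserved with this one."""
    choices = [(ch, *ruleset.variants.get(ch, ())) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}
```

Because the language designated at registration selects the ruleset, the same label can be accepted under one designation and refused under another. Checking each label against a single ruleset also keeps it within the characters of one language or set of languages, as Guideline 5 recommends.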

Note to Guideline 3 Concerning Unsponsored gTLDs: Rulesets must not interfere with the equivalent access to Registry Operator's Registry Services by all ICANN-Accredited Registrars that have Registry-Registrar Agreements in effect. Registry operators of unsponsored TLDs will ordinarily give thirty days' notice to ICANN and accredited, authorized registrars of the establishment or revision of rulesets. (In urgent situations, the registry operator and ICANN may agree in writing on a shorter time.)

Internationalized Domain Name (IDN)

IDNs are domain names that include characters used in the local representation of languages that are not written with the twenty-six letters of the basic Latin alphabet "a-z". An IDN can contain Latin letters with diacritical marks, as required by many European languages, or may consist of characters from non-Latin scripts such as Arabic or Chinese. Many languages also use other types of digits than the European "0-9". The basic Latin alphabet together with the European-Arabic digits are, for the purpose of domain names, termed "ASCII characters" (ASCII = American Standard Code for Information Interchange). These are also included in the broader range of "Unicode characters" that provides the basis for IDNs.

The "hostname rule" requires that all domain names of the type under consideration here are stored in the DNS using only the ASCII characters listed above, with the one further addition of the hyphen "-". The Unicode form of an IDN therefore requires special encoding before it is entered into the DNS.

The following terminology is used when distinguishing between these forms: A domain name consists of a series of "labels" (separated by "dots"). The ASCII form of an IDN label is termed an "A-label". All operations defined in the DNS protocol use A-labels exclusively. The Unicode form, which a user expects to be displayed, is termed a "U-label". The difference may be illustrated with the Hindi word for "test", परीका, appearing here as a U-label would (in the Devanagari script). A special form of "ASCII compatible encoding" (abbreviated ACE) is applied to this to produce the corresponding A-label: xn--11b5bs1di.

A domain name that only includes ASCII letters, digits, and hyphens is termed an "LDH label". Although the definitions of A-labels and LDH-labels overlap, a name consisting exclusively of LDH labels, such as "icann.org", is not an IDN.
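The relationship between the two forms can be illustrated with a short Python sketch. It assumes the standard library's "idna" codec, which follows the 2003 RFCs cited in the Guidelines; the function names are hypothetical, and the test for an IDN (any label that is non-ASCII or carries the "xn--" ACE prefix) is a simplification.

```python
ACE_PREFIX = "xn--"


def to_a_label(u_label: str) -> str:
    """Encode the Unicode form of a label into its ASCII-compatible form."""
    return u_label.encode("idna").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Decode an ASCII-compatible label back into its Unicode form."""
    return a_label.encode("ascii").decode("idna")


def is_idn(name: str) -> bool:
    """Treat a name as an IDN if any label is non-ASCII or ACE-prefixed."""
    return any(
        not label.isascii() or label.lower().startswith(ACE_PREFIX)
        for label in name.split(".")
    )
```

With these helpers, the German word "bücher" encodes to "xn--bcher-kva" and decodes back to the same Unicode string, while is_idn("icann.org") is False because every label is a plain LDH label.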