
Clearing the Confusion (Fast Track)

Since the launch of the Fast Track Process, ICANN has received many questions about how the DNS Stability Panel determines whether a string is confusingly similar; that is, whether a requested string is confusingly similar to an existing ccTLD, gTLD, or applied-for TLD.

The overall rules seem clear:

1) If you request an IDN ccTLD that is confusingly similar to an existing ccTLD, gTLD, or reserved name, then your request will be declined.

2) If you request an IDN ccTLD that is confusingly similar to a “validated” IDN ccTLD, then your request will be declined.

3) If you request an IDN ccTLD that is confusingly similar to another IDN ccTLD that is under evaluation but not yet “validated”, then both requests will be placed on hold until a solution is found.

4) If you request an IDN ccTLD that is confusingly similar to an applied-for gTLD string that has reached Board approval, and is hence considered an existing TLD, then your request will be declined.

5) If you request an IDN ccTLD that is confusingly similar to an applied-for gTLD string, then both parties will be informed.

Validation, for the purpose of the Fast Track Process, means that it has been established that the string is a meaningful representation of the corresponding country/territory name, and that it has successfully passed the DNS Stability Panel evaluation.
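The five rules above can be read as a lookup from the kind of string a request collides with to the outcome of the request. The following is only a minimal sketch of that reading, not an ICANN tool: the names `Conflict`, `OUTCOMES`, and `outcome` are hypothetical, and whether two strings are confusingly similar in the first place remains a case-by-case determination by the DNS Stability Panel.

```python
from enum import Enum


class Conflict(Enum):
    """Kind of string the requested IDN ccTLD is confusingly similar to.

    Hypothetical labels for illustration; they mirror rules 1) to 5).
    """
    EXISTING_TLD = 1          # rule 1: existing ccTLD, gTLD, or reserved name
    VALIDATED_IDN_CCTLD = 2   # rule 2: "validated" IDN ccTLD
    PENDING_IDN_CCTLD = 3     # rule 3: IDN ccTLD under evaluation, not yet validated
    BOARD_APPROVED_GTLD = 4   # rule 4: applied-for gTLD that has reached Board approval
    APPLIED_FOR_GTLD = 5      # rule 5: any other applied-for gTLD (our reading)


# Outcome of the request for each kind of conflict, as stated in the rules.
OUTCOMES = {
    Conflict.EXISTING_TLD: "declined",
    Conflict.VALIDATED_IDN_CCTLD: "declined",
    Conflict.PENDING_IDN_CCTLD: "both requests on hold",
    Conflict.BOARD_APPROVED_GTLD: "declined",
    Conflict.APPLIED_FOR_GTLD: "both parties informed",
}


def outcome(conflict):
    """Return the outcome for a request, or a neutral note if no conflict was found."""
    if conflict is None:
        return "no conflict found"
    return OUTCOMES[conflict]


print(outcome(Conflict.PENDING_IDN_CCTLD))
```

The sketch treats rule 5 as covering applied-for gTLD strings that rule 4 does not; that split is an assumption drawn from the wording of the two rules.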

However, it is the notion of “confusingly similar”, and exactly how it is established that two or more strings are so confusingly similar that they cannot coexist in the DNS, that is understandably raising questions.

As the Final Implementation Plan states, any such determination is made on a case-by-case basis. However, it is probably useful to provide some insight into how the panel makes such a determination.

While the determination is made by the DNS Stability Panel, Fast Track participants should know that ICANN staff will inform them of any concerns about confusability during the initial review of a Fast Track request. The requester then has the opportunity to either (i) change the string they requested, (ii) withdraw the request and resubmit at a later stage, or (iii) continue with the request as originally submitted.

Type styles, fonts, etc.

Issue: A sufficiently creative choice of type styles or the exploitation of information about scripts that a given user may be unable to display can result in one character (or a sequence of characters) in one script being visually confusable with one or more characters (or character sequence(s)) in another script.

The issue becomes even more serious for closely related scripts (for example, Greek/Latin/Cyrillic).

While we are aware of the issues, some level of risk must be accepted. These kinds of issues cannot be completely guarded against, especially as type styles and fonts (just like languages and scripts) evolve and change over time.

Instead, determining confusability is focused on issues that may arise from the basic geometry of characters that is preserved, to a greater or lesser degree, across a variety of fonts, styles, and formatting.
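To make the cross-script problem concrete, the short sketch below lists a few letters from the Latin, Greek, and Cyrillic blocks that are drawn almost identically in many fonts yet are distinct Unicode code points. It uses only Python's standard `unicodedata` module; the helper `script_of` is a hypothetical convenience that takes the first word of the character name as a rough script label, and the chosen letters are illustrative examples, not a list used by the panel.

```python
import unicodedata


def script_of(ch):
    """Rough script label: the first word of the Unicode character name."""
    return unicodedata.name(ch).split()[0]


# Letters that look alike in many fonts but belong to different scripts.
lookalikes = ["a", "\u0430", "o", "\u03bf", "\u043e"]

for ch in lookalikes:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# The strings compare as different even though they may render identically.
print("a" == "\u0430")
```

Software comparing the strings sees different code points, while a user reading them on screen may see the same shape; this gap is what the panel's focus on basic character geometry is meant to address.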

Two-character strings

Issue: Two-character strings that consist of Unicode code points in scripts such as the Latin, Greek, and Cyrillic script blocks are intrinsically confusable with currently defined or potential future country code TLD (ccTLD) strings based on the ISO 3166-1 alpha-2 codes.

This is particularly true when variations in font and presentation interface are considered. And it is not limited to the pairs of “visually confusable characters” identified in Unicode Technical Report #39. Those characters are based on Unicode Reference Fonts that are deliberately designed to reduce the potential for visual confusion.

Therefore, a very conservative standard is being used to assess applied-for strings that consist of two Greek, Cyrillic, or Latin characters, including a default presumption of confusability to which exceptions may be made in specific cases.

How are strings ranked?

The Fast Track Process recognizes the following rankings for requested two-character IDN ccTLD strings. The higher the rank, the more likely it is that the applied-for string as a whole presents a significant risk of user confusion.

[6] Both characters are visually identical to an ISO 646 Basic Version (ISO 646-BV) character. [International Organization for Standardization, "Information Technology – ISO 7-bit coded character set for information interchange," ISO Standard 646, 1991.]

[5] One character is visually identical to, and one character is visually confusable with, an ISO 646-BV character.

[4] Both characters are visually confusable with, but neither character is visually identical to, an ISO 646-BV character.

[3] One character is visually distinct from, and one character is visually identical to, an ISO 646-BV character.

[2] One character is visually distinct from, and one character is visually confusable with, an ISO 646-BV character.

[1] Both characters are visually distinct from an ISO 646-BV character.

Some disagreement may arise in assessing whether a string is confusingly similar to existing ccTLDs, gTLDs, or applied-for strings. Thus, these rankings are for guidance only, and the DNS Stability Panel makes its assessment based on the rankings and on the expertise of the panelists. In difficult situations, the panel may conduct extended evaluations that can also include drawing on additional linguistic expertise.

The likelihood of user confusion presented by a given two-character IDN ccTLD string does not depend strictly on the individual confusability of each character, if considered separately. The assessment of “visually distinct” and “visually confusable” takes into account both the individual features of each character and their combined effect.

In general, a two-character IDN string at rank [4] or higher presents a significant risk of user confusion.

In general, a two-character IDN string at rank [3] or lower does not present a significant risk of user confusion.
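The six ranks and the threshold at rank [4] can be summarized as a small table keyed by how each of the two characters compares with an ISO 646-BV character. The sketch below is illustrative only: the names `RANKS`, `rank`, and `significant_risk` are hypothetical, and the per-character labels ("identical", "confusable", "distinct") are inputs that would come from the panel's judgment, which also weighs the combined effect of the two characters rather than each one in isolation.

```python
# Rank for each unordered pair of per-character assessments.
# Keys are stored in alphabetical order so lookup can sort its inputs.
RANKS = {
    ("identical", "identical"): 6,
    ("confusable", "identical"): 5,
    ("confusable", "confusable"): 4,
    ("distinct", "identical"): 3,
    ("confusable", "distinct"): 2,
    ("distinct", "distinct"): 1,
}


def rank(first, second):
    """Return the guidance rank for a two-character string.

    `first` and `second` are the assessments of each character against
    ISO 646-BV: "identical", "confusable", or "distinct".
    """
    return RANKS[tuple(sorted((first, second)))]


def significant_risk(r):
    """Rank [4] or higher generally indicates a significant risk of confusion."""
    return r >= 4


r = rank("identical", "confusable")
print(r, significant_risk(r))
```

Because the rankings are guidance only, a result from a table like this would be a starting point for the panel's assessment, not a decision.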

What about confusable strings already in the DNS root zone?

Some have argued that we already have TLDs in the DNS root zone that could be considered confusingly similar, so there is no need to prevent future confusingly similar strings from being entered in the root zone as well. There is only one answer to such a statement: Just because there are issues today does not mean that we should make things worse for the future!

Finally, thank you to the DNS Stability Panel for all their work in this area and for generating the rankings based on their professional experience and prelaunch training!

