Compliance with IDN technical requirements

One of the main IDN-related topics from the just-finished ICANN meeting in Cairo that I think deserves some additional attention is:

Why compliance with IDN technical requirements is a necessity on a global scale

Overall, compliance with technical standards is important for TLD registry operators: it keeps their TLD stable and secure, so that it functions well for their consumers and communities. Per the ICANN Bylaws, interoperability of the Internet is a core value, which requires that technical standards be complied with. In some instances, failure to comply with a technical standard affects only the corresponding TLD in isolation and does not interfere with other TLDs. When it comes to IDN TLDs, however, this changes very quickly.

The following will demonstrate how non-compliance with IDN technical standards in one country or territory has a negative effect on the entire Internet community and not solely on that country/territory.

History has shown that when IDNs are implemented in a manner inconsistent with the IDNA protocol and the IDN Guidelines, the effect on the community in general is very negative. For example, under some TLDs, IDNs were initially implemented in a way that allowed individual users and registrants to mix characters from different scripts when making their registration. This resulted in visual confusability and phishing attacks.
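A registry-side check for this kind of cross-script mixing could be sketched as follows. This is only an illustration and is not taken from the IDNA protocol or the IDN Guidelines: the function names are hypothetical, and the script of each character is approximated from the first word of its Unicode character name, because Python's standard library does not expose the Unicode script property.

```python
import unicodedata

# ASCII digits and the hyphen are shared by labels in any script.
COMMON = set("0123456789-")

def scripts_in_label(label):
    """Return the set of (approximate) script names used in one label."""
    scripts = set()
    for ch in label:
        if ch in COMMON:
            continue
        # Heuristic (assumption): take the first word of the character name,
        # e.g. "LATIN SMALL LETTER A" -> "LATIN".
        scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def is_single_script(label):
    """True if the label draws its letters from at most one script."""
    return len(scripts_in_label(label)) <= 1

mixed = "ex\u0430mple"  # U+0430 is CYRILLIC SMALL LETTER A
print(is_single_script("example"))      # True
print(is_single_script(mixed))          # False
print(sorted(scripts_in_label(mixed)))  # ['CYRILLIC', 'LATIN']
```

A real registration policy would need to be more nuanced than this sketch. Some languages legitimately combine several scripts (Japanese, for instance), and the name-based heuristic would flag those as mixed.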

One specific example of this is paypal.com, where each "a" is a Cyrillic character and the rest are Latin letters. This address looks the same as paypal.com (all in Latin letters), but to the computer these are two different addresses. This undermines the uniqueness principle of the DNS, which is probably the most important principle of the DNS and what makes it work in a stable manner.
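The difference between the two spellings in the paypal.com example can be made visible with a few lines of Python. This is a minimal sketch: the variable and function names are illustrative, and it relies on Python's built-in `idna` codec, which implements the original IDNA specification (RFC 3490) rather than any later revision.

```python
import unicodedata

latin = "paypal"
spoof = "p\u0430yp\u0430l"  # both "a" positions hold U+0430 (Cyrillic)

def code_points(label):
    """List each character as 'U+XXXX NAME' so look-alikes become visible."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in label]

# The strings render alike but are not equal.
print(latin == spoof)  # False

for line in code_points(spoof):
    print(line)

# The ASCII forms that would be stored in the DNS also differ:
# the all-Latin label stays as it is, the mixed one gets an "xn--" form.
print(latin.encode("idna"))
print(spoof.encode("idna"))
```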

As a reaction to these kinds of IDN implementations, application developers did not follow the technical standards either. These are the developers who need to implement the IDNA protocol in their software in order for IDNs to work (for example, in order for IDN-based web addresses to resolve in a web browser). The reason behind their non-compliance has been an attempt to protect users from issues such as the phishing attacks mentioned above. For example, some browser developers have implemented white-listing of TLDs, where the browser developer decides, based on criteria of its own, which TLDs have implemented IDNs in a safe manner.
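The white-listing approach can be pictured roughly as below. This is a hedged sketch and not any browser's actual logic: the whitelist contents, the function name, and the rule "show the ASCII form unless the TLD is trusted" are all assumptions made for illustration, and the conversion again uses Python's built-in `idna` codec (RFC 3490).

```python
# Hypothetical whitelist; each browser maintained its own list and criteria.
TRUSTED_IDN_TLDS = {"example"}

def display_form(hostname, trusted_tlds=TRUSTED_IDN_TLDS):
    """Choose what to show the user for a hostname."""
    if hostname.isascii():
        return hostname  # not an IDN, nothing to decide
    tld = hostname.rsplit(".", 1)[-1].lower()
    if tld in trusted_tlds:
        return hostname  # show the Unicode form
    # Otherwise fall back to the ASCII ("xn--") form.
    return hostname.encode("idna").decode("ascii")

name = "b\u00fccher"  # "bücher"
print(display_form(name + ".example"))  # bücher.example
print(display_form(name + ".test"))     # xn--bcher-kva.test
```

The same Unicode label is presented differently depending on the TLD and on whichever list a given application ships with, so the user experience is no longer uniform across TLDs or across software.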

As a result, the end user is presented with a variety of different implementations, each aiming to introduce a level of security that really can only be implemented, and needs to be implemented, at the root and TLD registry level. One would expect that if two TLDs support the same language and script, they can also accept the same second-level domains, and conversely that if one looks up a domain name in Unicode in one TLD, one should be able to use the same software to look up the same Unicode domain name in a different TLD. However, this will not always be possible. In other instances, application developers have introduced mechanisms that prevent domain names in certain scripts from resolving or otherwise functioning adequately.

If IDN implementations continue down a road of non-compliance with IDN technical requirements, such as those in the IDNA protocol and the IDN Guidelines, it will not be possible to determine how much damage the end user will suffer. The worst case could be one of two scenarios: either IDNs will be so filled with phishing attacks that they will be of no use and users will be scared of using them, or restrictions in the application layer will be so strict that IDNs will, for example, not resolve adequately, or at least not in a stable and secure manner. Either way, this does not provide the community with what it has asked for and what we are attempting to provide with the implementation of IDNs, namely equal access to the DNS for all languages and scripts.

Other examples can be provided on request. These relate to the reasons why the IDNA protocol is under revision and are further documented in RFC 4690.

In summary, the above demonstrates why compliance with the IDN technical standards is of utmost importance, and why we need to find a way of ensuring that such compliance is in place, and kept in place, for TLD operators that have implemented IDNs, regardless of whether this is at the second level or the top level.
