ICANN Statement on IDN Homograph Attacks and Request for Public Comment

ICANN is aware of the recent publicity regarding the vulnerability of certain web browsers to URI and domain name spoofing that relies on the use of Internationalised Domain Name (IDN) resolution.

Homograph domain name spoofing works by exploiting the visual resemblance, or near resemblance, of certain characters and symbols. These can be characters in the standard ASCII character set (such as the resemblance between the numeral "1" and the lower-case letter "l", or between the letter "O" and the numeral "0", in some fonts), or characters taken from different languages (such as the character "Β" [Greek capital letter Beta] and the character "B" [Latin capital letter B], or the potential confusion amongst Chinese, Japanese, and Korean character sets). The recently publicised advisory (http://www.shmoo.com/idn/homograph.txt) focuses on how standard punycode-based IDNs offer additional opportunities for homograph attacks. The Internet community recognises that homograph domain name and URI spoofing is a problem that predates the adoption of IDN implementation standards, but increasing the total number of characters available for domain names inevitably increases the opportunities for character confusion and spoofing.
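The Greek Beta and Latin B example above can be made concrete with a short sketch. It uses only Python's standard `unicodedata` module; the helper name `describe` is an illustrative assumption and does not come from the advisory or from any ICANN guideline.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Two characters that look alike in many fonts but are distinct code points.
greek_beta = "\u0392"
latin_b = "B"

print(describe(greek_beta))    # U+0392 GREEK CAPITAL LETTER BETA
print(describe(latin_b))       # U+0042 LATIN CAPITAL LETTER B
print(greek_beta == latin_b)   # False
```

Because the two characters compare as unequal, two domain names that differ only in such a character are different names to the DNS even when they look identical on screen.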

While the recent publicising of the IDN-based homograph attack potential has brought this issue to wider public attention, the possibility that homograph exploits would expand has been a topic of research and discussion within the ICANN community since before the adoption of IDN standards. Significant work has been done to define implementation practices, such as IDN Language Registry Tables and guidelines for restricting or managing mixed-character-set domain name registrations. These and other Best Current Practice guidelines are being defined by the global Internet community to enable the successful use of IDNs.
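As a rough illustration of what a mixed-character-set check might look like, the sketch below flags labels whose letters come from more than one script. It is not taken from any registry's actual policy. The function names are hypothetical, and using the first word of each character's Unicode name as a stand-in for its script is a simplifying assumption, since Python's standard library does not expose the Unicode Script property directly.

```python
import unicodedata


def scripts_in_label(label: str) -> set:
    """Approximate the set of scripts used by the letters in a label.

    Assumption: the first word of the Unicode character name (for example
    "LATIN", "GREEK", "CYRILLIC") is treated as the script. Digits and
    hyphens are ignored because they are shared across scripts.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_mixed_script(label: str) -> bool:
    """Return True when the letters of a label span more than one script."""
    return len(scripts_in_label(label)) > 1


print(is_mixed_script("example"))        # False
print(is_mixed_script("ex\u0430mple"))   # True: U+0430 is a Cyrillic letter
```

A check of this kind cannot catch a label written entirely in one script that happens to resemble a label in another, which is one reason language tables and registry-level rules are discussed alongside it.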

ICANN is concerned about the potential exacerbation of homograph domain name spoofing as IDNs become more widespread, and is equally concerned about the implementation of countermeasures that may unnecessarily restrict the use and availability of IDNs. ICANN calls for views and positions regarding both the homograph vulnerability, which is not unique to IDNs, and the proposed countermeasures. These include turning off browser support for IDNs by default, a step that does not protect against older forms of URI and domain name abuse.

ICANN encourages the global Internet community to participate in this public comment forum as part of an effort to improve public protection from abusive use of domain names while responsibly opening up opportunities for non-Latin language characters to be used in registered domain names.


Internationalized Domain Name (IDN): IDNs are domain names that include characters used in the local representation of languages that are not written with the twenty-six letters of the basic Latin alphabet "a-z". An IDN can contain Latin letters with diacritical marks, as required by many European languages, or may consist of characters from non-Latin scripts such as Arabic or Chinese. Many languages also use other types of digits than the European "0-9". The basic Latin alphabet together with the European-Arabic digits are, for the purpose of domain names, termed "ASCII characters" (ASCII = American Standard Code for Information Interchange). These are also included in the broader range of "Unicode characters" that provides the basis for IDNs.

The "hostname rule" requires that all domain names of the type under consideration here are stored in the DNS using only the ASCII characters listed above, with the one further addition of the hyphen "-". The Unicode form of an IDN therefore requires special encoding before it is entered into the DNS.

The following terminology is used when distinguishing between these forms: A domain name consists of a series of "labels" (separated by "dots"). The ASCII form of an IDN label is termed an "A-label". All operations defined in the DNS protocol use A-labels exclusively. The Unicode form, which a user expects to be displayed, is termed a "U-label". The difference may be illustrated with the Hindi word for "test", परीका, appearing here as a U-label would (in the Devanagari script). A special form of "ASCII compatible encoding" (abbreviated ACE) is applied to this to produce the corresponding A-label: xn--11b5bs1di. A domain name that only includes ASCII letters, digits, and hyphens is termed an "LDH label". Although the definitions of A-labels and LDH-labels overlap, a name consisting exclusively of LDH labels, such as "icann.org", is not an IDN.
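The conversion between the Unicode form and the ASCII form of a name can be sketched with Python's built-in `idna` codec. This is a minimal illustration under stated assumptions: the codec follows the older IDNA 2003 rules, so it may differ from current registry practice for some characters, and the function names and the sample name "bücher.example" are hypothetical choices that do not come from the text above.

```python
def to_a_label(name: str) -> str:
    """Encode a Unicode domain name into its ASCII (ACE) form."""
    return name.encode("idna").decode("ascii")


def to_u_label(name: str) -> str:
    """Decode an ASCII (ACE) domain name back into its Unicode form."""
    return name.encode("ascii").decode("idna")


print(to_a_label("bücher.example"))         # xn--bcher-kva.example
print(to_u_label("xn--bcher-kva.example"))  # bücher.example
print(to_a_label("icann.org"))              # icann.org (LDH labels are unchanged)
```

Only the label containing a non-ASCII character gains the "xn--" prefix; labels made up solely of letters, digits, and hyphens pass through unchanged.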