Frequently Asked Questions
- What is an IDN?
- What is an IDN variant label?
- What are IDN variant TLDs?
- What are label generation rules?
- What is the IDN Root Label Generation Rules Procedure?
- How was the community involved in developing the LGR Procedure?
- What exactly will the unified IDN Root Label Generation Rules include?
- What's the relationship between the IDN Root LGR Procedure and IDN variants?
- Are the rules script-specific or language-specific?
- How would the LGR Procedure impact new gTLD and IDN ccTLD Processes?
- What writing-system communities can use the LGR Procedure?
- How early can the first release of the IDN Root Label Generation Rules be published?
- Would adoption of the LGR Procedure mean approval for the delegation of IDN Variant TLDs?
- Are visually similar variants accounted for in the LGR Procedure?
IDNs (Internationalized Domain Names) are domain names that include characters (known as "code points") other than the letters of the basic Latin alphabet (the 26 letters "a" to "z"), the digits 0-9, and the hyphen "-". IDNs enable domain names to be expressed in languages that are not written in the basic Latin script, using characters from scripts such as Chinese, Arabic, or Cyrillic, or any other characters outside US-ASCII.
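As a minimal sketch of the definition above, the following Python snippet reports whether a label uses any code point beyond the letters, digits, and hyphen, and lists the label's code points. The function names are hypothetical and chosen for illustration. The check mirrors the wording of the definition literally and performs no IDNA validation, so it says nothing about whether a label is actually permitted in any zone.

```python
import string

# The basic set named in the definition: letters a-z, digits 0-9, and hyphen.
LDH = set(string.ascii_lowercase + string.digits + "-")


def is_idn_label(label: str) -> bool:
    """Return True if the label has any code point outside letters, digits, hyphen.

    Illustrative only: this follows the FAQ definition literally and is not
    a validity check of any kind.
    """
    return any(ch not in LDH for ch in label.lower())


def code_points(label: str) -> list[str]:
    """List the Unicode code points of a label in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in label]


if __name__ == "__main__":
    for label in ("example", "顶级域名"):
        print(label, is_idn_label(label), code_points(label))
```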
There is no universally accepted definition of an IDN variant. One possible definition is that an IDN variant is a string in which one or more alternate code points (or sequences of code points) have been substituted for a code point (or sequence of code points) in another string, such that the two strings may be considered the "same." For example, a string in traditional Chinese characters commonly has an equivalent string in simplified Chinese characters, such as the label for "top-level domain," written "頂級域名" in traditional Chinese and "顶级域名" in simplified Chinese. Another possible example, in Latin characters, is "encyclopædia" and "encyclopaedia": although they are composed of different code points, they may be considered variants of one another.
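The substitution idea in this definition can be sketched in a few lines of Python. The variant table below is hand-written for this illustration and covers only the two traditional/simplified pairs from the example above; it is not taken from any published rule set. The sketch also handles single code points only, so sequence-based cases such as "æ" and "ae" are out of scope.

```python
from itertools import product

# Hypothetical variant table for illustration: code point -> alternate code points.
VARIANTS = {
    "頂": {"顶"},
    "顶": {"頂"},
    "級": {"级"},
    "级": {"級"},
}


def variant_labels(label: str) -> set[str]:
    """Return every label obtainable by substituting variant code points.

    Each position may keep its original code point or take one of its
    variants; the original label itself is excluded from the result.
    """
    choices = [sorted({ch} | VARIANTS.get(ch, set())) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


if __name__ == "__main__":
    for variant in sorted(variant_labels("頂級域名")):
        print(variant)
```

Note that the number of generated labels grows multiplicatively with the number of code points that have variants, which is one reason variant rules need to be defined conservatively.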
An IDN variant top-level domain (TLD) is a label in the root zone that may be considered exchangeable with another TLD label in the root zone because it includes one or more variant characters. Because all Internet users share the root zone, it is not possible to implement a simple set of rules for a single language, as is common with the variant rules for individual country-code domains. Therefore, the rules governing which characters are allowed in the zone must be carefully and conservatively implemented to ensure the stability and security of the Internet.
Label Generation Rules (LGR) govern the labels that are permissible in a zone, such as the root zone. Every zone has rules that determine which labels are permitted, as well as other restrictions. From a registrant's perspective, these rules determine, at a high level, which characters may be available for creating a domain name.
The IDN Root LGR Procedure (LGR Procedure) defines the process for creating and maintaining the Label Generation Rules (LGR) for the root zone. These rules determine which characters (Unicode code points) are permitted in top-level domain labels, which variant labels (if any) may be allocated, and which variant labels (if any) are automatically blocked.
The LGR Procedure consists of two passes. During the first pass, a generation panel creates a candidate set of label generation rules (LGR) specific to a script (writing system). Each generation panel submits its candidate LGR to the integration panel for approval. During the second pass, the integration panel reviews each candidate LGR and, if it approves, integrates that script LGR into a single unified LGR for the root zone. To learn more about the LGR Procedure, please review the Procedure to Develop and Maintain the Label Generation Rules for the Root Zone in Respect of IDNA Labels [PDF, 772 KB].
The LGR Procedure is the result of joint ICANN staff-community work on the IDN Variant TLD Program. The first phase, the study of IDN Variant Issues, was conducted by six case study teams, each of which investigated the issues relevant to an individual script. The case study teams comprised 66 experts in total, from 29 countries and territories, and offered expertise in DNS, IDNA, linguistics, security and scalability, policy, registry/registrar operations, and community representation. The next phase of the project included the development of the Integrated Issues Report [PDF, 2.15 MB]. ICANN was assisted in this work by a coordination team made up of representatives from the case study teams. Following the recommendations in the Integrated Issues Report, the ICANN project team created and refined a project plan for next steps, describing three projects to be completed in the Program's third phase. One of these projects was the development of the root LGR Procedure. To develop the LGR Procedure, a project team was formed consisting of Internet community volunteers from across the globe, representing multiple scripts and languages, together with ICANN staff and expert consultants.
In addition to participation within the Program's teams, the Internet community has also been engaged via user panels, webinars, public comment and consultation periods, and conference presentations and discussions. All published reports are available via the IDN Variant TLD web page.
The unified LGR will specify the following:
- The complete list of code points permissible in U-labels in the root zone.
- The complete code point variant rules (if any) for each code point. For example, if a given code point has one or more variant characters, these characters will be specified.
- Assignment of one or more "tags" to each code point in the repertoire including variants. While technically defined as language tags, these tags are used at the time of application for a U-label in the root zone in order to identify the relevant portion of the Unicode repertoire to consult. Every code point in an applied-for U-label must share the same tag or the application is invalid.
- The disposition of the labels resulting from the application of the code point variant rules described above. Depending on the variant's disposition and on how the variant label was generated, this element specifies whether a resulting variant label is blocked or allocatable.
- A whole-label evaluation rule that determines whether the original, applied-for candidate label as well as each of its variants is permitted in the root zone.
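The elements listed above can be pictured as one small data structure plus an evaluation step. The Python sketch below is a toy model only: the table contents, the tag names ("Hani", "Latn"), the dispositions, the placeholder whole-label rule, and the rule that a variant label is blocked if any substitution used to build it is blocked are all assumptions made for illustration. None of it reflects the actual root zone LGR or its data format.

```python
from itertools import product

# Toy LGR: code point -> (tags, {variant code point: disposition}).
# All entries are invented for illustration.
LGR = {
    "頂": ({"Hani"}, {"顶": "allocatable"}),
    "顶": ({"Hani"}, {"頂": "allocatable"}),
    "級": ({"Hani"}, {"级": "blocked"}),
    "级": ({"Hani"}, {"級": "blocked"}),
    "域": ({"Hani"}, {}),
    "名": ({"Hani"}, {}),
    "a": ({"Latn"}, {}),
}


def shared_tags(label: str) -> set[str]:
    """Tags common to every code point; empty if any code point is not listed."""
    if not label or any(ch not in LGR for ch in label):
        return set()
    return set.intersection(*(LGR[ch][0] for ch in label))


def whole_label_ok(label: str) -> bool:
    """Placeholder whole-label rule (an assumption): 1 to 63 code points."""
    return 0 < len(label) <= 63


def evaluate(label: str):
    """Return {variant label: disposition}, or None if the label is invalid."""
    if not shared_tags(label) or not whole_label_ok(label):
        return None
    # Each position keeps its code point (no disposition) or takes a variant.
    options = [[(ch, None)] + list(LGR[ch][1].items()) for ch in label]
    result = {}
    for combo in product(*options):
        variant = "".join(cp for cp, _ in combo)
        if variant == label or not whole_label_ok(variant):
            continue
        used = {disp for _, disp in combo if disp is not None}
        # Assumed rule: one blocked substitution blocks the whole variant label.
        result[variant] = "blocked" if "blocked" in used else "allocatable"
    return result


if __name__ == "__main__":
    print(evaluate("頂級域名"))  # variant labels with their dispositions
    print(evaluate("a頂"))      # None: the code points share no tag
```

In this toy model, a label that mixes code points with different tags is rejected before any variants are generated, which corresponds to the requirement that every code point in an applied-for label share the same tag.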
The LGR Procedure not only determines which Unicode code points are permissible in the root zone, but also the rules governing which code point (or sequence of code points) may be considered variants and how they may be managed.
The LGR Procedure takes a script-specific approach to generating rules. Because a script may support multiple languages, the LGR Procedure requires that generation panels consider all languages that rely on a given script.
The LGR Procedure provides a transparent process for the Internet community to understand and comment upon which characters are permitted in the root zone. The resulting unified LGR will provide the Internet community with a predictable (and easily consumed) system for evaluating potential domains. Please note that the LGR Procedure plays no role in evaluating TLD applications.
The unified Root Zone LGR, which is the end result of the LGR Procedure, will impact existing rules and processes for gTLDs and IDN ccTLDs. Both the new gTLD and IDN ccTLD Programs will be updated during the current phase of the program to incorporate the IDN LGR for the Root Zone in the respective evaluation and processing steps. Updates to impacted processes should also account for recommendations from the Report on User Experience Implications of Active Variant TLDs study [PDF, 1.38 MB]. The ICANN Board has requested that interested Supporting Organizations and Advisory Committees provide any input and guidance they may have to be factored into implementations of the LGR Procedure.
Any writing system that has a living language community is welcome to request the formation of a generation panel. The integration panel will make the final determination, considering evidence of actual speakers and writers of a language. Writing systems that have no living language community are not considered when generation panels are established, and are therefore not reviewed.
As of June 2013, generation panels and the integration panel are in the early stages of being recruited. It is difficult to estimate the time required for a generation panel to submit a candidate LGR and for the integration panel to approve the proposal. Please visit the ICANN IDN Variant TLDs web page for updates or subscribe to the RSS feed.
No. Adoption of the LGR Procedure does not in itself mean approval for the delegation of IDN Variant TLDs, although the LGR Procedure is an important prerequisite for such delegation. Even if the LGR determines that a given character is permitted in the root zone, ICANN procedures will determine whether a character may be delegated.
The LGR Procedure stipulates that, in investigating possible variant relationships, generation panels should ignore cases in which the relationship is based exclusively on visual similarity. The LGR Procedure focuses on "exchangeable code point variants" only. Visually similar characters may still be addressed via the LGR Procedure, but only where the code points are exchangeable, not because they look alike.
Additional Questions? Please send them to us at firstname.lastname@example.org