Purpose: To improve the transparency and consistency of the Internationalized Domain Name (IDN) table review process and to facilitate the registry operations of new gTLDs, ICANN has developed four additional reference IDN tables for the second level in machine-readable format, called reference Label Generation Rulesets (LGRs). These reference LGRs are based on the Guidelines for Developing Reference LGRs for the Second Level, which were finalized after community review. They will be used to review IDN tables submitted by gTLD registries, e.g., through the Registry Service Evaluation Policy (RSEP) process.
Current Status: ICANN org has published reference second-level LGRs for multiple languages. Additional reference LGRs have been developed based on the detailed analysis and finalized solutions of the script communities for the Root Zone Label Generation Rules (RZ-LGRs). Four LGRs are being released for Public Comment: the Arabic, Hebrew, and Sinhala script-based LGRs, and the Hebrew language-based LGR.
Next Steps: Based on community input, these reference LGRs will be finalized and published so that gTLD registry operators can use them as a reference while designing their IDN tables. These reference LGRs will also be used to review the IDN tables submitted by gTLD registries.
Section I: Description and Explanation
The reference LGRs are developed in the context of either a language or a script. The script-based reference LGRs are based on the detailed analysis and finalized solutions of the community in the Root Zone Label Generation Rules (RZ-LGRs). The language-based LGRs are likewise based on the solution available for their script in the RZ-LGRs. This release comprises four LGRs: the Arabic, Hebrew, and Sinhala script-based LGRs, and the Hebrew language-based LGR. Additional languages and scripts will be added later, as available and needed. The relevant script communities have been consulted while finalizing these reference LGRs.
These reference LGRs will also be used as a baseline in reviewing the IDN tables submitted by gTLD registries, contributing to the transparency of the review process. gTLD registry operators may consult these reference LGRs while designing their IDN tables to address security and promote consistency. A registry chooses the set of code points, associated variant code points, and rules that best serves its end users. An IDN table can deviate from the reference LGR, because registries may want to remain competitive by offering innovative solutions that address various end user needs.
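As an illustration of what using a reference LGR as a baseline could look like in practice, the following minimal sketch compares the code point repertoire of a submitted IDN table against that of a reference LGR and reports where the two differ. The function name `compare_repertoires` and both sample repertoires are hypothetical and chosen only for illustration; they are not taken from any published LGR or from ICANN's review tooling. A reported difference is not an error in itself, since a table may legitimately deviate from the reference.

```python
def format_cp(cp):
    """Render an integer code point in the usual U+XXXX notation."""
    return f"U+{cp:04X}"


def compare_repertoires(reference, submitted):
    """Return the code points found in only one of the two repertoires.

    Both arguments are sets of integer code points.
    """
    return {
        "only_in_submitted": [format_cp(cp) for cp in sorted(submitted - reference)],
        "only_in_reference": [format_cp(cp) for cp in sorted(reference - submitted)],
    }


# Hypothetical repertoires, for illustration only.
reference_repertoire = {0x05D0, 0x05D1, 0x05D2}
submitted_repertoire = {0x05D0, 0x05D1, 0x05D3}

differences = compare_repertoires(reference_repertoire, submitted_repertoire)
for side, code_points in differences.items():
    print(side, code_points)
```

A fuller comparison would also look at variant definitions and rules, not only at the repertoire.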
Section II: Background
Registries are generally encouraged to collaborate in defining common language-based or script-based IDN tables to provide consistency for end users. There are multiple formats for developing IDN tables. The IDN tables used by the gTLDs and some ccTLDs are posted at the IANA Repository for IDN Practices. During the New gTLD Program's Pre-Delegation Testing (PDT), ICANN org noted a large number of IDN table submissions. To review this large number of IDN tables efficiently, reference IDN tables were developed in the machine-readable format defined in RFC 7940, Representing Label Generation Rulesets Using XML. These reference IDN tables allow machine processing of the repertoire, the variant definitions, and the rules, which improves the consistency of the IDN table review in PDT and in the Registry Service Evaluation Policy (RSEP) process.
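To give a sense of what machine processing of this format can involve, the sketch below reads the repertoire and variant definitions from a small RFC 7940 document using Python's standard library. The element and attribute names (`char`, `var`, `range`, `cp`, `first-cp`, `last-cp`, `type`) and the namespace come from RFC 7940; the sample document, the function names, and the returned data layout are assumptions made for illustration and do not reflect any published reference LGR or the LGR Toolset. Rules (`when` and `not-when` contexts) are not evaluated here.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by RFC 7940.
LGR_NS = {"lgr": "urn:ietf:params:xml:ns:lgr-1.0"}

# Made-up sample, for illustration only.
SAMPLE_LGR = """<lgr xmlns="urn:ietf:params:xml:ns:lgr-1.0">
  <data>
    <char cp="0061">
      <var cp="00E1" type="blocked"/>
    </char>
    <char cp="0062"/>
    <range first-cp="0030" last-cp="0039"/>
  </data>
</lgr>"""


def parse_cp(value):
    """Convert a space-separated hex sequence such as '0061' to a tuple of ints."""
    return tuple(int(part, 16) for part in value.split())


def load_lgr(xml_text):
    """Return (repertoire, variants) extracted from an LGR document.

    repertoire: set of code point sequences (tuples of ints).
    variants:   dict mapping a sequence to a list of (variant, type) pairs.
    """
    root = ET.fromstring(xml_text)
    repertoire = set()
    variants = {}
    for char in root.findall("lgr:data/lgr:char", LGR_NS):
        cp_attr = char.get("cp")
        if cp_attr is None:
            continue  # entry without an explicit cp; not handled in this sketch
        cp = parse_cp(cp_attr)
        repertoire.add(cp)
        for var in char.findall("lgr:var", LGR_NS):
            variants.setdefault(cp, []).append(
                (parse_cp(var.get("cp", "")), var.get("type"))
            )
    for rng in root.findall("lgr:data/lgr:range", LGR_NS):
        first = int(rng.get("first-cp"), 16)
        last = int(rng.get("last-cp"), 16)
        repertoire.update((cp,) for cp in range(first, last + 1))
    return repertoire, variants


repertoire, variants = load_lgr(SAMPLE_LGR)
print(len(repertoire), "code points;", len(variants), "with variants")
```

Representing each entry as a tuple keeps single code points and multi-code-point sequences in one data structure, since RFC 7940 allows both in a `cp` attribute.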
A total of 30 language-based reference IDN tables and 13 script-based reference IDN tables have been published. The remaining languages and scripts will be included in future releases as relevant community input becomes available.
The process to develop these reference LGRs, as detailed in the guidelines, ensures that both linguistic and technical expert input is incorporated in the reference LGRs, which will be finalized after the Public Comment process. These reference LGRs are intended to be comprehensive enough that they do not require further additions to be useful. At the same time, they should be relatively conservative. This should enable registries either to adopt these LGRs as is or to take them as the basis for further modifications.
Section III: Relevant Resources
The following reference LGRs for the second level are published for Public Comment.
- Overview and Summary
- Arabic Script Reference LGR (XML, HTML)
- Hebrew Language Reference LGR (XML, HTML)
- Hebrew Script Reference LGR (XML, HTML)
- Sinhala Script Reference LGR (XML, HTML)
These files can also be downloaded together as a single package.
Section IV: Additional Information
- Already published Reference Label Generation Rules: https://www.icann.org/resources/pages/second-level-lgr-2015-06-21-en
- RFC7940 Representing Label Generation Rulesets Using XML (Label Generation Rules, LGR): https://tools.ietf.org/html/rfc7940
- Guidelines for Developing Reference LGRs for the Second Level (version 27 May 2020): https://www.icann.org/en/system/files/files/lgr-guidelines-second-level-27may20-en.pdf
- Finalized Proposals for Root Zone Label Generation Ruleset (RZ-LGR) by the Generation Panels: https://www.icann.org/resources/pages/lgr-proposals-2015-12-01-en
- LGR Toolset: https://www.icann.org/resources/pages/lgr-toolset-2015-06-21-en