Label Generation Rules for the Root Zone Version 4 (RZ-LGR-4)
Open Date: 29 June 2020 23:59 UTC
Close Date: 11 August 2020 23:59 UTC
Staff Report Due: 25 August 2020 23:59 UTC
Purpose: To determine valid top-level Internationalized Domain Name (IDN) labels and their variant labels, the community finalized the Procedure to Develop and Maintain the Label Generation Rules for the Root Zone in Respect of IDNA Labels (the Procedure). The Procedure requires community-based Generation Panels (GPs), organized for relevant scripts, to convene and propose specific rules. These rules are evaluated and then integrated into the Root Zone Label Generation Rules (RZ-LGR) by the Integration Panel (IP).
Current Status: The Integration Panel (IP) has successfully evaluated the Root Zone Label Generation Rules (LGR) proposals for the Bangla and Chinese scripts, as well as the updated proposal for the Malayalam script. These proposals were finalized and submitted by the respective GPs following their release for Public Comment. The IP has integrated these proposals, along with the scripts already integrated into the third version of the Root Zone LGR (RZ-LGR-3), to develop the fourth version of the Root Zone LGR (RZ-LGR-4).
Next Steps: As per the Procedure, RZ-LGR-4 is being released for Public Comment to gather community feedback for its finalization. Proposals for additional scripts will be integrated in future versions of the RZ-LGR.
Section I: Description and Explanation
As per the Procedure that guides this work, the GPs start their analysis from the current version of the Maximal Starting Repertoire (MSR) and develop a proposal for their respective script(s) based on the principles and additional considerations presented in the Procedure. RZ-LGR-4 is designed to be the fourth edition of an RZ-LGR that meets the requirement for a conservative set of label generation rules for stable and secure operation of the Internet's Root Zone. RZ-LGR-4 contains rules for 18 scripts: Arabic, Bangla, Chinese, Devanagari, Ethiopic, Georgian, Gujarati, Gurmukhi, Hebrew, Kannada, Khmer, Lao, Malayalam, Oriya, Sinhala, Tamil, Telugu, and Thai, based on the proposals submitted by the respective GPs. The Integration Panel also considered the Armenian and Cyrillic script proposals, but because they interact with the LGRs for the Greek and Latin scripts, which are still being developed, it was deemed prudent to delay their integration.
The RZ-LGR provides a specification to mechanically determine valid IDN Top-Level Domains (TLDs). It also determines, for each valid label, the corresponding set of blocked and allocatable variant labels. Additional mechanisms need to be developed to determine which, if any, of the allocatable variant labels generated by the RZ-LGR will be allocated to applicants.
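The idea of mechanically deriving variant labels and their dispositions can be sketched in a few lines of Python. This is a minimal illustration only: the table `TOY_VARIANTS`, the function names, and the simplified rule that any blocked mapping makes the whole variant label blocked are assumptions made for this example and are not taken from RZ-LGR-4, whose actual rules are defined in the normative XML files.

```python
from itertools import product

# Hypothetical toy variant table (NOT taken from RZ-LGR-4).
# Each permitted code point maps to its variant code points and the
# type of that variant mapping.
TOY_VARIANTS = {
    "a": {"b": "allocatable"},
    "b": {"a": "allocatable"},
    "c": {"d": "blocked"},
    "d": {"c": "blocked"},
    "e": {},
}


def is_valid(label):
    """A label is valid here if it is non-empty and every code point is in the table."""
    return len(label) > 0 and all(ch in TOY_VARIANTS for ch in label)


def variant_labels(label):
    """Return {variant_label: disposition} for every variant of a valid label.

    Simplifying assumption: a variant label is "blocked" if any mapping used
    to build it is blocked, otherwise it is "allocatable".
    """
    if not is_valid(label):
        raise ValueError("label contains code points outside the toy repertoire")

    # For each position, the choices are the original code point (type None)
    # followed by each of its variants with the mapping type.
    options = []
    for ch in label:
        options.append([(ch, None)] + sorted(TOY_VARIANTS[ch].items()))

    result = {}
    for combo in product(*options):
        candidate = "".join(cp for cp, _ in combo)
        if candidate == label:
            continue  # the original label is not its own variant
        types = {t for _, t in combo if t is not None}
        result[candidate] = "blocked" if "blocked" in types else "allocatable"
    return result
```

Under these assumptions, `variant_labels("ac")` yields three variant labels, of which only `"bc"` is allocatable. Deciding whether an allocatable variant is actually allocated is outside the scope of the sketch, as it is outside the scope of the RZ-LGR itself.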
The current version of the RZ-LGR will be followed by future versions that support additional scripts and writing systems as proposals from more GPs become available. These future additions must be upwardly compatible. In addition to the panels that have already completed their work, work is underway by the Greek, Japanese, Korean, Latin, and Myanmar panels. GPs for additional scripts, including Thaana and Tibetan, are being formed.
Section II: Background
The Root Zone LGR development procedure requires three steps. Initially, the Integration Panel creates the Maximal Starting Repertoire (MSR) for the GPs to initiate their work. Based on the latest version of the MSR, the community-based GPs organize and develop proposals for the RZ-LGR for their respective scripts or writing systems. After Public Comment, these proposals are submitted to the Integration Panel for evaluation. Finally, the successfully evaluated proposals are integrated into the next version of RZ-LGR.
The current MSR-4 covers the following 28 scripts: Arabic, Armenian, Bengali, Cyrillic, Devanagari, Ethiopic, Georgian, Greek, Gujarati, Gurmukhi, Han, Hangul, Hebrew, Hiragana, Kannada, Katakana, Khmer, Lao, Latin, Malayalam, Myanmar, Oriya, Sinhala, Tamil, Telugu, Thaana, Thai, and Tibetan, and is based on Unicode version 6.3.
Successful development of the RZ-LGR depends on having a community-based GP for each script or writing system. A GP develops an LGR proposal to be used to generate valid TLD labels and their variant labels for the relevant script or writing system. Each proposal contains the valid code points, their variant code points, and Whole Label Evaluation (WLE) rules. In doing so, the GP may need to coordinate efforts with other GPs whenever their repertoires overlap or are closely related. Each proposal is reviewed by the community through the Public Comment process before submission to the IP for further consideration.
The Procedure states that the Integration Panel creates a set of recommended label generation rules that integrates all the approved proposals from the GPs. When the IP has created such a set, it is posted for Public Comment using the prevailing ICANN procedures. At the end of the Public Comment period, the IP receives and reviews the comments to finalize the LGR. The resulting label generation rules become the next version of the RZ-LGR.
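The three parts of a proposal (code points, variants, and WLE rules) can be pictured as a small data structure. The sketch below is illustrative only: the class `ToyProposal`, the three Devanagari code points chosen, and the single rule that a vowel sign must directly follow a consonant are assumptions for this example and do not reproduce any GP's actual proposal.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class ToyProposal:
    """Hypothetical container mirroring the three parts of an LGR proposal."""
    repertoire: Set[str]                                            # valid code points
    variants: Dict[str, Set[str]] = field(default_factory=dict)     # variant code points
    wle_rules: List[Callable[[str], bool]] = field(default_factory=list)

    def accepts(self, label: str) -> bool:
        # Step 1: every code point must be in the repertoire.
        if not label or any(ch not in self.repertoire for ch in label):
            return False
        # Step 2: every whole-label rule must hold for the label as a whole.
        return all(rule(label) for rule in self.wle_rules)


# Toy character classes (an assumption for illustration).
CONSONANTS = {"\u0915", "\u0916"}   # DEVANAGARI LETTER KA, KHA
VOWEL_SIGNS = {"\u093E"}            # DEVANAGARI VOWEL SIGN AA


def vowel_sign_follows_consonant(label: str) -> bool:
    """Toy WLE rule: a vowel sign must be immediately preceded by a consonant."""
    for i, ch in enumerate(label):
        if ch in VOWEL_SIGNS and (i == 0 or label[i - 1] not in CONSONANTS):
            return False
    return True


toy = ToyProposal(
    repertoire=CONSONANTS | VOWEL_SIGNS,
    wle_rules=[vowel_sign_follows_consonant],
)
```

With this setup, `toy.accepts("\u0915\u093E")` passes both the repertoire check and the rule, while `toy.accepts("\u093E\u0915")` fails the rule because the vowel sign begins the label. The point of a WLE rule is that it constrains the label as a whole, which a per-code-point repertoire check alone cannot do.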
Section III: Relevant Resources
The following Root Zone Label Generation Rules version 4 (RZ-LGR-4) files are published for Public Comment.
- Overview and Summary: https://www.icann.org/sites/default/files/lgr/lgr-4-overview-29jun20-en.pdf
- Repertoire Tables, non-CJK: https://www.icann.org/sites/default/files/lgr/lgr-4-non-cjk-29jun20-en.pdf
- Repertoire Tables, Han: https://www.icann.org/sites/default/files/lgr/lgr-4-han-29jun20-en.pdf
XML versions (normative):
- Common: https://www.icann.org/sites/default/files/lgr/lgr-4-common-29jun20-en.xml
- Arabic: https://www.icann.org/sites/default/files/lgr/lgr-4-arabic-script-29jun20-en.xml
- Bangla: https://www.icann.org/sites/default/files/lgr/lgr-4-bengali-script-29jun20-en.xml
- Chinese: https://www.icann.org/sites/default/files/lgr/lgr-4-chinese-script-29jun20-en.xml
- Devanagari: https://www.icann.org/sites/default/files/lgr/lgr-4-devanagari-script-29jun20-en.xml
- Ethiopic: https://www.icann.org/sites/default/files/lgr/lgr-4-ethiopic-script-29jun20-en.xml
- Georgian: https://www.icann.org/sites/default/files/lgr/lgr-4-georgian-script-29jun20-en.xml
- Gujarati: https://www.icann.org/sites/default/files/lgr/lgr-4-gujarati-script-29jun20-en.xml
- Gurmukhi: https://www.icann.org/sites/default/files/lgr/lgr-4-gurmukhi-script-29jun20-en.xml
- Hebrew: https://www.icann.org/sites/default/files/lgr/lgr-4-hebrew-script-29jun20-en.xml
- Kannada: https://www.icann.org/sites/default/files/lgr/lgr-4-kannada-script-29jun20-en.xml
- Khmer: https://www.icann.org/sites/default/files/lgr/lgr-4-khmer-script-29jun20-en.xml
- Lao: https://www.icann.org/sites/default/files/lgr/lgr-4-lao-script-29jun20-en.xml
- Malayalam: https://www.icann.org/sites/default/files/lgr/lgr-4-malayalam-script-29jun20-en.xml
- Oriya: https://www.icann.org/sites/default/files/lgr/lgr-4-oriya-script-29jun20-en.xml
- Sinhala: https://www.icann.org/sites/default/files/lgr/lgr-4-sinhala-script-29jun20-en.xml
- Tamil: https://www.icann.org/sites/default/files/lgr/lgr-4-tamil-script-29jun20-en.xml
- Telugu: https://www.icann.org/sites/default/files/lgr/lgr-4-telugu-script-29jun20-en.xml
- Thai: https://www.icann.org/sites/default/files/lgr/lgr-4-thai-script-29jun20-en.xml
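For readers who want to inspect the XML files programmatically, the following is a minimal sketch of reading code points and variants with Python's standard library. It assumes the files follow the LGR XML format defined in RFC 7940 (namespace `urn:ietf:params:xml:ns:lgr-1.0`, with `char`, `var`, and `range` elements under `data`). The `SAMPLE` document and the helper names are invented for illustration and are not an excerpt from any RZ-LGR-4 file. Rules, actions, and conditional contexts are ignored here.

```python
import xml.etree.ElementTree as ET

# Namespace of the LGR XML format (RFC 7940).
LGR_NS = {"lgr": "urn:ietf:params:xml:ns:lgr-1.0"}

# Hypothetical miniature LGR, NOT taken from any published file.
SAMPLE = """<lgr xmlns="urn:ietf:params:xml:ns:lgr-1.0">
  <data>
    <char cp="0061">
      <var cp="0062" type="blocked"/>
    </char>
    <char cp="0062">
      <var cp="0061" type="blocked"/>
    </char>
    <range first-cp="0063" last-cp="0065"/>
  </data>
</lgr>
"""


def parse_cp(value):
    """Convert a space-separated list of hex code points into a string."""
    return "".join(chr(int(part, 16)) for part in value.split())


def load_repertoire(xml_text):
    """Return {code point or sequence: {variant: type}} from LGR XML text."""
    root = ET.fromstring(xml_text)
    table = {}
    for char in root.iterfind("lgr:data/lgr:char", LGR_NS):
        variants = {
            parse_cp(var.get("cp")): var.get("type")
            for var in char.iterfind("lgr:var", LGR_NS)
        }
        table[parse_cp(char.get("cp"))] = variants
    # Ranges list contiguous code points; in this sketch they carry no variants.
    for rng in root.iterfind("lgr:data/lgr:range", LGR_NS):
        first = int(rng.get("first-cp"), 16)
        last = int(rng.get("last-cp"), 16)
        for cp in range(first, last + 1):
            table.setdefault(chr(cp), {})
    return table


table = load_repertoire(SAMPLE)
```

To try this on a downloaded file, read its contents into a string and pass it to `load_repertoire`. A complete implementation would also need to process the whole-label rules and actions, which this sketch deliberately leaves out.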
HTML versions of the XML files (non-normative, for easier readability):
- Common: https://www.icann.org/sites/default/files/lgr/lgr-4-common-29jun20-en.html
- Arabic: https://www.icann.org/sites/default/files/lgr/lgr-4-arabic-script-29jun20-en.html
- Bangla: https://www.icann.org/sites/default/files/lgr/lgr-4-bengali-script-29jun20-en.html
- Chinese: https://www.icann.org/sites/default/files/lgr/lgr-4-chinese-script-29jun20-en.html
- Devanagari: https://www.icann.org/sites/default/files/lgr/lgr-4-devanagari-script-29jun20-en.html
- Ethiopic: https://www.icann.org/sites/default/files/lgr/lgr-4-ethiopic-script-29jun20-en.html
- Georgian: https://www.icann.org/sites/default/files/lgr/lgr-4-georgian-script-29jun20-en.html
- Gujarati: https://www.icann.org/sites/default/files/lgr/lgr-4-gujarati-script-29jun20-en.html
- Gurmukhi: https://www.icann.org/sites/default/files/lgr/lgr-4-gurmukhi-script-29jun20-en.html
- Hebrew: https://www.icann.org/sites/default/files/lgr/lgr-4-hebrew-script-29jun20-en.html
- Kannada: https://www.icann.org/sites/default/files/lgr/lgr-4-kannada-script-29jun20-en.html
- Khmer: https://www.icann.org/sites/default/files/lgr/lgr-4-khmer-script-29jun20-en.html
- Lao: https://www.icann.org/sites/default/files/lgr/lgr-4-lao-script-29jun20-en.html
- Malayalam: https://www.icann.org/sites/default/files/lgr/lgr-4-malayalam-script-29jun20-en.html
- Oriya: https://www.icann.org/sites/default/files/lgr/lgr-4-oriya-script-29jun20-en.html
- Sinhala: https://www.icann.org/sites/default/files/lgr/lgr-4-sinhala-script-29jun20-en.html
- Tamil: https://www.icann.org/sites/default/files/lgr/lgr-4-tamil-script-29jun20-en.html
- Telugu: https://www.icann.org/sites/default/files/lgr/lgr-4-telugu-script-29jun20-en.html
- Thai: https://www.icann.org/sites/default/files/lgr/lgr-4-thai-script-29jun20-en.html
Section IV: Additional Information
Finalized Proposals for Root Zone Label Generation Ruleset (RZ-LGR) by the Generation Panels: https://www.icann.org/resources/pages/lgr-proposals-2015-12-01-en
Maximal Starting Repertoire version 4: https://www.icann.org/resources/pages/msr-2015-06-21-en
The Procedure: Procedure to Develop and Maintain the Label Generation Rules for the Root Zone in Respect of IDNA Labels
Call for Generation Panels: Call for Generation Panels to develop Root Zone Label Generation Rules
LGR Toolset: https://www.icann.org/resources/pages/lgr-toolset-2015-06-21-en
Report of Public Comments