ICANN is releasing for public comment version 3 of the Maximal Starting Repertoire (MSR-3: HTML, XML). This version is upwardly compatible with MSR-2 and adds three code points each to the repertoires of the Han and Latin scripts. Under the Procedure to Develop and Maintain Label Generation Rules for the Root Zone with Respect to IDN Labels [PDF, 772 KB], the MSR is the starting point for the work of the community-based Generation Panels that are developing proposals for their respective scripts for the Root Zone Label Generation Rules (RZ-LGR). The contents of MSR-3 and the detailed rationale behind its development are described in MSR-3-Overview and Rationale [PDF, 1.1 MB].
The Generation Panels currently use MSR-2, which covers 28 scripts: Arabic, Armenian, Bengali, Cyrillic, Devanagari, Ethiopic, Georgian, Greek, Gujarati, Gurmukhi, Han, Hangul, Hebrew, Hiragana, Kannada, Katakana, Khmer, Lao, Latin, Malayalam, Myanmar, Oriya, Sinhala, Tamil, Telugu, Thaana, Tibetan and Thai. MSR-2 contains 33,490 code points short-listed from 97,973 PVALID/CONTEXT code points of Unicode version 6.3.
MSR-3 will cover the same scripts. The Integration Panel will finalize the code point repertoire for MSR-3 based on the feedback received from the community. After the release of MSR-3, Generation Panels that are developing their RZ-LGR proposals will be able to use the updated contents as a starting point for their analysis.
Section I: Description and Explanation
The MSR is a subset of IDNA 2008 PVALID code points for Unicode 6.3 (the latest version of the Unicode Standard for which IANA provides IDNA 2008 tables). It was created by following the prescriptions of the Procedure to Develop and Maintain Label Generation Rules for the Root Zone with Respect to IDN Labels [PDF, 772 KB] (the Procedure) to eliminate code points not eligible for the root zone. The MSR is a deliverable from the Integration Panel under the Procedure and serves as a starting collection of code points from which Generation Panels may make a selection in constructing the repertoire for their respective LGR proposals. In accordance with the Procedure, "Generation panels must not include in their proposed repertoires any assigned code point that is not included in the maximal set of code points for the root zone defined by the integration panel."
The mere presence of a code point in the MSR does not indicate that the Integration Panel considers it acceptable for inclusion in the RZ-LGR. Where the Integration Panel was not able to resolve the status of a code point, it has tended to retain it in the MSR, with the aim of allowing Generation Panels to perform a more thorough review and, where appropriate, to present a justification for the inclusion of such code points in the LGR.
In contrast, the absence of a code point affirms that the Integration Panel has determined that the code point is not appropriate for the DNS root, or, in certain situations, the panel has decided to defer it to a future version of the MSR.
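The constraint quoted above, that a Generation Panel's proposed repertoire may not contain a code point absent from the MSR, amounts to a subset check. The following is a minimal sketch of that check, not a tool used by any panel. The function name and the toy data are assumptions made for illustration; the small set below is only a stand-in and does not reflect the actual contents of MSR-2 or MSR-3.

```python
def code_points_outside_msr(proposed, msr):
    """Return, in sorted order, the proposed code points that are not in the MSR."""
    return sorted(set(proposed) - set(msr))


# Toy data only: a tiny stand-in for the MSR, not the real repertoire.
toy_msr = {0x0061, 0x0062, 0x0063, 0x00E9}
# Hypothetical proposal containing one code point that the toy MSR lacks.
toy_proposal = {0x0061, 0x00E9, 0x0040}

violations = code_points_outside_msr(toy_proposal, toy_msr)
print([f"U+{cp:04X}" for cp in violations])  # prints ['U+0040']
```

An empty result means the proposal stays within the stand-in repertoire. A non-empty result lists the code points that would have to be dropped from the proposal, since under the Procedure they are not available for selection.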
Section II: Background
To support IDN variants in the root zone, the ICANN community, at the direction of the Board, undertook several projects to study and make recommendations on their viability, sustainability and delegation. One of these projects is the implementation of the Procedure [PDF, 772 KB], which allows for the development of the RZ-LGR. The RZ-LGR is a mechanism for creating and maintaining rules with respect to IDN labels for the root zone. This mechanism will be used to determine which Unicode code points are permitted for use in U-Labels for the root zone, what their variant code points are (if any), and whether there are any additional label-level constraints.
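The three things the RZ-LGR is said to determine (permitted code points, their variants, and label-level constraints) can be pictured as three pieces of data consulted when a label is examined. The sketch below is a simplified illustration under stated assumptions, not the RZ-LGR format or its evaluation procedure. The class `ToyLgr`, its fields, the variant pairing and the length limit are all hypothetical and are not taken from any actual LGR.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLgr:
    # Code points permitted in a label (hypothetical stand-in data).
    repertoire: set
    # Map from a code point to its variant code points (illustrative only).
    variants: dict = field(default_factory=dict)
    # A made-up label-level constraint: maximum number of code points.
    max_code_points: int = 5

    def check_label(self, label):
        """Return a list of problems; an empty list means the toy checks pass."""
        problems = []
        for ch in label:
            if ord(ch) not in self.repertoire:
                problems.append(f"U+{ord(ch):04X} not in repertoire")
        if len(label) > self.max_code_points:
            problems.append("label exceeds max_code_points")
        return problems

    def variant_code_points(self, label):
        """Return the variant sets for those code points in the label that have any."""
        return {ord(ch): self.variants[ord(ch)]
                for ch in label if ord(ch) in self.variants}


# Placeholder values chosen only to exercise the code paths.
toy_lgr = ToyLgr(repertoire={0x61, 0x62, 0x63}, variants={0x61: {0x62}})

print(toy_lgr.check_label("abc"))          # prints []
print(toy_lgr.check_label("abd"))          # prints ['U+0064 not in repertoire']
print(toy_lgr.variant_code_points("cab"))  # prints {97: {98}}
```

Keeping the repertoire, the variant mapping and the label-level rule as separate fields mirrors the three questions named in the paragraph above, so each can be inspected on its own.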