Maximal Starting Repertoire Version 2 (MSR-2) for the Development of Label Generation Rules for the Root Zone
To support Internationalized Domain Name (IDN) labels in the root zone, the ICANN community, at the direction of the Board, undertook several projects to study the viability and delegation of such labels and to make recommendations. As part of implementing the procedure that resulted, ICANN is pleased to announce that the Integration Panel has released the second version of the Maximal Starting Repertoire (MSR-2). This version is upwardly compatible with MSR-1 and adds six scripts to the repertoire. The MSR is the first deliverable under the Procedure to Develop and Maintain Label Generation Rules (LGR) for the Root Zone in Respect to IDN Labels [PDF, 772 KB] (the Procedure) and the starting point for the work of the community-based Generation Panels in developing their LGR proposals. The LGR for the Root Zone is a mechanism for creating and maintaining rules with respect to IDN labels for the root.
MSR-2 covers the following 28 scripts, of which six (marked with *) are new additions to the MSR: Arabic, Armenian*, Bengali, Cyrillic, Devanagari, Ethiopic*, Georgian, Greek, Gujarati, Gurmukhi, Han, Hangul, Hebrew, Hiragana, Kannada, Katakana, Khmer*, Lao, Latin, Malayalam, Myanmar*, Oriya, Sinhala, Tamil, Telugu, Thaana*, Tibetan* and Thai. MSR-2 contains 33,490 code points short-listed from the 97,973 PVALID/CONTEXT code points of Unicode version 6.3.
This release of MSR-2 sets the stage for the work of the Generation Panels. In addition to selecting their repertoire from within the MSR for developing LGR proposals, Generation Panels will also evaluate whether any of the selected code points are variants and whether any rules are needed to further constrain the labels generated using these code points. The resulting LGR proposals by the Generation Panels will be released for public comment before they are reviewed by the Integration Panel for integration into the Root Zone LGR. If it becomes necessary to stage the release of the LGR, for example because not all Generation Panels are able to submit proposals at the same time, subsequent versions of the LGR may be released.
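The relationship between the MSR and a Generation Panel's repertoire can be illustrated with a minimal sketch. It is not part of the Procedure or of any ICANN tooling: the names `MSR_SAMPLE`, `select_repertoire` and `label_in_repertoire` are hypothetical, and the tiny code point set below is only a stand-in for the real MSR-2 repertoire. The sketch shows the one constraint stated above, namely that a panel's repertoire must be drawn from within the MSR.

```python
# Hypothetical stand-in for the MSR: a small set of permitted code points.
# (The real MSR-2 contains 33,490 code points; this sample is illustrative only.)
MSR_SAMPLE = set(range(0x0061, 0x007B)) | set(range(0x03B1, 0x03CA))


def select_repertoire(candidates, msr):
    """Split candidate code points into those inside and outside the MSR.

    Returns a pair (accepted, rejected). Only the accepted code points
    could form part of a panel's proposed repertoire.
    """
    candidates = set(candidates)
    return candidates & msr, candidates - msr


def label_in_repertoire(label, repertoire):
    """Return True if every code point of the label is in the repertoire."""
    return all(ord(ch) in repertoire for ch in label)


# A panel proposes three code points; U+0041 is not in the sample MSR.
accepted, rejected = select_repertoire([0x0061, 0x0062, 0x0041], MSR_SAMPLE)
```

A repertoire check of this kind is only the first step. Variant relationships and any rules that further constrain labels would be evaluated separately and are not modelled here.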
MSR-2 defers some code points that are already encoded in Unicode 7.0, because authoritative tables for IDNA 2008 are not yet available for Unicode 7.0. Unicode 8.0, due in 2015, is expected to add further code points that are potentially eligible for the Root Zone. In addition, the Integration Panel monitors any scripts not included in the MSR for indications that a change in status is warranted. At a later stage, another version of the MSR will be developed, provided that additional repertoire exists for which inclusion in the MSR is warranted. Until such a later version of the MSR is developed, MSR-2 will be the foundation for any LGR versions developed after its release. All future versions of the MSR and all versions of the LGR must retain full backwards compatibility.
The MSR-2 release consists of the following documents:
- Maximal Starting Repertoire - MSR-2-Overview and Rationale-20150414 [PDF, 727 KB]
- MSR-2-Annotated-Han-Tables-20150413 [PDF, 43.3 MB]
- MSR-2-Annotated-Hangul-Tables-20150413 [PDF, 4.18 MB]
- MSR-2-Annotated-non-CJK-Tables-20150413 [PDF, 2.41 MB]
- MSR-2-Repertoire+WLE-Rules-20150413 [XML, 745 KB]
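For readers who want to process the XML file programmatically, the following is a rough sketch of how code points might be read from an LGR-style XML document. It rests on assumptions that should be checked against the published file: that code points sit in a `data` element as `char` elements with a hexadecimal `cp` attribute and as `range` elements with `first-cp` and `last-cp` attributes, as in the XML format for label generation rulesets. The inline `SAMPLE` document, its namespace, and the function names are illustrative and are not taken from the MSR-2 file.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; the layout and namespace are assumptions,
# not an excerpt from the actual MSR-2 file.
SAMPLE = (
    '<lgr xmlns="urn:ietf:params:xml:ns:lgr-1.0">'
    "<data>"
    '<char cp="0061"/>'
    '<range first-cp="0062" last-cp="0064"/>'
    "</data>"
    "</lgr>"
)


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def load_code_points(xml_text):
    """Collect single code points listed directly under the <data> element."""
    root = ET.fromstring(xml_text)
    code_points = set()
    for section in root:
        if local_name(section.tag) != "data":
            continue
        for el in section:
            name = local_name(el.tag)
            if name == "char":
                parts = el.get("cp", "").split()
                # A cp attribute with several values denotes a sequence;
                # this sketch only collects single code points.
                if len(parts) == 1:
                    code_points.add(int(parts[0], 16))
            elif name == "range":
                first = int(el.get("first-cp"), 16)
                last = int(el.get("last-cp"), 16)
                code_points.update(range(first, last + 1))
    return code_points


repertoire = load_code_points(SAMPLE)
```

Matching on local element names rather than a fixed namespace keeps the sketch tolerant of namespace differences between format drafts. Whole-label evaluation rules are ignored entirely.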