Maximal Starting Repertoire Version 3 (MSR-3) for the Development of Label Generation Rules for the Root Zone

LOS ANGELES – 29 March 2018 – The Internet Corporation for Assigned Names and Numbers (ICANN) today announced the release of the third version of the Maximal Starting Repertoire (MSR-3). This version is upwardly compatible with MSR-2 and adds three code points each to the repertoires of the Han and Latin scripts. Under the Procedure to Develop and Maintain Label Generation Rules for the Root Zone with Respect to IDN Labels [PDF, 72 KB], the MSR is the starting point for the work of the community-based Generation Panels, which are developing proposals for their respective scripts for the Root Zone Label Generation Rules (RZ-LGR). The contents of MSR-3 and the detailed rationale behind its development are described in MSR-3-Overview and Rationale [PDF, 242 KB]. The RZ-LGR is a mechanism for determining valid IDN top-level domain labels and their variant labels.

MSR-3 covers the following 28 scripts: Arabic, Armenian, Bengali, Cyrillic, Devanagari, Ethiopic, Georgian, Greek, Gujarati, Gurmukhi, Han, Hangul, Hebrew, Hiragana, Kannada, Katakana, Khmer, Lao, Latin, Malayalam, Myanmar, Oriya, Sinhala, Tamil, Telugu, Thaana, Tibetan, and Thai. For these scripts, MSR-3 contains 33,496 code points short-listed from 97,973 PVALID/CONTEXT code points of Unicode version 6.3.

In addition to selecting their repertoire from within the MSR when developing RZ-LGR proposals, Generation Panels will also evaluate whether any of these code points are variant code points and whether any rules are needed to further constrain the labels generated using these code points. The resulting RZ-LGR proposals by the Generation Panels will be released for public comment before they are reviewed by the Integration Panel for integration into the RZ-LGR.
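The relationship between a repertoire, variant code points, and the labels they generate can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `TOY_REPERTOIRE`, `TOY_VARIANTS`, `is_in_repertoire`, and `variant_labels` are hypothetical, and the code points and variant pair are toy data, not taken from MSR-3 or from any RZ-LGR proposal. Real label generation rules also include whole-label rules that this sketch does not model.

```python
from itertools import product

# Toy data for illustration only: NOT taken from MSR-3 or any RZ-LGR.
TOY_REPERTOIRE = {"a", "b", "o", "ö"}
# Hypothetical symmetric variant relation between two code points.
TOY_VARIANTS = {"o": {"ö"}, "ö": {"o"}}


def is_in_repertoire(label: str) -> bool:
    """Return True if every code point of the label is in the toy repertoire."""
    return all(ch in TOY_REPERTOIRE for ch in label)


def variant_labels(label: str) -> set[str]:
    """Return all labels reachable by swapping code points for their variants."""
    # For each position, collect the original code point plus its variants.
    choices = [sorted({ch} | TOY_VARIANTS.get(ch, set())) for ch in label]
    # Combine the per-position choices and drop the original label itself.
    return {"".join(combo) for combo in product(*choices)} - {label}
```

Under these toy definitions, `is_in_repertoire("bob")` holds, a label containing a code point outside the set is rejected, and `variant_labels("bob")` yields the single variant label `böb`.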

MSR-3 defers code points that are encoded only in releases of Unicode later than version 6.3. In addition, the Integration Panel monitors any scripts not included in the MSR for indications that a change in the MSR is warranted. Until such a change is made, MSR-3 will be the foundation for any RZ-LGR versions developed. All future versions of the MSR and all versions of the RZ-LGR must retain full backwards compatibility.

About ICANN

ICANN's mission is to help ensure a stable, secure, and unified global Internet. To reach another person on the Internet, you need to type an address – a name or a number – into your computer or other device. That address must be unique so computers know where to find each other. ICANN helps coordinate and support these unique identifiers across the world. ICANN was formed in 1998 as a not-for-profit public-benefit corporation with a community of participants from all over the world.


Internationalized Domain Name (IDN)

IDNs are domain names that include characters used in the local representation of languages that are not written with the twenty-six letters of the basic Latin alphabet "a-z". An IDN can contain Latin letters with diacritical marks, as required by many European languages, or may consist of characters from non-Latin scripts such as Arabic or Chinese. Many languages also use other types of digits than the European "0-9". The basic Latin alphabet together with the European-Arabic digits are, for the purpose of domain names, termed "ASCII characters" (ASCII = American Standard Code for Information Interchange). These are also included in the broader range of "Unicode characters" that provides the basis for IDNs.

The "hostname rule" requires that all domain names of the type under consideration here are stored in the DNS using only the ASCII characters listed above, with the one further addition of the hyphen "-". The Unicode form of an IDN therefore requires special encoding before it is entered into the DNS.

The following terminology is used when distinguishing between these forms: A domain name consists of a series of "labels" (separated by "dots"). The ASCII form of an IDN label is termed an "A-label". All operations defined in the DNS protocol use A-labels exclusively. The Unicode form, which a user expects to be displayed, is termed a "U-label". The difference may be illustrated with the Hindi word for "test", परीका, appearing here as a U-label would (in the Devanagari script). A special form of "ASCII compatible encoding" (abbreviated ACE) is applied to this to produce the corresponding A-label: xn--11b5bs1di. A domain name that only includes ASCII letters, digits, and hyphens is termed an "LDH label". Although the definitions of A-labels and LDH labels overlap, a name consisting exclusively of LDH labels, such as "icann.org", is not an IDN.
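The conversion between a U-label and an A-label described above can be sketched in Python using the standard library's `punycode` codec. This is a minimal sketch under stated assumptions: the helper names `to_a_label` and `to_u_label` are hypothetical, and the code applies only the ASCII compatible encoding and the "xn--" prefix. It performs none of the validity checks, normalization, or case handling that a full IDNA implementation requires.

```python
ACE_PREFIX = "xn--"  # prefix that marks an A-label


def to_a_label(u_label: str) -> str:
    """Encode a single U-label as an A-label (no IDNA validation)."""
    if u_label.isascii():
        # Already an LDH-style label; nothing to encode.
        return u_label
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Decode a single A-label back to its Unicode form (no IDNA validation)."""
    if not a_label.lower().startswith(ACE_PREFIX):
        # Not an A-label; return it unchanged.
        return a_label
    return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
```

For example, `to_a_label("bücher")` returns `xn--bcher-kva`, and `to_u_label` reverses the conversion. A purely ASCII label such as `icann` passes through unchanged, which matches the statement above that a name made only of LDH labels is not an IDN.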