This document specifies a reference set of Label Generation Rules for Belarusian using a limited repertoire as appropriate for a second level domain.
All references converge on 32 Cyrillic code points (23 + 9 as defined by RFC 5992 [130]). One source [302] lists U+0449 CYRILLIC SMALL LETTER SHCHA and U+044A CYRILLIC SMALL LETTER HARD SIGN as rare in Belarusian; these appear to be used in Russian words or names, not in Belarusian words and names themselves. In Russian, the hard sign indicates the non-palatalization of a consonant preceding a morpheme beginning with an iotated vowel; in Belarusian this function is met by the use of an apostrophe or of U+02BC MODIFIER LETTER APOSTROPHE.
Note that, while U+02BC MODIFIER LETTER APOSTROPHE is protocol valid (PVALID) in IDNA2008, other forms of apostrophe such as U+0027 APOSTROPHE or U+2019 RIGHT SINGLE QUOTATION MARK are DISALLOWED. As [RFC6912] points out, in a public zone, many users may read U+02BC as indistinguishable from the regular apostrophe. Therefore, following the principle of conservatism, and in response to a comment made by the IAB during public comments, the code point U+02BC is not included here.
There is an IDN table published in the IANA Repository of IDN Practices for Belarus, not under '.by' (the Belarus ccTLD) but under a new TLD, .бел, also administered by Belarus; see [700]. The following text is excerpted from clause 7 of its General Provisions:
"A domain name in domain ".бел" must contain not less than two and no more than sixty three letters of Belarusian or Russian alphabet, numbers, symbols, "hyphen" ("-") and "apostrophe" (" ' "), it must not begin (end) with the symbol "hyphen" ("-") and (or) "apostrophe" (" ' "). When choosing a domain name one should consider writing of the domains on Russian and Belarusian."
This apparently allows a combination of both Belarusian and Russian and, contrary to the position taken here, it also supports U+02BC, albeit with a rule restricting the position of the apostrophe within a label: neither leading nor trailing.
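The quoted clause can be read as a simple label check. The following is a minimal sketch of that reading, not the registry's actual implementation: the function name `satisfies_clause_7` is hypothetical, the length limit is assumed to count characters of the Unicode label, and the "apostrophe" is assumed to be U+02BC, as interpreted above.

```python
# Hypothetical sketch of clause 7 as quoted above; all names are illustrative.
BELARUSIAN = set("абвгдеёжзійклмнопрстуўфхцчшыьэюя")   # 32 letters
RUSSIAN = set("абвгдеёжзийклмнопрстуфхцчшщъыьэюя")     # 33 letters
DIGITS = set("0123456789")
# Assumption: the clause's "apostrophe" is U+02BC MODIFIER LETTER APOSTROPHE.
EDGE_RESTRICTED = {"-", "\u02bc"}
ALLOWED = BELARUSIAN | RUSSIAN | DIGITS | EDGE_RESTRICTED


def satisfies_clause_7(label: str) -> bool:
    """Return True if the label fits the quoted length, repertoire and edge rules."""
    # Assumption: the limit of 2 to 63 counts characters, not encoded octets.
    if not 2 <= len(label) <= 63:
        return False
    # Hyphen and apostrophe may not begin or end the label.
    if label[0] in EDGE_RESTRICTED or label[-1] in EDGE_RESTRICTED:
        return False
    return all(ch in ALLOWED for ch in label)
```

Under these assumptions a label such as "аб\u02bcём" passes, while "-бел" and "бел\u02bc" fail the edge rule. The LGR specified here is stricter: it excludes U+02BC entirely.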
The acute accent may also be used as a stress mark on the vowel of a syllable or to disambiguate between minimal pairs. However, this is rare. See Stress and Disambiguation in [ACUTE].
There is some attested use of U+0438 CYRILLIC SMALL LETTER I, U+0449 CYRILLIC SMALL LETTER SHCHA, U+044A CYRILLIC SMALL LETTER HARD SIGN, and U+0491 CYRILLIC SMALL LETTER GHE WITH UPTURN, but they seem to be used only as letters borrowed from Russian and/or Ukrainian. These four letters are part of the extended set.
Letters and sequences that appear in some references but are not included:
U+02BC MODIFIER LETTER APOSTROPHE
U+0430 U+0301 CYRILLIC SMALL LETTER A WITH ACUTE ACCENT
U+0435 U+0301 CYRILLIC SMALL LETTER IE WITH ACUTE ACCENT
U+043E U+0301 CYRILLIC SMALL LETTER O WITH ACUTE ACCENT
U+0443 U+0301 CYRILLIC SMALL LETTER U WITH ACUTE ACCENT
U+044B U+0301 CYRILLIC SMALL LETTER YERU WITH ACUTE ACCENT
U+044D U+0301 CYRILLIC SMALL LETTER E WITH ACUTE ACCENT
U+044E U+0301 CYRILLIC SMALL LETTER YU WITH ACUTE ACCENT
U+044F U+0301 CYRILLIC SMALL LETTER YA WITH ACUTE ACCENT
U+0451 U+0301 CYRILLIC SMALL LETTER IO WITH ACUTE ACCENT
U+0456 U+0301 CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I WITH ACUTE ACCENT
A number of letters not considered essential to writing the core vocabulary of the language are nevertheless in common use. Where they have not been added to the core repertoire, they are flagged as "extended-cp" in the table of code points. A context rule is provided that by default will prohibit labels with extended code points. To support extended single code points or code point sequences, delete the context "extended-cp" from their repertoire definition.
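The effect of the "extended-cp" context can be illustrated with a small sketch. This is not how an LGR processor evaluates context rules; it only mimics the outcome. The names `CORE`, `EXTENDED` and `in_repertoire` are hypothetical, and the inclusion of digits and hyphen alongside the 32 core letters is an assumption about the repertoire.

```python
# Illustrative only: mimics the default prohibition of extended code points.
# Assumption: the core repertoire is the 32 letters plus digits and hyphen.
CORE = set("абвгдеёжзійклмнопрстуўфхцчшыьэюя") | set("0123456789-")
# The four extended letters named in this document:
# U+0438 (и), U+0449 (щ), U+044A (ъ), U+0491 (ґ).
EXTENDED = {"\u0438", "\u0449", "\u044a", "\u0491"}


def in_repertoire(label: str, allow_extended: bool = False) -> bool:
    """Check each code point against the core set, optionally with the extended set."""
    allowed = (CORE | EXTENDED) if allow_extended else CORE
    return all(ch in allowed for ch in label)
```

With the default setting, "мінск" is accepted and "щука" is rejected; passing `allow_extended=True` corresponds to deleting the "extended-cp" context and accepts both.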
None.
This LGR defines no named character classes.
Common rules:
Hyphen Restrictions: restrictions on the allowable placement of hyphens (no leading or trailing hyphen and no "--" in positions 3 and 4). These restrictions are described in section 4.2.3.1 of RFC5891 [120]. They are implemented here as a context rule on U+002D (-) HYPHEN-MINUS.
Leading Combining Marks: restrictions on the allowable placement of combining marks (no leading combining mark). This rule is described in section 4.2.3.2 of RFC5891 [120].
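As a rough illustration of the two common rules, the following sketch checks a label directly rather than through LGR context rules. The function names are hypothetical, and "combining mark" is taken to mean any code point whose Unicode general category begins with M.

```python
import unicodedata


def hyphen_placement_ok(label: str) -> bool:
    """No leading or trailing hyphen, and no "--" in positions 3 and 4."""
    if label.startswith("-") or label.endswith("-"):
        return False
    # Positions 3 and 4 are indices 2 and 3 when counting from zero.
    return label[2:4] != "--"


def no_leading_combining_mark(label: str) -> bool:
    """The first code point must not have general category Mn, Mc or Me."""
    return not label or not unicodedata.category(label[0]).startswith("M")
```

For example, "ab--cd" fails the hyphen check, and a label starting with U+0301 COMBINING ACUTE ACCENT fails the combining mark check.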
Actions included are the default actions for LGRs as well as those needed to invalidate labels with misplaced combining marks and apostrophe modifiers.
This reference LGR for Belarusian for the 2nd Level has been developed by Michel Suignard and Asmus Freytag, verified in expert reviews by Michael Everson, Nicholas Ostler, and Wil Tan, and based on multiple open public consultations.
General references for the language:
Mayo, Peter. 1993. "Belorussian", in Bernard Comrie & Greville G. Corbett, eds. The Slavonic languages. London; New York: Routledge. ISBN 0-415-04755-2
Ushkevich, Alexander, & Alexandra Zezulin. 1992. Byelorussian-English English-Byelorussian dictionary with complete phonetics. New York: Hippocrene Books. ISBN 0-87052-114-4
Other references cited in this document:
In the listing of the repertoire by code point, references starting from [0] refer to the version of the Unicode Standard in which the corresponding code point was initially encoded. Other references (starting from [100]) document usage of code points. For more details, see the Table of References below.
]]>