Root Zone LGR for script und-Laoo (Lao) | lgr-3-Lao-Script-25apr19-en |
---|---|
This document is mechanically formatted from the XML file for the LGR. It provides additional summary data and explanatory text. The XML file remains the sole normative specification of the LGR.
Date | 2019-04-25 |
---|---|
LGR Version | 3 |
Language | und-Laoo |
Scope | domain: "." (Root) |
Unicode Version | 6.3.0 |
This file contains Label Generation Rules (LGR) for the Lao script for the Root zone. For more details on this LGR and its development, see "Proposal for a Lao Script Root Zone LGR [Proposal]". The format of this file follows [RFC 7940].
In addition to the 51 code points according to Section 5 “Repertoire” in [Proposal], the sequence 0EB2 0EB0 has been defined to facilitate implementation of WLE rule follows-vbefore-consonant-cluster as a context rule. The repertoire only includes code points used by languages that are actively written in the Lao script. The repertoire is based on [MSR-4], which is a subset of [Unicode 6.3].
Each code point or range is tagged with the script or scripts that the code point is used with, and with one or more references documenting sufficient justification for inclusion in the repertoire; see "References" below.
According to Section 6, "Variants" in [Proposal], this LGR defines no variants.
Some consonants have been given the tag of Cf, which indicates final consonants. Other character classes that have been used are semi-consonant, tone-mark, vowel-above, vowel-before, vowel-below and vowel-after. See Section 5 of the [Proposal].
The LGR includes the set of required default WLE rules and actions applicable to the Root Zone and defined in [MSR-4]. They are marked with ⍟. The default prohibition on leading combining marks is equivalent to ensuring that a label only starts with a consonant or vowel-before.
Rules provided in the LGR as described in Section 7 of [Proposal] reasonably restrict labels so that they conform to Lao syllable structure. These constraints are presented exclusively as context rules.
The rules are:
No context rules apply to “consonant” code points. For discussion, see Section 5.1 “Consonants” in [Proposal].
For methodology and contributors, see Sections 4 and 8 of [Proposal].
The following general references are cited in this document:
For references consulted particularly in designing the repertoire for the Lao script for the Root Zone, please see the Table of References below. Reference [0] refers to the Unicode Standard version in which the corresponding code points were initially encoded. References [201], [202], [203], [204], [205], [206], and [207] correspond to sources justifying the inclusion of or classification for the corresponding code points. Single code points or ranges may have multiple source reference values.
Number of elements in Repertoire | 52 |
---|---|
Number of code points | 51 |
Number of sequences | 1 |
Longest code point sequence | 2 |
The following table lists the repertoire by code point (or code point sequence). The data in the Script and Name columns are extracted from the Unicode character database. Where a comment in the original LGR is equal to the character name, it has been suppressed.
See also the legend provided below the table.
Code Point | Glyph | Script | Name | Ref | Tags | Required Context | Comment |
---|---|---|---|---|---|---|---|
U+0E81 | ກ | Lao | LAO LETTER KO | [0], [201], [204] | Cf, consonant | | Lao |
U+0E82 | ຂ | Lao | LAO LETTER KHO SUNG | [0], [201], [204] | consonant | | Lao |
U+0E84 | ຄ | Lao | LAO LETTER KHO TAM | [0], [201], [204] | consonant | | Lao |
U+0E87 | ງ | Lao | LAO LETTER NGO | [0], [201], [204] | Cf, consonant | | Lao |
U+0E88 | ຈ | Lao | LAO LETTER CO | [0], [201], [204] | consonant | | Lao |
U+0E8A | ຊ | Lao | LAO LETTER SO TAM | [0], [201], [204] | Cf, consonant | | Lao |
U+0E8D | ຍ | Lao | LAO LETTER NYO | [0], [201], [204] | Cf, consonant | | Lao |
U+0E94 | ດ | Lao | LAO LETTER DO | [0], [201], [204] | Cf, consonant | | Lao |
U+0E95 | ຕ | Lao | LAO LETTER TO | [0], [201], [204] | consonant | | Lao |
U+0E96 | ຖ | Lao | LAO LETTER THO SUNG | [0], [201], [204] | consonant | | Lao |
U+0E97 | ທ | Lao | LAO LETTER THO TAM | [0], [201], [204] | Cf, consonant | | Lao |
U+0E99 | ນ | Lao | LAO LETTER NO | [0], [201], [204] | Cf, consonant | | Lao |
U+0E9A | ບ | Lao | LAO LETTER BO | [0], [201], [204] | Cf, consonant | | Lao |
U+0E9B | ປ | Lao | LAO LETTER PO | [0], [201], [204] | consonant | | Lao |
U+0E9C | ຜ | Lao | LAO LETTER PHO SUNG | [0], [201], [204] | consonant | | Lao |
U+0E9D | ຝ | Lao | LAO LETTER FO FON | [0], [201], [204] | consonant | | = lao letter fo sung; Lao |
U+0E9E | ພ | Lao | LAO LETTER PHO TAM | [0], [201], [204] | consonant | | Lao |
U+0E9F | ຟ | Lao | LAO LETTER FO FAY | [0], [201], [204] | Cf, consonant | | = lao letter fo tam; Lao |
U+0EA1 | ມ | Lao | LAO LETTER MO | [0], [201], [204] | Cf, consonant | | Lao |
U+0EA2 | ຢ | Lao | LAO LETTER YO | [0], [201], [204] | consonant | | Lao |
U+0EA3 | ຣ | Lao | LAO LETTER RO | [0], [204] | Cf, consonant | | = lao letter lo rada; Lao |
U+0EA5 | ລ | Lao | LAO LETTER LO | [0], [201], [204] | Cf, consonant | | = lao letter lo ling; Lao |
U+0EA7 | ວ | Lao | LAO LETTER WO | [0], [201], [204], [205] | Cf, consonant | | Lao |
U+0EAA | ສ | Lao | LAO LETTER SO SUNG | [0], [201], [204] | Cf, consonant | | Lao |
U+0EAB | ຫ | Lao | LAO LETTER HO SUNG | [0], [201], [204] | consonant | | Lao |
U+0EAD | ອ | Lao | LAO LETTER O | [0], [201], [204], [205] | consonant | | Lao |
U+0EAE | ຮ | Lao | LAO LETTER HO TAM | [0], [201], [204] | consonant | | Lao |
U+0EB0 | ະ | Lao | LAO VOWEL SIGN A | [0], [201], [205], [206] | vowel-after | follows-C-tonemark-vabove | Lao |
U+0EB1 | ັ | Lao | LAO VOWEL SIGN MAI KAN | [0], [201], [205], [206] | vowel-above | follows-main-consonant | Lao |
U+0EB2 | າ | Lao | LAO VOWEL SIGN AA | [0], [201], [205], [206] | vowel-after | follows-C-tonemark-vabove | Lao |
U+0EB2 U+0EB0 | າະ | [Lao] | LAO VOWEL SIGN AA + LAO VOWEL SIGN A | [205] | | follows-vbefore-consonant-cluster | Lao |
U+0EB4 | ິ | Lao | LAO VOWEL SIGN I | [0], [201], [205], [206] | vowel-above | follows-main-consonant | Lao |
U+0EB5 | ີ | Lao | LAO VOWEL SIGN II | [0], [201], [205], [206] | vowel-above | follows-main-consonant | Lao |
U+0EB6 | ຶ | Lao | LAO VOWEL SIGN Y | [0], [201], [205], [206] | vowel-above | follows-main-consonant | Lao |
U+0EB7 | ື | Lao | LAO VOWEL SIGN YY | [0], [201], [205], [206] | vowel-above | follows-main-consonant | Lao |
U+0EB8 | ຸ | Lao | LAO VOWEL SIGN U | [0], [201], [205], [206] | vowel-below | follows-main-consonant | Lao |
U+0EB9 | ູ | Lao | LAO VOWEL SIGN UU | [0], [201], [205], [206] | vowel-below | follows-main-consonant | Lao |
U+0EBB | ົ | Lao | LAO VOWEL SIGN MAI KON | [0], [205] | vowel-above | follows-main-consonant | Lao |
U+0EBC | ຼ | Lao | LAO SEMIVOWEL SIGN LO | [0], [201], [205], [206] | semi-consonant | follows-consonant | = lao semiconsonant lo; Lao |
U+0EBD | ຽ | Lao | LAO SEMIVOWEL SIGN NYO | [0], [201], [205] | vowel-after | follows-C-tonemark-vabove | = lao semivowel ia; Lao |
U+0EC0 | ເ | Lao | LAO VOWEL SIGN E | [0], [201], [205], [206] | vowel-before | precedes-consonant | Lao |
U+0EC1 | ແ | Lao | LAO VOWEL SIGN EI | [0], [201], [205], [206] | vowel-before | precedes-consonant | Lao |
U+0EC2 | ໂ | Lao | LAO VOWEL SIGN O | [0], [201], [205], [206] | vowel-before | precedes-consonant | Lao |
U+0EC3 | ໃ | Lao | LAO VOWEL SIGN AY | [0], [201], [205], [206] | vowel-before | precedes-consonant | Lao |
U+0EC4 | ໄ | Lao | LAO VOWEL SIGN AI | [0], [201], [205], [206] | vowel-before | precedes-consonant | Lao |
U+0EC6 | ໆ | Lao | LAO KO LA | [0], [203] | sign | repetition-mark-limit | = lao may sam; Lao |
U+0EC8 | ່ | Lao | LAO TONE MAI EK | [0], [202] | tone-mark | follows-C-vabove-vbelow | Lao |
U+0EC9 | ້ | Lao | LAO TONE MAI THO | [0], [202] | tone-mark | follows-C-vabove-vbelow | Lao |
U+0ECA | ໊ | Lao | LAO TONE MAI TI | [0], [202] | tone-mark | follows-C-vabove-vbelow | Lao |
U+0ECB | ໋ | Lao | LAO TONE MAI CATAWA | [0], [202] | tone-mark | follows-C-vabove-vbelow | = lao tone mai jattawa; Lao |
U+0ECC | ໌ | Lao | LAO CANCELLATION MARK | [0], [207] | sign | follows-Cf | = lao mark mai ka lan; Lao |
U+0ECD | ໍ | Lao | LAO NIGGAHITA | [0], [201], [205], [206] | vowel-above | follows-main-consonant | = lao vowel sign or; Lao |
Legend
Throughout this LGR, a code point sequence may be annotated with a string in ALL CAPS that is constructed on the same principle as a name for a Unicode Named Sequence. No claim is made that a sequence thus annotated is in fact a named sequence, nor that the annotation in such case actually corresponds to the formal name of a named sequence.
This LGR does not specify any variants.
The following table lists all named and implicit classes with their definition and a list of their members intersected with the current repertoire (for larger classes, this list is elided).
Name | Definition | Count | Members or Ranges | Ref | Comment |
---|---|---|---|---|---|
Cf | Tag=Cf | 14 | {0E81 0E87 0E8A 0E8D 0E94 0E97 0E99-0E9A 0E9F 0EA1 0EA3 0EA5 0EA7 0EAA} | | Any Lao final consonant |
consonant | Tag=consonant | 27 | {0E81-0E82 0E84 0E87-0E88 0E8A 0E8D 0E94-0E97 0E99-0E9F 0EA1-0EA3 0EA5 0EA7 0EAA-0EAB 0EAD-0EAE} | | Any Lao consonant |
semi-consonant | Tag=semi-consonant | 1 | {0EBC} | | Lao semi-consonant LO |
tone-mark | Tag=tone-mark | 4 | {0EC8-0ECB} | | Any Lao tone mark |
vowel-above | Tag=vowel-above | 7 | {0EB1 0EB4-0EB7 0EBB 0ECD} | | Any Lao vowel above |
vowel-below | Tag=vowel-below | 2 | {0EB8-0EB9} | | Any Lao vowel below |
implicit | Tag=sign | 2 | {0EC6 0ECC} | | Any character tagged as sign |
implicit | Tag=vowel-after | 3 | {0EB0 0EB2 0EBD} | | Any character tagged as vowel-after |
implicit | Tag=vowel-before | 5 | {0EC0-0EC4} | | Any character tagged as vowel-before |
implicit | Tag=sc:Laoo | 51 | {0E81-0E82 0E84 0E87-0E88 0E8A 0E8D 0E94-0E97 0E99-0E9F 0EA1-0EA3 0EA5 0EA7 0EAA-0EAB 0EAD-0EAE 0EB0-0EB2 0EB4-0EB9 0EBB-0EBD 0EC0-0EC4 0EC6 0EC8-0ECD} | | Any character tagged as Lao |
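The member lists in the class table use a compact notation of hexadecimal code points and ranges. The following Python sketch shows one way to expand that notation and cross-check the Count column. The helper name `expand` and the `CLASSES` dictionary are illustrative only and are not part of the LGR; the XML file remains the normative source.

```python
def expand(spec: str) -> set:
    """Expand a member list such as '{0E81 0E99-0E9A}' into a set of code points."""
    members = set()
    for token in spec.strip("{} ").split():
        first, _, last = token.partition("-")
        # A token without '-' is a single code point; otherwise it is an inclusive range.
        members.update(range(int(first, 16), int(last or first, 16) + 1))
    return members


# Member lists copied from the class table above (hypothetical container name).
CLASSES = {
    "Cf": "{0E81 0E87 0E8A 0E8D 0E94 0E97 0E99-0E9A 0E9F 0EA1 0EA3 0EA5 0EA7 0EAA}",
    "consonant": "{0E81-0E82 0E84 0E87-0E88 0E8A 0E8D 0E94-0E97 0E99-0E9F "
                 "0EA1-0EA3 0EA5 0EA7 0EAA-0EAB 0EAD-0EAE}",
    "semi-consonant": "{0EBC}",
    "tone-mark": "{0EC8-0ECB}",
    "vowel-above": "{0EB1 0EB4-0EB7 0EBB 0ECD}",
    "vowel-below": "{0EB8-0EB9}",
}

# Number of members per class, for comparison with the Count column.
sizes = {name: len(expand(spec)) for name, spec in CLASSES.items()}

# Every final consonant is also tagged as a consonant.
cf_is_subset = expand(CLASSES["Cf"]) <= expand(CLASSES["consonant"])
```

Comparing `sizes` against the Count column is a quick consistency check when the table is transcribed by hand.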
The following table lists all named rules defined in the LGR and indicates whether they are used as trigger in an action or as context (when or not-when) for a code point or variant.
Name | Regular Expression | Used as Trigger | Anchor | Used as Context | Ref | Comment |
---|---|---|---|---|---|---|
leading-combining-mark | (start)[[\p{gc=Mn}]∪[∅=\p{gc=Mc}]] | ✔ | | | | Default WLE rule matching labels with leading combining marks ⍟ |
follows-consonant | ([:consonant:])← ⚓ | | ✔ | C | | WLE Rule 1: A semi-consonant must follow a consonant |
precedes-consonant | ⚓ →([:consonant:]) | | ✔ | C | | WLE Rule 2: A vowel-before precedes a main consonant cluster |
follows-main-consonant | ([:consonant:]|[:semi-consonant:])← ⚓ | | ✔ | C | | WLE Rule 3: A vowel-above or vowel-below follows a main consonant C |
follows-C-tonemark-vabove | ([:consonant:]|[:semi-consonant:]|[:tone-mark:]|[:vowel-above:])← ⚓ | | ✔ | C | | WLE Rule 4: A vowel-after follows a main consonant, tone-mark or vowel-above |
consonant-cluster | [:consonant:]{1,2}[:semi-consonant:]? | | | | | Defining consonant cluster for WLE Rule 5 |
follows-vbefore-consonant-cluster | (\u0EC0(:consonant-cluster:))← ⚓ | | ✔ | C | | WLE Rule 5: The sequence U+0EB2 U+0EB0 (າະ) follows a vowel before, and a consonant cluster |
follows-C-vabove-vbelow | ([:consonant:]|[:semi-consonant:]|[:vowel-above:]|[:vowel-below:])← ⚓ | | ✔ | C | | WLE Rule 6: A tone-mark follows a main consonant, vowel-above or vowel-below |
follows-Cf | ([:Cf:])← ⚓ | | ✔ | C | | WLE Rule 7: The sign U+0ECC (໌) can only occur after final consonants |
repetition-mark-limit | ⚓ →(\u0EC6{0,2}(end)) | | ✔ | C | | WLE Rule 8: The sign U+0EC6 (ໆ) can only occur 0 to 3 times at the end of the label |
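To illustrate how the context rules constrain a label, the following Python sketch translates the eight WLE rules into ordinary regular expressions and checks each repertoire element against its required context. It is a simplified illustration and not a conforming RFC 7940 processor: the names `chars`, `one_of`, `RULES`, `CONTEXT`, `context_ok` and `violations` are hypothetical, and the handling of the sequence U+0EB2 U+0EB0 (try the sequence first, then fall back to single code points) is an assumption made for brevity.

```python
import re


def chars(spec: str) -> str:
    """Expand '0E81-0E82 0E84' into a string containing the listed characters."""
    out = []
    for token in spec.split():
        first, _, last = token.partition("-")
        out.extend(chr(cp) for cp in range(int(first, 16), int(last or first, 16) + 1))
    return "".join(out)


def one_of(*groups: str) -> str:
    """Build a regex character class from one or more strings of characters."""
    return "[" + "".join(groups) + "]"


# Character classes, copied from the class table of this LGR.
CONS = chars("0E81-0E82 0E84 0E87-0E88 0E8A 0E8D 0E94-0E97 0E99-0E9F "
             "0EA1-0EA3 0EA5 0EA7 0EAA-0EAB 0EAD-0EAE")
CF = chars("0E81 0E87 0E8A 0E8D 0E94 0E97 0E99-0E9A 0E9F 0EA1 0EA3 0EA5 0EA7 0EAA")
SEMI, TONE = chars("0EBC"), chars("0EC8-0ECB")
VABOVE, VBELOW = chars("0EB1 0EB4-0EB7 0EBB 0ECD"), chars("0EB8-0EB9")
VAFTER, VBEFORE = chars("0EB0 0EB2 0EBD"), chars("0EC0-0EC4")
SEQ = "\u0EB2\u0EB0"

# rule name -> (side of the anchor that is inspected, regular expression)
RULES = {
    "follows-consonant": ("behind", one_of(CONS)),
    "precedes-consonant": ("ahead", one_of(CONS)),
    "follows-main-consonant": ("behind", one_of(CONS, SEMI)),
    "follows-C-tonemark-vabove": ("behind", one_of(CONS, SEMI, TONE, VABOVE)),
    "follows-vbefore-consonant-cluster":
        ("behind", "\u0EC0" + one_of(CONS) + "{1,2}" + one_of(SEMI) + "?"),
    "follows-C-vabove-vbelow": ("behind", one_of(CONS, SEMI, VABOVE, VBELOW)),
    "follows-Cf": ("behind", one_of(CF)),
    "repetition-mark-limit": ("ahead", "\u0EC6{0,2}\\Z"),
}

# repertoire element -> required context (None: no context rule, as for consonants)
CONTEXT = dict.fromkeys(CONS)
CONTEXT.update(dict.fromkeys(SEMI, "follows-consonant"))
CONTEXT.update(dict.fromkeys(VBEFORE, "precedes-consonant"))
CONTEXT.update(dict.fromkeys(VABOVE + VBELOW, "follows-main-consonant"))
CONTEXT.update(dict.fromkeys(VAFTER, "follows-C-tonemark-vabove"))
CONTEXT.update(dict.fromkeys(TONE, "follows-C-vabove-vbelow"))
CONTEXT["\u0ECC"] = "follows-Cf"
CONTEXT["\u0EC6"] = "repetition-mark-limit"
CONTEXT[SEQ] = "follows-vbefore-consonant-cluster"


def context_ok(label: str, start: int, end: int, rule) -> bool:
    """Check the context rule for the element occupying label[start:end]."""
    if rule is None:
        return True
    side, pattern = RULES[rule]
    if side == "behind":  # the pattern must match immediately before the anchor
        return re.search("(?:" + pattern + r")\Z", label[:start]) is not None
    return re.match(pattern, label[end:]) is not None  # immediately after the anchor


def violations(label: str) -> list:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems, i = [], 0
    while i < len(label):
        # Assumption: prefer the sequence when its context holds, else single code points.
        if label.startswith(SEQ, i) and context_ok(label, i, i + 2, CONTEXT[SEQ]):
            i += 2
            continue
        cp = label[i]
        if cp not in CONTEXT:
            problems.append(f"U+{ord(cp):04X} is not in the repertoire")
        elif not context_ok(label, i, i + 1, CONTEXT[cp]):
            problems.append(f"U+{ord(cp):04X} fails {CONTEXT[cp]}")
        i += 1
    return problems
```

Under these assumptions, `violations("\u0EC0\u0E81\u0EB2\u0EB0")` (ເກາະ) returns an empty list because the sequence follows U+0EC0 and a consonant, whereas `violations("\u0E81\u0EB2\u0EB0")` reports U+0EB0, since a vowel-after may not directly follow another vowel-after.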
The following table lists the actions that are used to assign dispositions to labels and variant labels based on the specified conditions. The order of actions defines their precedence: the first action triggered by a label is the one defining its disposition.
# | Condition | Rule / Variant Set | | Disposition | Ref | Comment |
---|---|---|---|---|---|---|
1 | if label matches | leading-combining-mark | → | invalid | | labels with leading combining marks are invalid ⍟ |
2 | if at least one variant is in | {out-of-repertoire-var} | → | invalid | | any variant label with a code point out of repertoire is invalid ⍟ |
3 | if at least one variant is in | {blocked} | → | blocked | | any variant label containing blocked variants is blocked ⍟ |
4 | if each variant is in | {allocatable} | → | allocatable | | variant labels with all variants allocatable are allocatable ⍟ |
5 | if any label (catch-all) | | → | valid | | catch all (default action) ⍟ |
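The first-match-wins ordering of the actions can be sketched in Python as follows. This is an illustration under stated assumptions, not the normative evaluation procedure: the names `leading_combining_mark`, `ACTIONS` and `disposition` are hypothetical, actions 2 to 4 are omitted because this LGR defines no variants, and the label is assumed to have already passed the repertoire and context-rule checks.

```python
import unicodedata


def leading_combining_mark(label: str) -> bool:
    """Default WLE rule: the label starts with a combining mark (gc=Mn or gc=Mc)."""
    return bool(label) and unicodedata.category(label[0]) in ("Mn", "Mc")


# (condition, disposition) pairs in precedence order.
ACTIONS = [
    (leading_combining_mark, "invalid"),  # action 1
    # Actions 2-4 test variant labels; they are left out of this sketch
    # because this LGR defines no variants.
    (lambda label: True, "valid"),        # action 5, catch-all
]


def disposition(label: str) -> str:
    """Return the disposition assigned by the first action whose condition holds."""
    for condition, result in ACTIONS:
        if condition(label):
            return result
    raise AssertionError("unreachable: the catch-all action always matches")
```

For example, a label beginning with U+0EB4 (a nonspacing mark) is caught by action 1, while a label such as ລາວ falls through to the catch-all action.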
Table of References
[0] | The Unicode Standard 1.1. Any code point originally encoded in Unicode 1.1 |
[201] | Lao grammar book published by the Ministry of Education in 1967, see Appendix B, Figure 1 |
[202] | Lao grammar book published by the Ministry of Education in 1967, see Appendix B, Figure 2 |
[203] | Lao grammar book published by the Ministry of Education in 1967, see Appendix B, Figure 3 |
[204] | Lao grammar book published by the Ministry of Education in 2000, see Appendix B, Figure 4 |
[205] | Lao grammar book published by the Ministry of Education in 2000, see Appendix B, Figure 5 |
[206] | Lao grammar book published by the Ministry of Education in 2000, see Appendix B, Figure 6 |
[207] | Lao grammar 1935, see Appendix B, Figure 7 |