This file contains Label Generation Rules (LGR) for the Sinhala script as would be appropriate for the Root zone. For more details on this proposal see "Proposal for a Sinhala Script Root Zone Label Generation Rules-Set (LGR)" [Proposal]. The format of this file follows [RFC 7940].
According to Section 5, "Repertoire" in [Proposal], the Sinhala LGR contains 72 unique code points. The addition of three sequences used in the definition of variants brings the total repertoire entries to 75. The repertoire covers the Sinhala language as written with the Sinhala script.
The repertoire is based on [MSR-4], which is a subset of Unicode 6.3 [Unicode 6.3].
Each code point has an associated glyph, character name, category and reference.
According to Section 6, "Variants" in [Proposal], this LGR defines variants within Sinhala that can cause confusion for even a careful observer. There are no cross-script variants, though some confusing cases are identified.
Variant Disposition: All variants are of type "blocked", making labels that differ only by these variants mutually exclusive: whichever label containing any of these variants is chosen first would be delegated, and any other equivalent label would be blocked. There is no preference among these variants.
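The effect of the "blocked" disposition can be sketched as follows. This is a minimal illustration, not the LGR's actual variant data: the variant sets use placeholder Latin letters, and the names VARIANT_SETS, canonical_key and Registry are hypothetical. The sketch also ignores the sequences that the real LGR uses in its variant definitions. Each code point is mapped to a representative of its variant set; two labels with the same key are variants of each other, so only the first one applied for can be delegated.

```python
# Hypothetical variant sets: placeholders, NOT the Sinhala LGR's real variants.
VARIANT_SETS = [{"a", "b"}, {"x", "y", "z"}]

# Map every member of a set to one representative (the smallest code point).
# The choice of representative is arbitrary: no variant is preferred.
REPRESENTATIVE = {cp: min(s) for s in VARIANT_SETS for cp in s}


def canonical_key(label):
    """Replace each code point by its variant-set representative."""
    return "".join(REPRESENTATIVE.get(cp, cp) for cp in label)


class Registry:
    """First-come allocation: later labels with the same key are blocked."""

    def __init__(self):
        self._delegated = {}  # canonical key -> label that was delegated

    def apply(self, label):
        key = canonical_key(label)
        if key in self._delegated:
            return "blocked"
        self._delegated[key] = label
        return "delegated"


registry = Registry()
print(registry.apply("cab"))  # first label of its variant class
print(registry.apply("cbb"))  # differs from "cab" only by the a/b variant
print(registry.apply("cat"))  # not a variant of "cab"
```

Because the key depends only on variant-set membership, the outcome is symmetric: had "cbb" been applied for first, "cab" would have been the blocked label.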
Like most Brahmi-derived scripts, Sinhala is an alphasyllabary and is written from left to right. The categories of Consonants, Vowels, Matras, Halanta, Anusvara, Visarga, and Sannjakas are discussed below.
Consonants: There are 40 consonants in the Sinhala alphabet, and 38 of them are selected for inclusion. A consonant used without a dependent vowel carries the inherent vowel a (අ). Absence of the inherent vowel is marked by adding hal kirima (remover of the inherent vowel) to the consonant; thus ක (ka) but ක් (k), and ව (va) but ව් (v). More details in Section 3.3.1, "The Consonants" of the [Proposal].
Vowels and Matras: There are separate symbols (dependent vowels) for all the vowels except the inherent vowel අ in Sinhala. Independent vowels are used at the beginning of a word and dependent vowels (matras) are used after consonants. More details in Section 3.3.2, "The Vowels" of the [Proposal].
Halanta: Halanta (් U+0DCA), also called hal kirima or hallakuna, removes the inherent vowel of a consonant in Sinhala. It is thus used to join consonants and form conjunct characters. More details in Section 3.3.3, "Halanta: The Inherent Vowel Remover" of the [Proposal].
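At the code point level, the bare consonant is a two-code-point sequence: the consonant followed by halanta. A minimal sketch, assuming the Unicode code point U+0D9A for ක (not stated in this text) and a hypothetical helper name remove_inherent_vowel:

```python
import unicodedata

KA = "\u0D9A"       # ක, consonant with inherent vowel (code point assumed from Unicode)
HALANTA = "\u0DCA"  # halanta / hal kirima, as given in the text


def remove_inherent_vowel(consonant):
    """Append halanta to a consonant, giving the bare consonant form."""
    return consonant + HALANTA


k = remove_inherent_vowel(KA)
# List the code points that make up the result, with their Unicode names.
for cp in k:
    print(f"U+{ord(cp):04X} {unicodedata.name(cp)}")
```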
Anusvara: The anusvara (U+0D82), pronounced /ŋ/, represents all the nasals. It can be preceded by any sign except halanta (U+0DCA). More details in Section "3.3.4 The Anusvara" of the [Proposal].
Visarga: The visarga (visargaya, U+0D83) is a rarely used sign, pronounced /h/. It can be preceded by any sign except halanta (U+0DCA). More details in Section 3.3.5, "The Visarga" of the [Proposal].
Sannjakas: Sinhala has five separate letters for prenasalized voiced stops, called sannjakas. Of these, ඦ is not frequently used. One constraint on sannjakas is that they cannot be followed by halanta. More details in Section 3.3.6, "Sannjakas" of the [Proposal].
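The three adjacency constraints stated above (anusvara and visarga may not follow halanta; halanta may not follow a sannjaka) can be checked pairwise. The sketch below covers only these three constraints, not the full WLE rule set. The visarga and sannjaka code points are assumptions taken from Unicode character names; whether every sannjaka listed is in the LGR repertoire is not stated here. The function name adjacency_violations is hypothetical.

```python
HALANTA = "\u0DCA"
ANUSVARA = "\u0D82"
VISARGA = "\u0D83"  # assumed from Unicode; not given in this text
# Assumed from Unicode names (SANYAKA ... / AMBA BAYANNA); U+0DA6 is ඦ.
SANNJAKAS = {"\u0D9F", "\u0DA6", "\u0DAC", "\u0DB3", "\u0DB9"}


def adjacency_violations(label):
    """Return a list of messages for each violated adjacency constraint."""
    problems = []
    # Walk over every pair of neighbouring code points.
    for prev, cur in zip(label, label[1:]):
        if prev == HALANTA and cur in (ANUSVARA, VISARGA):
            problems.append(f"U+{ord(cur):04X} preceded by halanta")
        if prev in SANNJAKAS and cur == HALANTA:
            problems.append(f"halanta after sannjaka U+{ord(prev):04X}")
    return problems


print(adjacency_violations("\u0D9A\u0D82"))        # consonant + anusvara
print(adjacency_violations("\u0D9A\u0DCA\u0D82"))  # consonant + halanta + anusvara
print(adjacency_violations("\u0DA6\u0DCA"))        # sannjaka + halanta
```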
The LGR includes the set of required default WLE rules and actions applicable to the Root Zone and defined in [MSR-4]. They are marked with ⍟.
These rules have been formulated so that they can be adopted for LGR specification.
The following symbols are used in the WLE rules:
C → Consonant
M → Matra / Vowel Signs
V → Vowel
B → Anusvara (Bindu)
X → Visarga
H → Halanta / Virama
J → Sannjaka
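One way to work with these symbols is to reduce a label to a string of class symbols and evaluate rules against that string. The sketch below is an assumption-laden illustration: it uses coarse ranges from the Unicode Sinhala block rather than the LGR's actual repertoire, so it also accepts unassigned code points and code points that the LGR excludes. The names classify and symbol_string are hypothetical.

```python
# Sannjaka code points assumed from Unicode character names.
SANNJAKAS = {0x0D9F, 0x0DA6, 0x0DAC, 0x0DB3, 0x0DB9}


def classify(ch):
    """Map one code point to a class symbol, or '?' if it fits no class."""
    cp = ord(ch)
    if cp in SANNJAKAS:  # checked before C: sannjakas lie in the consonant range
        return "J"
    if cp == 0x0D82:
        return "B"       # anusvara
    if cp == 0x0D83:
        return "X"       # visarga
    if cp == 0x0DCA:
        return "H"       # halanta
    if 0x0D85 <= cp <= 0x0D96:
        return "V"       # independent vowels (coarse range)
    if 0x0D9A <= cp <= 0x0DC6:
        return "C"       # consonants (coarse range)
    if 0x0DCF <= cp <= 0x0DDF or 0x0DF2 <= cp <= 0x0DF3:
        return "M"       # matras / vowel signs (coarse range)
    return "?"


def symbol_string(label):
    """Reduce a label to its sequence of class symbols."""
    return "".join(classify(ch) for ch in label)


# consonant + halanta + consonant + matra
print(symbol_string("\u0D9A\u0DCA\u0D9A\u0DD2"))
```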
The rules are:
The following context rules apply to code points in variant sets to ensure variant transitivity.
More details in Section 7, "Whole Label Evaluation Rules (WLE)" of the [Proposal].
This is the Sinhala LGR, prepared under the Sinhala Generation Panel; it caters to the Sinhala language written in the Sinhala script.
The following references are cited in this document: