<?xml version='1.0' encoding='utf-8'?>
<lgr xmlns="urn:ietf:params:xml:ns:lgr-1.0">
  <meta>
    <version comment="Proposed LGR for Sinhala">3.0</version>
    <date>2019-04-22</date>
    <unicode-version>6.3.0</unicode-version>
    <language>und-Sinh</language>
    <scope type="domain">.</scope>
    <description type="text/html"><![CDATA[
    <h1>Label Generation Rules for the Sinhala script</h1>
    
  <h2>Overview</h2>
    <p>This file contains Label Generation Rules (LGR) for the Sinhala script, as appropriate for the 
    Root Zone. For more details on this proposal, see "Proposal for a 
    Sinhala Script Root Zone Label Generation Ruleset (LGR)" [Proposal]. 
  The format of this file follows [RFC 7940].</p>
    
  <h2>Repertoire</h2>
    <p>According to Section 5, "Repertoire" in [Proposal], the Sinhala LGR contains 72 unique code points. 
      The addition of four sequences used in the definition of variants brings the total number of repertoire entries to 76.
      The repertoire covers the Sinhala language as written with the Sinhala script.</p>
    
  <p>The repertoire is based on [MSR-4], which is a subset of Unicode 6.3 [Unicode 6.3].</p>
  <p>Each code point has an associated glyph, character name, category, and reference.</p>
    
    <h2>Variants</h2>
    <p>According to Section 6, "Variants" in [Proposal], this LGR defines variants within Sinhala
   that can cause confusion even for a careful observer. There are no cross-script variants, though 
  some confusing cases are identified.</p>

   <p>Variant Disposition: All variants are of type "blocked", making labels that differ only 
  by these variants mutually exclusive: whichever label containing one of these variants is chosen first would be delegated;
  any other equivalent label should be blocked. There is no preference among these variants.</p>
  
  <h2>Character Classes</h2>
  <p>Like most Brahmi-derived scripts, Sinhala is an alphasyllabary and is written from 
  left to right. The categories of Consonants, Vowels, Matras, Halanta, Anusvara, Visarga, and Sannjakas are discussed below.</p>

  <p>Consonants: There are 40 consonants in the Sinhala alphabet, and 38 of them are selected for inclusion. 
  A consonant carries the inherent vowel a (අ) when it is used without a dependent vowel. Absence of the 
  inherent vowel is marked by adding hal kirima (remover of the inherent vowel) to the consonant; thus ක 
  (ka) but ක් (k), and ව (va) but ව් (v). More details are in Section 3.3.1, "The Consonants" of the [Proposal].</p>

   <p>Vowels and Matras: Sinhala has a separate dependent symbol (matra) for every vowel except the inherent 
  vowel අ. Independent vowels are used at the beginning of a word, and dependent vowels (matras) are used after 
  consonants. More details are in Section 3.3.2, "The Vowels" of the [Proposal].</p>

  <p>Halanta: Halanta (් U+0DCA), also called halkirima or hallakuna, is used to remove the 
  inherent vowel of a consonant in Sinhala. It is therefore used to join consonants and form conjunct 
  characters. More details are in Section 3.3.3, "Halanta: The Inherent Vowel Remover" of the [Proposal].</p>

  <p>Anusvara: The anusvara (U+0D82), pronounced /ŋ/, represents all the 
  nasals. It can be preceded by any sign except halanta (U+0DCA). More details are 
  in Section 3.3.4, "The Anusvara" of the [Proposal].</p>

  <p>Visarga: The visargaya (U+0D83) is a rarely used sign, pronounced /h/. It can be
  preceded by any sign except halanta (U+0DCA). More details are in 
  Section 3.3.5, "The Visarga" of the [Proposal].</p>
   
  <p>Sannjakas: Sinhala has five separate letters for prenasalized voiced stops, called sannjakas. 
  Of these, ඦ is not frequently used. One constraint on sannjakas is that they cannot 
  be followed by halanta. More details are in Section 3.3.6, "Sannjakas" of the [Proposal].</p>
    
    <h2>Whole Label Evaluation (WLE) rules</h2>
  <h3>Default Whole Label Evaluation Rules</h3>
  <p>The LGR includes the set of required default WLE rules and actions applicable to 
    the Root Zone and defined in [MSR-4]. They are marked with &#x235F;.</p> 
    
  <h3>Sinhala-specific Rules</h3>
  <p>These rules have been formulated so that they can be adopted for the LGR specification.</p>
  <p>The following symbols are used in the WLE rules: 
  <br/>C  → Consonant
  <br/>M  → Matra / Vowel Signs
  <br/>V  → Vowel
  <br/>B  → Anusvara (Bindu) 
  <br/>X  → Visarga
  <br/>H  → Halanta / Virama   
  <br/>J  → Sannjaka
  </p>

   <p>The rules are: </p>
   <ul>
   <li>1. H: must be preceded by C</li>
   <li>2. M: must be preceded by C or J</li>
   <li>3. X: must be preceded by either V, C, or M </li>
   <li>4. B: must be preceded by either V, C, J or M</li>
   </ul>

    <p>The following context rules apply to code points in variant sets to ensure variant transitivity.</p>
    <ul>
    <li>5. variants are undefined preceding a Halant or Matra</li>
    <li>6. variants are undefined preceding an Anusvara, Visarga, Halant or Matra</li>
    </ul>


   <p>More details are in Section 7, "Whole Label Evaluation Rules (WLE)" of the [Proposal].</p>

  <h2>Overall Development Process and Methodology</h2>
  <p>This is the Sinhala LGR prepared under the Sinhala Generation Panel. It covers the Sinhala language as written 
  in the Sinhala script.</p>

  <h2>References</h2> 
  <p>The following references are cited in this document:</p>
  <dl class="references">

  <dt>[MSR-4]</dt>
   <dd>Integration Panel, "Maximal Starting Repertoire — MSR-4 Overview and Rationale", 7 February 2019,  
    https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf
   </dd> 

  <dt>[Proposal]</dt> <dd>Sinhala Generation Panel, “Proposal for a Sinhala Script Root Zone Label Generation Ruleset (LGR)”, 22 April 2019, https://www.icann.org/en/system/files/files/proposal-sinhala-lgr-22apr19-en.pdf</dd>

  <dt>[RFC 7940]</dt>
   <dd>Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, http://www.rfc-editor.org/info/rfc7940. 
   </dd> 

   <dt>[Unicode 6.3]</dt>
   <dd>The Unicode Consortium. The Unicode Standard, Version 6.3.0, (Mountain View, CA: The Unicode Consortium, 2013. ISBN 978-1-936213-08-5) 
   http://www.unicode.org/versions/Unicode6.3.0/</dd>
   </dl>

]]></description>
    <references>
      <reference id="102">Disanayaka, JB. 2006. Sinhala Akshara Vicharaya (Sinhala Graphology), Sumitha Publishers, Kalubovila. ISBN: 955-1146-44-1</reference>
      <reference id="201">Omniglot: The online encyclopedia of writing systems and languages, “Sinhala”, https://www.omniglot.com/writing/sinhala.htm</reference>
    </references>
  </meta>
  <data>
    <char comment="SINHALA SIGN ANUSVARAYA" cp="0D82" ref="102 201" tag="Anusvara" when="follows-only-V-C-J-or-M"/>
    <char comment="SINHALA SIGN VISARGAYA" cp="0D83" ref="102 201" tag="Visarga" when="follows-only-V-C-or-M"/>
    <char comment="SINHALA LETTER AYANNA" cp="0D85" ref="102 201" tag="Vowel"/>
    <char comment="SINHALA LETTER AAYANNA" cp="0D86" ref="102 201" tag="Vowel"/>
    <char comment="SINHALA LETTER AEYANNA" cp="0D87" ref="102 201" tag="Vowel"/>
    <char comment="SINHALA LETTER AEEYANNA" cp="0D88" ref="102 201" tag="Vowel"/>
    <char comment="SINHALA LETTER IYANNA" cp="0D89" ref="102 201" tag="Vowel"/>
    <char comment="SINHALA LETTER IIYANNA" cp="0D8A" ref="102 201" tag="Vowel"/>
    <char comment="SINHALA LETTER UYANNA" cp="0D8B" ref="102 201" tag="Vowel"/>
    <char comment="SINHALA LETTER UUYANNA" cp="0D8C" ref="102 201" tag="Vowel"/>
    <char comment="SINHALA LETTER IRUYANNA" cp="0D8D" ref="102 201" tag="Vowel">
      <var cp="0D9D 0DD8" not-when="followed-by-H-or-M" type="blocked"/>
      <var cp="0DC3 0DD8" not-when="followed-by-H-or-M" type="blocked"/>
    </char>
    <char comment="SINHALA LETTER EYANNA" cp="0D91" ref="102 201" tag="Vowel">
      <var cp="0DB5" not-when="followed-by-H-or-M" type="blocked"/>
    </char>
    <char comment="SINHALA LETTER EEYANNA" cp="0D92" ref="102 201" tag="Vowel">
      <var cp="0DB5 0DCA" not-when="followed-by-B-X-H-or-M" type="blocked"/>
    </char>
    <char comment="SINHALA LETTER AIYANNA" cp="0D93" ref="102 201" tag="Vowel">
      <var cp="0DB5 0DD9" not-when="followed-by-H-or-M" type="blocked"/>
    </char>
    <char comment="SINHALA LETTER OYANNA" cp="0D94" ref="102 201" tag="Vowel">
      <var cp="0DB9" type="blocked"/>
    </char>
    <char comment="SINHALA LETTER OOYANNA" cp="0D95" ref="102 201" tag="Vowel"/>
    <char comment="SINHALA LETTER AUYANNA" cp="0D96" ref="102 201" tag="Vowel"/>
    <char comment="SINHALA LETTER ALPAPRAANA KAYANNA" cp="0D9A" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER MAHAAPRAANA KAYANNA" cp="0D9B" ref="102 201" tag="Consonant">
      <var cp="0DB6" type="blocked"/>
    </char>
    <char comment="SINHALA LETTER ALPAPRAANA GAYANNA" cp="0D9C" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER MAHAAPRAANA GAYANNA" cp="0D9D" ref="102 201" tag="Consonant">
      <var cp="0DC3" type="blocked"/>
    </char>
    <char comment="variant of IRUYANNA" cp="0D9D 0DD8" ref="102 201">
      <var cp="0D8D" not-when="followed-by-H-or-M" type="blocked"/>
      <var cp="0DC3 0DD8" not-when="followed-by-H-or-M" type="blocked"/>
    </char>
    <char comment="SINHALA LETTER SANYAKA GAYANNA" cp="0D9F" ref="102 201" tag="Sannjaka"/>
    <char comment="SINHALA LETTER ALPAPRAANA CAYANNA" cp="0DA0" ref="102 201" tag="Consonant">
      <var cp="0DC0" type="blocked"/>
    </char>
    <char comment="SINHALA LETTER MAHAAPRAANA CAYANNA" cp="0DA1" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER ALPAPRAANA JAYANNA" cp="0DA2" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER MAHAAPRAANA JAYANNA" cp="0DA3" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER TAALUJA NAASIKYAYA" cp="0DA4" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER TAALUJA SANYOOGA NAAKSIKYAYA" cp="0DA5" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER ALPAPRAANA TTAYANNA" cp="0DA7" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER MAHAAPRAANA TTAYANNA" cp="0DA8" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER ALPAPRAANA DDAYANNA" cp="0DA9" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER MAHAAPRAANA DDAYANNA" cp="0DAA" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER MUURDHAJA NAYANNA" cp="0DAB" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER SANYAKA DDAYANNA" cp="0DAC" ref="102 201" tag="Sannjaka"/>
    <char comment="SINHALA LETTER ALPAPRAANA TAYANNA" cp="0DAD" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER MAHAAPRAANA TAYANNA" cp="0DAE" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER ALPAPRAANA DAYANNA" cp="0DAF" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER MAHAAPRAANA DAYANNA" cp="0DB0" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER DANTAJA NAYANNA" cp="0DB1" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER SANYAKA DAYANNA" cp="0DB3" ref="102 201" tag="Sannjaka"/>
    <char comment="SINHALA LETTER ALPAPRAANA PAYANNA" cp="0DB4" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER MAHAAPRAANA PAYANNA" cp="0DB5" ref="102 201" tag="Consonant">
      <var cp="0D91" not-when="followed-by-H-or-M" type="blocked"/>
    </char>
    <char comment="variant of EEYANNA" cp="0DB5 0DCA" ref="102 201">
      <var cp="0D92" not-when="followed-by-B-X-H-or-M" type="blocked"/>
    </char>
    <char comment="variant of AIYANNA" cp="0DB5 0DD9" ref="102 201">
      <var cp="0D93" not-when="followed-by-H-or-M" type="blocked"/>
    </char>
    <char comment="SINHALA LETTER ALPAPRAANA BAYANNA" cp="0DB6" ref="102 201" tag="Consonant">
      <var cp="0D9B" type="blocked"/>
    </char>
    <char comment="SINHALA LETTER MAHAAPRAANA BAYANNA" cp="0DB7" ref="102 201" tag="Consonant">
      <var cp="0DC4" type="blocked"/>
    </char>
    <char comment="SINHALA LETTER MAYANNA" cp="0DB8" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER AMBA BAYANNA" cp="0DB9" ref="102 201" tag="Sannjaka">
      <var cp="0D94" type="blocked"/>
    </char>
    <char comment="SINHALA LETTER YAYANNA" cp="0DBA" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER RAYANNA" cp="0DBB" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER DANTAJA LAYANNA" cp="0DBD" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER VAYANNA" cp="0DC0" ref="102 201" tag="Consonant">
      <var cp="0DA0" type="blocked"/>
    </char>
    <char comment="SINHALA LETTER TAALUJA SAYANNA" cp="0DC1" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER MUURDHAJA SAYANNA" cp="0DC2" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER DANTAJA SAYANNA" cp="0DC3" ref="102 201" tag="Consonant">
      <var cp="0D9D" type="blocked"/>
    </char>
    <char comment="variant of IRUYANNA" cp="0DC3 0DD8" ref="102 201">
      <var cp="0D8D" not-when="followed-by-H-or-M" type="blocked"/>
      <var cp="0D9D 0DD8" not-when="followed-by-H-or-M" type="blocked"/>
    </char>
    <char comment="SINHALA LETTER HAYANNA" cp="0DC4" ref="102 201" tag="Consonant">
      <var cp="0DB7" type="blocked"/>
    </char>
    <char comment="SINHALA LETTER MUURDHAJA LAYANNA" cp="0DC5" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA LETTER FAYANNA" cp="0DC6" ref="102 201" tag="Consonant"/>
    <char comment="SINHALA SIGN AL-LAKUNA" cp="0DCA" ref="102 201" tag="Halant" when="follows-only-C"/>
    <char comment="SINHALA VOWEL SIGN AELA-PILLA" cp="0DCF" ref="102 201" tag="Matra" when="follows-only-C-or-J"/>
    <char comment="SINHALA VOWEL SIGN KETTI AEDA-PILLA" cp="0DD0" ref="102 201" tag="Matra" when="follows-only-C-or-J"/>
    <char comment="SINHALA VOWEL SIGN DIGA AEDA-PILLA" cp="0DD1" ref="102 201" tag="Matra" when="follows-only-C-or-J"/>
    <char comment="SINHALA VOWEL SIGN KETTI IS-PILLA" cp="0DD2" ref="102 201" tag="Matra" when="follows-only-C-or-J"/>
    <char comment="SINHALA VOWEL SIGN DIGA IS-PILLA" cp="0DD3" ref="102 201" tag="Matra" when="follows-only-C-or-J"/>
    <char comment="SINHALA VOWEL SIGN KETTI PAA-PILLA" cp="0DD4" ref="102 201" tag="Matra" when="follows-only-C-or-J"/>
    <char comment="SINHALA VOWEL SIGN DIGA PAA-PILLA" cp="0DD6" ref="102 201" tag="Matra" when="follows-only-C-or-J"/>
    <char comment="SINHALA VOWEL SIGN GAETTA-PILLA" cp="0DD8" ref="102 201" tag="Matra" when="follows-only-C-or-J"/>
    <char comment="SINHALA VOWEL SIGN KOMBUVA" cp="0DD9" ref="102 201" tag="Matra" when="follows-only-C-or-J"/>
    <char comment="SINHALA VOWEL SIGN DIGA KOMBUVA" cp="0DDA" ref="102 201" tag="Matra" when="follows-only-C-or-J"/>
    <char comment="SINHALA VOWEL SIGN KOMBU DEKA" cp="0DDB" ref="102 201" tag="Matra" when="follows-only-C-or-J"/>
    <char comment="SINHALA VOWEL SIGN KOMBUVA HAA AELA-PILLA" cp="0DDC" ref="102 201" tag="Matra" when="follows-only-C-or-J"/>
    <char comment="SINHALA VOWEL SIGN KOMBUVA HAA DIGA AELA-PILLA" cp="0DDD" ref="102 201" tag="Matra" when="follows-only-C-or-J"/>
    <char comment="SINHALA VOWEL SIGN KOMBUVA HAA GAYANUKITTA" cp="0DDE" ref="102 201" tag="Matra" when="follows-only-C-or-J"/>
    <char comment="SINHALA VOWEL SIGN DIGA GAETTA-PILLA" cp="0DF2" ref="102 201" tag="Matra" when="follows-only-C-or-J"/>
  </data>
  <rules>
    <class name="C" from-tag="Consonant" comment="Any consonant"/>
    <class name="V" from-tag="Vowel" comment="Any independent vowel"/>
    <class name="M" from-tag="Matra" comment="Any vowel sign (matra)"/>
    <class name="J" from-tag="Sannjaka" comment="Any Sannjaka"/>
    <class name="H" from-tag="Halant" comment="The Sinhala Al-Lakuna (Halant)"/>
    <class name="B" from-tag="Anusvara" comment="The Sinhala Anusvara"/>
    <class name="X" from-tag="Visarga" comment="The Sinhala Visarga"/>
    <rule name="leading-combining-mark" comment="Default rule from MSR-4 ⍟">
      <start/>
        <union>
          <class property="gc:Mn"/>
          <class property="gc:Mc"/>
        </union>
    </rule>
    <rule name="follows-only-C" comment="Section 7, WLE 1: Halanta/Virama must be preceded by C">
      <look-behind>
        <class by-ref="C"/>
      </look-behind>
      <anchor/>
    </rule>
    <rule name="follows-only-C-or-J" comment="Section 7, WLE 2: Matra must be preceded by C or J">
      <look-behind>
        <choice>
          <class by-ref="C"/>
          <class by-ref="J"/>
        </choice>
      </look-behind>
      <anchor/>
    </rule>
    <rule name="follows-only-V-C-or-M" comment="Section 7, WLE 3: Visarga must be preceded by either V, C or M">
      <look-behind>
        <choice>
          <class by-ref="V"/>
          <class by-ref="C"/>
          <class by-ref="M"/>
        </choice>
      </look-behind>
      <anchor/>
    </rule>
    <rule name="follows-only-V-C-J-or-M" comment="Section 7, WLE 4: Anusvara (Bindu) must be preceded by either V, C, J or M">
      <look-behind>
        <choice>
          <class by-ref="V"/>
          <class by-ref="C"/>
          <class by-ref="J"/>
          <class by-ref="M"/>
        </choice>
      </look-behind>
      <anchor/>
    </rule>
    <rule name="followed-by-H-or-M" comment="variants are undefined preceding a Halant or Matra">
      <anchor/>
      <look-ahead>
        <choice>
          <class by-ref="H"/>
          <class by-ref="M"/>
        </choice>
      </look-ahead>
    </rule>
    <rule name="followed-by-B-X-H-or-M" comment="variants are undefined preceding an Anusvara, Visarga, Halant or Matra">
      <anchor/>
      <look-ahead>
        <choice>
          <class by-ref="B"/>
          <class by-ref="X"/>
          <class by-ref="H"/>
          <class by-ref="M"/>
        </choice>
      </look-ahead>
    </rule>
    <action disp="invalid" match="leading-combining-mark" comment="labels with leading combining marks are invalid ⍟"/>
    <action disp="invalid" any-variant="out-of-repertoire-var" comment="any variant label with a code point out of repertoire is invalid ⍟"/>
    <action disp="blocked" any-variant="blocked" comment="default action MSR-4 ⍟"/>
    <action disp="allocatable" any-variant="allocatable" comment="default action MSR-4 ⍟"/>
    <action disp="valid" comment="catch all; default action from MSR-4 ⍟"/>
  </rules>
</lgr>
