﻿<?xml version="1.0" encoding="utf-8"?>
<lgr xmlns="urn:ietf:params:xml:ns:lgr-1.0">
  <meta>
    <version comment="Root Zone LGR for the Thaana Script">6</version>
    <date>2025-03-27</date>
    <language>und-Thaa</language>
    <unicode-version>11.0.0</unicode-version>
    <description type="text/html"><![CDATA[
    
    <h1 id="reference_label_generation_rules_for_the_thaana_script">Root Zone Label Generation Rules for the Thaana Script</h1>
  
    <h2 id="desc_overview">Overview</h2>
  
    <p>This file contains a set of Label Generation Rules for the Thaana script for the Root Zone. 
    For more details on this LGR and additional background on the script, see “Proposal for a Thaana Script Root Zone LGR” [Proposal-Thaana].
    This file is one of a set of LGR files that together form an integrated LGR for the DNS Root Zone [RZ-LGR-6]. 
    The format of this file follows [RFC 7940].</p>
    
    <p class="notice">This is a DRAFT document released for public comment; it is not final. For details on how to submit comments, see the announcement for public comments on the proposal for a Thaana Script LGR on the ICANN website: https://icann.org/idn</p>
  
    <h2 id="desc_repertoire">Repertoire</h2>
    <p>The repertoire includes the 36 letters and diacritics of the Thaana script in everyday use as defined in Section 5.1 “Included Code Points” in [Proposal-Thaana].</p>
    <p>U+07B1 THAANA LETTER NAA (baru noonu, heavy n) is a dialect-specific consonant. Given the resurgence of its use in online writing and social media, a decision was made to include it in the repertoire at this time, even though it is not officially recognized as one of the base consonants.</p>

    <p>The Thaana script contains a series of consonants used for writing Arabic loan words. They occur relatively rarely, and omitting them avoids complications that might make their use problematic for the Root Zone. The primary issue is that they are used inconsistently and that alternation with the equivalent ordinary consonants is common. Their inclusion would thus have necessitated the definition of additional in-script variants. See also Section 5.2 “Excluded Code Points” in [Proposal-Thaana].</p>

    <p>As part of the Root Zone, this LGR includes neither decimal digits nor the HYPHEN-MINUS.</p>

    <p><b>Repertoire Listing:</b> Each code point or range is tagged with the script or scripts with which the code point is used and one or more other character categories. For each repertoire element, 
  one or more references document sufficient justification for inclusion in the repertoire; see the <a href="#ref_desc_sec_References">“References”</a> below.</p>
      
    <h2 id="desc_variants">Variants</h2>

    <p>This LGR defines one set of in-script variants as described in Section 6.1 “In-script Variants” of [Proposal-Thaana]. No cross-script variants have been identified based on any discernible similarity with another script or otherwise required for the security of the Thaana LGR. In particular, the structure of the Thaana script, requiring a vowel after each consonant, removes similarity between Arabic and Thaana labels.</p>
    <p><b>Variant Disposition:</b> All variants are of type “blocked”, making labels that 
    differ only by these variants mutually exclusive: whichever label containing either of 
    these variants is chosen earlier would be delegated, while any other equivalent labels should be blocked.
    There is no preference among these labels.</p>

    <p>This LGR does not define allocatable variants.</p>

    <p>The specification of variants in the Root Zone LGR follows the guidelines in [RFC 8228].</p>

    <h2 id="desc_character_classes">Character Classes</h2>
    <p>The repertoire of this LGR contains 25 consonants and 11 vowels. The vowels are written above or below the consonants.</p>
    <ul>
        <li>
        <b>Consonants</b> — in the Thaana script, consonants are base characters and must be followed by a vowel.</li>
        <li>
        <b>Vowels</b> — vowels are combining marks that always follow a consonant.</li>
    </ul>

    <h2 id="desc_whole_label_evaluation_wle_and_context_rules">Whole Label Evaluation (WLE) and Context Rules</h2>

    <h3>Default Whole Label Evaluation Rules and Actions</h3>
    <p>The LGR includes the set of required default WLE rules and actions applicable to 
    the Root Zone and defined in [MSR-6]. They are marked with &#x235F;.
    The actions compute a label disposition based on WLE rules or variant mapping types.</p>

    <h3 id="desc_script-specific_rules">Script-specific Rules</h3>

    <p>The LGR defines the following script-specific rules concerning the placement of consonants and vowels.</p>
    <ul>
    <li>
        <b>follows-C</b> — WLE 1: a vowel always follows a consonant, including Noonu and Raa.</li>
    <li>
        <b>followed-by-V</b> — WLE 2: a consonant, including Noonu and Raa, is always followed by a vowel.</li>
    </ul>
    <p>Note that U+0782 THAANA LETTER NOONU and U+0783 THAANA LETTER RAA behave like all other consonants for the purpose of the RZ-LGR. Because the exceptions to this behavior are statistically rare, this more conservative approach was chosen, even though it restricts a small number of labels that might be valid in other zones.</p>

    <h2>Methodology and Contributors</h2>

    <p>The Root Zone LGR for the Thaana Script was developed by the Thaana Generation Panel. For details on methodology and 
       contributors, see Sections 4 and 8 in [Proposal-Thaana], as well as [RZ-LGR-6-Overview].</p>

    <h2 id="desc_references">References</h2>
    <p>This document cites the following general references.</p>
    <dl class="references">
        <dt>[MSR-6]</dt>
        <dd>Integration Panel, “Maximal Starting Repertoire — MSR-6 Overview and Rationale”, [DATE TBD],
  https://www.icann.org/en/system/files/files/msr-6-overview-DATE-TBD-en.pdf</dd>

        <dt>[Proposal-Thaana]</dt>
        <dd>“Thaana Script Label Generation Rules for the Root Zone”, 27 March 2025, https://www.icann.org/en/system/files/files/proposal-thaana-second-level-27mar25-en.pdf</dd>

        <dt>[RFC 7940]</dt>
        <dd> Davies, K. and A. Freytag, “Representing Label Generation Rulesets Using XML”, 
     RFC 7940, August 2016, https://www.rfc-editor.org/info/rfc7940</dd>

        <dt>[RFC 8228]</dt>
        <dd>Freytag, A., “Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels”, RFC 8228, August 2017,
    https://www.rfc-editor.org/info/rfc8228</dd>

        <dt>[RZ-LGR-6-Overview]</dt>
        <dd>Integration Panel, “Root Zone Label Generation Rules (RZ LGR-6): Overview and Summary”, [To be published].</dd>

        <dt>[RZ-LGR-6]</dt>
        <dd>Integration Panel, “Root Zone Label Generation Rules (RZ LGR-6)”, [To be published].</dd>

        <dt>[Unicode 11.0.0]</dt>
        <dd>The Unicode Consortium. The Unicode Standard, Version 11.0.0, (Mountain View, CA: The Unicode Consortium, 2018. ISBN 978-1-936213-19-1) 
     https://www.unicode.org/versions/Unicode11.0.0/</dd>
    </dl>
    <p>References [0] to [5] refer to the Unicode Standard versions in which the corresponding code points were initially encoded. References [401] and above correspond to sources given in [Proposal-Thaana] justifying the inclusion of the corresponding code points.</p>
    ]]></description>
    <references>
      <reference id="0" comment="Any code point originally encoded in Unicode Version 1.1">The Unicode Standard, Version 1.1</reference>
      <reference id="3" comment="Any code point originally encoded in Unicode Version 3.0">The Unicode Standard, Version 3.0</reference>
      <reference id="5" comment="Any code point originally encoded in Unicode Version 3.2">The Unicode Standard, Version 3.2</reference>
      <reference id="401">Dhivehi Writing Systems by Naseema Mohamed, pages 7-8, NCLHR, 1999.</reference>
      <reference id="402">Maldivian (ދިވެހި), Omniglot, https://www.omniglot.com/writing/thaana.htm (Accessed on 5 February 2025)</reference>
    </references>
  </meta>
  <data>
    <char cp="0780" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="0781" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="0782" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" comment="Behaves like any other consonant for Root Zone labels">
      <var cp="07B1" type="blocked" />
    </char>
    <char cp="0783" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" comment="Behaves like any other consonant for Root Zone labels" />
    <char cp="0784" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="0785" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="0786" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="0787" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="0788" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="0789" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="078A" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="078B" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="078C" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="078D" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="078E" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="078F" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="0790" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="0791" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="0792" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="0793" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="0794" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="0795" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="0796" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="0797" when="followed-by-V" tag="consonant sc:Thaa" ref="3 401 402" />
    <char cp="07A6" when="follows-C" tag="sc:Thaa vowel" ref="3 401 402" />
    <char cp="07A7" when="follows-C" tag="sc:Thaa vowel" ref="3 401 402" />
    <char cp="07A8" when="follows-C" tag="sc:Thaa vowel" ref="3 401 402" />
    <char cp="07A9" when="follows-C" tag="sc:Thaa vowel" ref="3 401 402" />
    <char cp="07AA" when="follows-C" tag="sc:Thaa vowel" ref="3 401 402" />
    <char cp="07AB" when="follows-C" tag="sc:Thaa vowel" ref="3 401 402" />
    <char cp="07AC" when="follows-C" tag="sc:Thaa vowel" ref="3 401 402" />
    <char cp="07AD" when="follows-C" tag="sc:Thaa vowel" ref="3 401 402" />
    <char cp="07AE" when="follows-C" tag="sc:Thaa vowel" ref="3 401 402" />
    <char cp="07AF" when="follows-C" tag="sc:Thaa vowel" ref="3 401 402" />
    <char cp="07B0" when="follows-C" tag="sc:Thaa vowel" ref="3 401 402" />
    <char cp="07B1" when="followed-by-V" tag="consonant sc:Thaa" ref="5 401 402">
      <var cp="0782" type="blocked" />
    </char>
  </data>
  <!--Rules section goes here-->
  <rules>
    <!--Character class definitions go here-->
    <class name="C" from-tag="consonant" comment="Any Thaana consonant including Noonu and Raa" />
    <class name="V" from-tag="vowel" comment="Any Thaana vowel" />
    <!--Whole label evaluation and context rules go here-->
    <rule name="leading-combining-mark" comment="Default WLE rule matching labels with leading combining marks &#x235F;">
      <start />
      <union>
        <class property="gc:Mn" />
        <class property="gc:Mc" />
      </union>
    </rule>
    <rule name="follows-C" comment="WLE1: a vowel always follows a consonant.">
      <look-behind>
        <class by-ref="C" />
      </look-behind>
      <anchor />
    </rule>
    <rule name="followed-by-V" comment="WLE2: a consonant is always followed by a vowel.">
      <anchor />
      <look-ahead>
        <class by-ref="V" />
      </look-ahead>
    </rule>
    <!--Action elements go here - order defines precedence-->
    <action disp="invalid" match="leading-combining-mark" comment="labels with leading combining marks are invalid &#x235F;" />
    <action disp="invalid" any-variant="out-of-repertoire-var" comment="any variant label with a code point out of repertoire is invalid &#x235F;" />
    <action disp="blocked" any-variant="blocked" comment="any variant label containing blocked variants is blocked &#x235F;" />
    <action disp="allocatable" all-variants="allocatable" comment="variant labels with all variants allocatable are allocatable &#x235F;" />
    <action disp="valid" comment="catch all (default action) &#x235F;" />
  </rules>
</lgr>