﻿<?xml version="1.0" encoding="utf-8"?>
<lgr xmlns="urn:ietf:params:xml:ns:lgr-1.0">
  <meta>
    <version comment="Root Zone LGR for the Bengali (Bangla) Script">6</version>
    <date>2025-09-23</date>
    <language>und-Beng</language>
    <scope type="domain">.</scope>
    <unicode-version>16.0.0</unicode-version>
    <description type="text/html"><![CDATA[
    <h1>Root Zone Label Generation Rules for the Bengali (Bangla) Script</h1>

    <h2>Overview</h2>
    <p>This file contains Label Generation Rules (LGR) for the Bengali (Bangla) script for the Root Zone. 
    This LGR covers Assamese, Bengali, Manipuri and a number of other languages written with the Bengali script. 
    For more details on this LGR and additional background on the script, see “Proposal for a Bengali Script Root 
    Zone Label Generation Ruleset (LGR)” [Proposal-Bengali].
    This file is one of a set of LGR files that together form an integrated LGR for the DNS Root Zone [RZ-LGR-6]. 
    The format of this file follows [RFC 7940].</p>

    <h2>Repertoire</h2>
    <p>The repertoire contains 61 code points for letters, as well as 9 code point sequences, for a total of 70 repertoire elements.
    Out of the nine sequences: two sequences override a WLE constraint; four sequences were defined 
    for in-script variants; and the other three sequences were defined to restrict U+09BC NUKTA from appearing 
    in any context other than these sequences. Accordingly, while U+09BC is not 
    listed by itself, it brings the total of distinct code points to 62.
    For more detail, see Section 5, “Repertoire” in [Proposal-Bengali].</p>

    <p>Note that the code points U+09DC, 09DD and U+09DF are not in Normalization Form NFC and thus not
    PVALID under IDNA2008. Before performing lookup, any user agent accepting these code points will normalize 
    them into the equivalent sequences with an explicit U+09BC Nukta code point. Accordingly, this LGR 
    does not reference these code points, but instead includes only the sequences.<p>
    
      <p>The repertoire is contained in [MSR-6], which is a subset of [Unicode 16.0.0].</p>

      <p>As part of the Root Zone, this LGR includes neither decimal digits nor the HYPHEN-MINUS.</p>

      <p><b>Repertoire Listing:</b> Each code point or range is tagged with the script or scripts with which the code point is used and one or more other character categories. For each repertoire element, 
  one or more references document sufficient justification for inclusion in the repertoire; see the <a href="#ref_desc_sec_References">“References”</a> below. 
    For code points that are part of the repertoire, comments identify the languages using the code point along with their [EGIDS] level.</p>

    <p>Code points outside the Bengali script repertoire that are listed in this file are targets
      for out-of-repertoire variants and are identified by a reflexive (identity)
      variant of type “out-of-repertoire-var”. They do not form part of the
      repertoire.</p>

    <h2>Variants</h2>
    <p>This LGR defines in-script variants and cross-script variants as described in Section 6, “Variants”, in [Proposal-Bengali].
    There are three in-script variants: two sequence sets and one set for variation of RA. See Section 6.1 of [Proposal-Bengali].
    There are six cross-script variants: two sets with Gurmukhi and four sets with Devanagari. 
    See also Section 6.2 of [Proposal-Bengali].
    </p>

    <p><b>Variant Disposition:</b> The in-script variant pair U+09B0 / U+09F0 is of type “allocatable”, thus allowing
    access to either user community. All other variants are of type “blocked”, making labels that differ only 
    by these variants mutually exclusive:  whichever label containing either of these variants is chosen earlier, 
    the other one equivalent variant label should be blocked. There is no preference among these variants.</p>

  <p><b>Context Rules for Variants:</b> The Halant is only a variant at the end of a label, when it does not partake in forming a conjunct.</p>

      <p>The specification of variants in the Root Zone LGR follows the guidelines in [RFC 8228].</p>

    <h2>Character Classes</h2>
    <p><b>Consonants:</b> All consonants contain an implicit vowel. More 
       details in Section 3.3.1, “The Consonants” of 
       [Proposal-Bengali].</p>

    <p><b>Hasanta:</b> A special sign is needed whenever the 
       implicit vowel in the preceding consonant is stripped off. This symbol is also known as the Halant or “Virama”. The nominal rendering of the Hasanta is visible only at the end of the label. More 
       details in Section 3.3.2, “The Implicit Vowel Killer: Hasanta” of 
       [Proposal-Bengali].</p>
    
    <p><b>Vowels and Kar (Matra):</b> Separate symbols exist for all “Swara” or vowels in Bengali, which are pronounced independently 
       either at the beginning of the word or after another vowel or consonant sound. To indicate a vowel sound 
       other than the implicit one, a vowel sign  (Kar) is attached to the consonant, analogous to Matra in other Neo-Brahmi scripts. More details in Section 
       3.3.3, “ Vowels” of 
       [Proposal-Bengali].</p>

    <p><b>Anusvara:</b> The Anusvara represents a homorganic nasal. It replaces a conjunct group of a Nasal 
       Consonant+Halant+Consonant belonging to that particular barga or set. Before a non-barga consonant, 
       the anusvara represents a nasal sound. More details in Section 3.3.4, “The Anusvara” of 
       [Proposal-Bengali].</p>

    <p><b>Candrabindu:</b> Candrabindu denotes nasalization of the preceding vowel as in চাঁদ /cãd/ “moon” 
       (U+099A U+09BE U+0981 U+09A6). This sign with a dot inside the half-moon mark is used as nasalization 
       marker in many Indian scripts. More details in Section 3.3.5, “Nasalization: Candrabindu” of 
       [Proposal-Bengali].</p>
    
    <p><b>Visarga and Avagraha:</b> The Visarga U+0983 is frequently used in Bengali  loanwords borrowed from 
       Sanskrit and represents a sound very close to /h/. More details in Section 3.3.7, “Visarga and Avagraha” of 
       [Proposal-Bengali].</p>

    <p><b>Ya-phala:</b> There are two instances in Bangla where a Hasanta is preceded by a full vowel (U+0985 BENGALI LETTER A and U+098F BENGALI LETTER E). 
       More details in Section 3.3.9, “Use of Ya-phala” of 
       [Proposal-Bengali].</p>
        
    <p><b>Ra-phala and Ref Sequences:</b> RA+Hasanta (Repha or Ra-phala sequences).
       More details in Section 3.3.10, “Ra-phala and Ref Sequences” of 
       [Proposal-Bengali].</p>
        
    <p><b>Nukta:</b> Nukta is not listed by itself in the repertoire; it is only included in three sequences. 
        More details in Section 3.3.6, “Nukta” of 
        [Proposal-Bengali].</p>

    <p><b>Zero Width Non-joiner (ZWNJ) and Zero Width Joiner (ZWJ)</b>: These are not included in the repertoire. 
        More details in Section 3.3.8, “Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)” of 
        [Proposal-Bengali].</p>

    <h2>Whole Label Evaluation (WLE) and Context Rules</h2>

    <h3>Default Whole Label Evaluation Rules and Actions</h3>
    <p>The LGR includes the set of required default WLE rules and actions applicable to 
        the Root Zone and defined in [MSR-6]. They are marked with &#x235F;. 
      The default prohibition on leading combining marks is equivalent to ensuring that 
      a label only starts with a consonant or vowel. 
      The actions compute a label disposition based on WLE rules or variant mapping types.</p>

    <h3>Bengali-specific Rules</h3>
    <p>These rules have been formulated as context rules suitable for adoption into an LGR specification.</p>
    <p>The following symbols are used in the WLE rules:</p>
    <ul>
        <li>C    →    Consonant </li>
        <li>M    →    Kar (Matra) </li>
        <li>V    →    Vowel </li>
        <li>B    →    Onushshar (Anusvara)  </li>
        <li>X    →    Bisarga (Visarga)  </li>
        <li>D    →    Candrabindu  </li>
        <li>H    →    Hasanta (Halant)</li>
        <li>Z    →    KhandaTa </li>
        <li>P    →    Ra-Hasanta</li>
        <li>S    →    (a/e) Ya-phala</li>
    </ul>
    
    <p>The rules are:</p>
     <ul>
        <li>1. C: C is a set of C and CN where CN is the set of normalized forms of {ড়,ঢ়,য়}</li> 
        <li>2. H: must be preceded by C</li>
        <li>3. M: must be preceded by C</li>
        <li>4. D: must be preceded by any of V, C, M</li>
        <li>5. X: must be preceded by any of V, C, M, D</li>
        <li>6. B: must be preceded by any of V, C, M, D</li>
        <li>7. Z: must be preceded by any of V, C, M, D, B, X, P</li>
        <li>8. V: CANNOT be preceded by H</li>
        <li>9. S: CANNOT be preceded by H</li>
        <li>10. U+09B0 and U+09F0 CANNOT be mixed in the same label</li>
     </ul>
    
     <p>More details in Section 7, “Whole Label Evaluation Rules (WLE)” of [Proposal-Bengali].</p>

    <p>The following context rule is used for variants of Halant:</p>
    <ul>
        <li>Variant is not defined unless it is followed by end of label</li>
    </ul>
    
    <h2>Methodology and Contributors</h2>
    
      <p>The Root Zone LGR for the Bengali (Bangla) Script was developed by the Neo-Brahmi Generation Panel (NBGP), the members 
     of which have experience in linguistics and computational linguistics in a wide variety of languages
     written with Neo-Brahmi scripts. Under the Neo-Brahmi Generation Panel, there are 
    nine scripts belonging to separate Unicode blocks. Each of these scripts has been assigned a 
    separate LGR, with the Neo-Brahmi GP ensuring that the fundamental philosophy behind building 
    each LGR is in sync with all other Brahmi-derived scripts. For further details on methodology and contributors, 
     see Sections 4 and 8 in [Proposal-Bengali], as well as [RZ-LGR-6-Overview].</p>

      <section id="change_history">
    <h3>Changes from RZ LGR-5</h3> 
    <p>In RZ LGR-6 the following changes were made:</p>
    <ul>
      <li>
        A clerical error has been fixed: the character class for U+0994 Letter AU has been
        corrected to Vowel with a corresponding context rule. From RZ LGR-6, affected
        labels are now reported as valid in accordance with [Proposal-Bengali].
      </li>
      <li>
        Two missing cross-script variants were added for Candrabindu and Hasanta and their Devanagari counterparts. These code points
        were mistakenly believed to not participate in the formation of possible
        variant labels.
      </li>
    </ul>
    <p>For the prior version see [RZ-LGR-5-Beng].</p>   
    </section>

    <h2>References</h2> 

    <p>The following general references are cited in this document:</p>
    <dl class="references">

     <dt>[EGIDS]</dt>
     <dd>Lewis and Simons, “EGIDS: Expanded Graded Intergenerational Disruption Scale,”
      documented in [SIL-Ethnologue] and summarized here:
      https://en.wikipedia.org/wiki/Expanded_Graded_Intergenerational_Disruption_Scale_(EGIDS)</dd>

    <dt>[MSR-6]</dt>
     <dd>Integration Panel, “Maximal Starting Repertoire — MSR-6 Overview and Rationale”, 23 September 2025,
  https://www.icann.org/en/system/files/files/msr-6-overview-23sep25-en.pdf</dd>

    <dt>[Proposal-Bengali]</dt> 
    <dd>Neo-Brahmi Generation Panel, “Proposal for a Bangla (Bengali) Script Root Zone Label 
         Generation Rule-Set (LGR)”, 20 May 2020 (PDF), https://www.icann.org/en/system/files/files/proposal-bangla-lgr-20may20-en.pdf
    </dd>

    <dt>[RFC 7940]</dt>
     <dd>Davies, K. and A. Freytag, “Representing Label Generation Rulesets Using XML”, 
     RFC 7940, August 2016, https://www.rfc-editor.org/info/rfc7940</dd> 

     <dt>[RFC 8228]</dt>
     <dd>A. Freytag, “Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels”, RFC 8228, August 2017,
    https://www.rfc-editor.org/info/rfc8228</dd>

     <dt>[RZ-LGR-5-Beng]</dt>
    <dd>ICANN, Root Zone Label Generation Rules for the Bengali (Bangla) Script (und-Beng), Version 5, 26 May 2022 (XML)
     https://www.icann.org/sites/default/files/lgr/rz-lgr-5-bengali-script-26may22-en.xml <br/>
     <i>non-normative HTML presentation: https://www.icann.org/sites/default/files/lgr/rz-lgr-5-bengali-script-26may22-en.html</i></dd>
    
     <dt>[RZ-LGR-6]</dt>
     <dd>Integration Panel, “Root Zone Label Generation Rules (RZ-LGR-6)”, 23 September 2025 (XML), https://www.icann.org/sites/default/files/lgr/rz-lgr-6-common-23sep25-en.xml <br/>
     <i>non-normative HTML presentation: https://www.icann.org/sites/default/files/lgr/rz-lgr-6-common-23sep25-en.html</i></dd>
     
     <dt>[RZ-LGR-6-Overview]</dt>
     <dd>Integration Panel, “Root Zone Label Generation Rules (RZ LGR-6): Overview and Summary”, 23 September 2025, https://www.icann.org/sites/default/files/lgr/rz-lgr-6-overview-23sep25-en.pdf</dd>
     <dt>[SIL-Ethnologue]</dt>
     <dd>David M. Eberhard, Gary F. Simons &amp; Charles D. Fennig (eds.). 2021.
     Ethnologue: Languages of the World, Twenty fourth edition. Dallas, Texas: SIL
     International. Online version available as https://www.ethnologue.com</dd>
     <dt>[Unicode 16.0.0]</dt>
     <dd>
     The Unicode Consortium. The Unicode Standard, Version 16.0.0, (South San Francisco: The Unicode Consortium, 2024. ISBN 978-1-936213-34-4)
     https://www.unicode.org/versions/Unicode16.0.0/
     </dd>
     </dl>
      <p>For references consulted, particularly in designing the repertoire for the Bengali script for the Root Zone, 
  please see details in the <a href="#table_of_references">Table of References</a> below.
      References [0] and [7] refer to the Unicode Standard versions in which the
      corresponding code points were initially encoded. References [101] and above correspond to sources
      given in [Proposal-Bengali] justifying the inclusion of the corresponding code points. Entries in the table may have multiple source reference values.</p>
 ]]></description>
    <references>
      <reference id="0" comment="Any code point originally encoded in Unicode Version 1.1">The Unicode Standard, Version 1.1</reference>
      <reference id="7" comment="Any code point originally encoded in Unicode Version 4.1">The Unicode Standard, Version 4.1</reference>
      <reference id="101">Wikipedia, Bengali alphabet, accessed on 2017-11-25 https://en.wikipedia.org/wiki/Bengali_alphabet</reference>
      <reference id="102">Bengali alphabet for Manipuri, found in Omniglot, “Manipuri (Meeteilon/ Meithei)”, accessed on 20.10.2019 https://www.omniglot.com/writing/manipuri.htm</reference>
      <reference id="103">Omniglot, “Assamese (অসমীয়া)”, accessed on 2020-04-28 https://www.omniglot.com/writing/assamese.htm</reference>
    </references>
  </meta>
  <data>
    <char cp="0901" tag="sc:Deva" ref="0" comment="Not part of repertoire">
      <var cp="0901" type="out-of-repertoire-var" comment="Out-of-repertoire" />
      <var cp="0981" type="blocked" comment="Cross-script homoglyph" />
    </char>
    <char cp="092E" tag="sc:Deva" ref="0" comment="Not part of repertoire">
      <var cp="092E" type="out-of-repertoire-var" comment="Out-of-repertoire" />
      <var cp="09AE" type="blocked" comment="Cross-script homoglyph" />
      <var cp="0A38" type="blocked" comment="Cross-script homoglyph" />
    </char>
    <char cp="093F" tag="sc:Deva" ref="0" comment="Not part of repertoire">
      <var cp="093F" type="out-of-repertoire-var" comment="Out-of-repertoire" />
      <var cp="09BF" type="blocked" comment="Cross-script homoglyph" />
      <var cp="0A3F" type="blocked" comment="Cross-script homoglyph" />
    </char>
    <char cp="094D" tag="sc:Deva" ref="0" comment="Not part of repertoire">
      <var cp="094D" type="out-of-repertoire-var" comment="Out-of-repertoire" />
      <var cp="09CD" when="at-end-of-label" type="blocked" comment="Cross-script homoglyph" />
    </char>
    <char cp="0981" when="follows-only-V-C-M" tag="Candrabindu sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)">
      <var cp="0901" type="blocked" comment="Cross-script homoglyph" />
    </char>
    <char cp="0982" when="follows-only-V-C-M-D" tag="Anusvara sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="0983" when="follows-only-V-C-M-D" tag="sc:Beng Visarga" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="0985" not-when="follows-H" tag="sc:Beng Vowel" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="0985 09CD 09AF 09BE" not-when="follows-H" comment="Ya-Phalaa (s1): Bangla (1), Assamese (2)" />
    <char cp="0986" not-when="follows-H" tag="sc:Beng Vowel" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="0987" not-when="follows-H" tag="sc:Beng Vowel" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="0988" not-when="follows-H" tag="sc:Beng Vowel" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="0989" not-when="follows-H" tag="sc:Beng Vowel" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="098A" not-when="follows-H" tag="sc:Beng Vowel" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="098B" not-when="follows-H" tag="sc:Beng Vowel" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="098F" not-when="follows-H" tag="sc:Beng Vowel" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="098F 09CD 09AF 09BE" not-when="follows-H" comment="Ya-Phalaa (s2): Bangla (1)" />
    <char cp="0990" not-when="follows-H" tag="sc:Beng Vowel" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="0993" not-when="follows-H" tag="sc:Beng Vowel" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="0994" not-when="follows-H" tag="sc:Beng Vowel" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="0995" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="0996" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="0997" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="0998" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="0999" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="099A" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="099B" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="099C" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="099D" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="099E" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="099F" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09A0" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09A1" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09A1 09BC" ref="0 101 102 103" comment="09DC is the preferred code point, however it is not available for LGR as per the standards governing this LGR development" />
    <char cp="09A2" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)," />
    <char cp="09A2 09BC" ref="0 101 102 103" comment="09DD is the preferred code point, however it is not available for LGR as per the standards governing this LGR development" />
    <char cp="09A3" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09A4" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09A5" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09A6" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09A7" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09A8" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09A8 09CD 09A5" comment="Bengali variant">
      <var cp="09A8 09CD 09B9" type="blocked" comment="Bengali variant" />
    </char>
    <char cp="09A8 09CD 09B9" comment="Bengali variant">
      <var cp="09A8 09CD 09A5" type="blocked" comment="Bengali variant" />
    </char>
    <char cp="09AA" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09AB" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09AC" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09AD" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09AE" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)">
      <var cp="092E" type="blocked" comment="Cross-script homoglyph" />
      <var cp="0A38" type="blocked" comment="Cross-script homoglyph" />
    </char>
    <char cp="09AF" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09AF 09BC" ref="0 101 102 103" comment="09DF is the preferred code point, however it is not available for LGR as per the standards governing this LGR development" />
    <char cp="09B0" tag="C2 Consonant sc:Beng" ref="0 101 102" comment="Bangla (1), Manipuri (2)">
      <var cp="09F0" type="allocatable" comment="Bengali variant" />
    </char>
    <char cp="09B2" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09B6" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09B7" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09B8" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09B8 09CD 09A5" comment="Bengali variant">
      <var cp="09B8 09CD 09B9" type="blocked" comment="Bengali variant" />
    </char>
    <char cp="09B8 09CD 09B9" comment="Bengali variant">
      <var cp="09B8 09CD 09A5" type="blocked" comment="Bengali variant" />
    </char>
    <char cp="09B9" tag="Consonant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09BE" when="follows-only-C" tag="Matra sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09BF" when="follows-only-C" tag="Matra sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)">
      <var cp="093F" type="blocked" comment="Cross-script homoglyph" />
      <var cp="0A3F" type="blocked" comment="Cross-script homoglyph" />
    </char>
    <char cp="09C0" when="follows-only-C" tag="Matra sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09C1" when="follows-only-C" tag="Matra sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09C2" when="follows-only-C" tag="Matra sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09C3" when="follows-only-C" tag="Matra sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09C4" when="follows-only-C" tag="Matra sc:Beng" ref="0 101 103" comment="Bangla (1), Assamese (2)" />
    <char cp="09C7" when="follows-only-C" tag="Matra sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09C8" when="follows-only-C" tag="Matra sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09CB" when="follows-only-C" tag="Matra sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09CC" when="follows-only-C" tag="Matra sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09CD" when="follows-only-C" tag="Halant sc:Beng" ref="0 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)">
      <var cp="094D" when="at-end-of-label" type="blocked" comment="Cross-script homoglyph" />
    </char>
    <char cp="09CE" when="follows-only-V-C-M-D-B-X-P" tag="Consonant KhandaTa sc:Beng" ref="7 101 102 103" comment="Bangla (1), Manipuri (2), Assamese (2)" />
    <char cp="09F0" tag="C2 Consonant sc:Beng" ref="0 103" comment="Assamese (2)">
      <var cp="09B0" type="allocatable" comment="Bengali variant" />
    </char>
    <char cp="09F1" tag="Consonant sc:Beng" ref="0 102 103" comment="Manipuri (2), Assamese (2)" />
    <char cp="0A38" tag="sc:Guru" ref="0" comment="Not part of repertoire">
      <var cp="092E" type="blocked" comment="Cross-script homoglyph" />
      <var cp="09AE" type="blocked" comment="Cross-script homoglyph" />
      <var cp="0A38" type="out-of-repertoire-var" comment="Out-of-repertoire" />
    </char>
    <char cp="0A3F" tag="sc:Guru" ref="0" comment="Not part of repertoire">
      <var cp="093F" type="blocked" comment="Cross-script homoglyph" />
      <var cp="09BF" type="blocked" comment="Cross-script homoglyph" />
      <var cp="0A3F" type="out-of-repertoire-var" comment="Out-of-repertoire" />
    </char>
  </data>
  <!--Rules section goes here-->
  <rules>
    <!--Character class definitions go here-->
    <class name="C-single" from-tag="Consonant" comment="Any Bengali consonant" />
    <class name="V" from-tag="Vowel" comment="Any Bengali vowel letter" />
    <class name="M" from-tag="Matra" comment="Any Bengali vowel sign (matra)" />
    <class name="H" from-tag="Halant" comment="The Bengali Hasanta (Halant / Virama)" />
    <class name="B" from-tag="Anusvara" comment="The Bengali Onushshar (Anusvara)" />
    <class name="X" from-tag="Visarga" comment="The Bengali Bisarga (Visarga)" />
    <class name="D" from-tag="Candrabindu" comment="The Bengali Candrabindu" />
    <class name="C2" from-tag="C2" comment="Any Bengali consonant from set C2" />
    <!--Whole label evaluation and context rules go here-->
    <rule name="C-RRA" comment="NFC form of BENGALI LETTER RRA">
      <char cp="09A1 09BC" />
    </rule>
    <rule name="C-RHA" comment="NFC form of BENGALI LETTER RHA">
      <char cp="09A2 09BC" />
    </rule>
    <rule name="C-YYA" comment="NFC form of BENGALI LETTER YYA">
      <char cp="09AF 09BC" />
    </rule>
    <rule name="C" comment="Section 7, WLE1: All consonants in the LGR repertoire; single code points and sequences">
      <choice>
        <class by-ref="C-single" />
        <rule by-ref="C-RRA" />
        <rule by-ref="C-RHA" />
        <rule by-ref="C-YYA" />
      </choice>
    </rule>
    <rule name="leading-combining-mark" comment="Default WLE rule matching labels with leading combining marks &#x235F;">
      <start />
      <union>
        <class property="gc:Mn" />
        <class property="gc:Mc" />
      </union>
    </rule>
    <rule name="follows-only-C" comment="Section 7, WLE 2: H: must be preceded by C ; WLE 3: M: must be preceded by C">
      <look-behind>
        <rule by-ref="C" />
      </look-behind>
      <anchor />
    </rule>
    <rule name="follows-only-V-C-M" comment="Section 7, WLE 4: D: must be preceded by any of V, C, M">
      <look-behind>
        <choice>
          <class by-ref="V" />
          <rule by-ref="C" />
          <class by-ref="M" />
        </choice>
      </look-behind>
      <anchor />
    </rule>
    <rule name="follows-only-V-C-M-D" comment="Section 7, WLE 5: X: must be preceded by any of V, C, M, D; WLE 6: B: must be preceded by any of V, C, M, D">
      <look-behind>
        <choice>
          <class by-ref="V" />
          <rule by-ref="C" />
          <class by-ref="M" />
          <class by-ref="D" />
        </choice>
      </look-behind>
      <anchor />
    </rule>
    <rule name="P" comment="Ra-Hasanta, defined for use in WLE-7">
      <class by-ref="C2" />
      <class by-ref="H" />
    </rule>
    <rule name="follows-only-V-C-M-D-B-X-P" comment="Section 7, WLE 7: Khanda Ta must be preceded by V, C, M, D, B, X, P">
      <look-behind>
        <choice>
          <class by-ref="V" />
          <rule by-ref="C" />
          <class by-ref="M" />
          <class by-ref="D" />
          <class by-ref="B" />
          <class by-ref="X" />
          <rule by-ref="P" />
        </choice>
      </look-behind>
      <anchor />
    </rule>
    <rule name="follows-H" comment="Section 7, WLE 8: V cannot be preceded by H, WLE 9: S cannot be preceded by H">
      <look-behind>
        <class by-ref="H" />
      </look-behind>
      <anchor />
    </rule>
    <rule name="no-mix-09B0-09F0" comment="Section 7, WLE 10: U+09B0 and U+09F0 cannot be mixed.">
      <choice>
        <rule>
          <char cp="09B0" />
          <any count="0+" />
          <char cp="09F0" />
        </rule>
        <rule>
          <char cp="09F0" />
          <any count="0+" />
          <char cp="09B0" />
        </rule>
      </choice>
    </rule>
    <rule name="at-end-of-label" comment="Matches code point or sequence at the end of the label &#x235F;">
      <anchor />
      <look-ahead>
        <end />
      </look-ahead>
    </rule>
    <!--Action elements go here - order defines precedence-->
    <action disp="invalid" match="leading-combining-mark" comment="labels with leading combining marks are invalid &#x235F;" />
    <action disp="invalid" match="no-mix-09B0-09F0" comment="WLE 10: U+09B0 and U+09F0 cannot be mixed." />
    <action disp="invalid" any-variant="out-of-repertoire-var" comment="any variant label with a code point out of repertoire is invalid &#x235F;" />
    <action disp="blocked" any-variant="blocked" comment="any variant label containing blocked variants is blocked &#x235F;" />
    <action disp="allocatable" all-variants="allocatable" comment="variant labels with all variants allocatable are allocatable &#x235F;" />
    <action disp="valid" comment="catch all (default action) &#x235F;" />
  </rules>
</lgr>