Overview

This file contains Label Generation Rules (LGR) for the Oriya script as would be appropriate for the Root zone. For more details on this proposal see "Proposal for Generation Panel for Oriya Scripts Label Generation Ruleset for the Root Zone [Proposal]". The format of this file follows [RFC 7940].

Repertoire

According to Section 5, "Repertoire" in [Proposal], the Oriya LGR contains 62 unique code points.

The repertoire is based on [MSR], which is a subset of Unicode 6.3 [Unicode 6.3].

Each code point is listed with its associated Glyph, Character Name, Language with EGIDS, Category, and Reference.

Variants

According to Section 6, "Variants" in [Proposal], this LGR defines two sets of cross-script variants with the Myanmar script.

Variant Disposition: All variants are of type "blocked", making labels that differ only by these variants mutually exclusive: once a label containing one of these variants has been chosen, any equivalent variant label is blocked. There is no preference among these variants.
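The effect of the "blocked" disposition can be illustrated with a minimal Python sketch. It is not part of the LGR or of [RFC 7940]: the names variant_key and Registry are hypothetical, and the variant set used in the usage note is a Latin placeholder, not one of the actual Oriya/Myanmar variant sets defined in [Proposal].

```python
def variant_key(label, variant_sets):
    """Map every member of a variant set to one representative.

    Labels that differ only by variants end up with the same key.
    The choice of representative (min) is arbitrary, since the LGR
    states no preference among variants.
    """
    representative = {}
    for group in variant_sets:
        canonical = min(group)
        for cp in group:
            representative[cp] = canonical
    return "".join(representative.get(ch, ch) for ch in label)


class Registry:
    """Hypothetical first-come, first-served allocation table."""

    def __init__(self, variant_sets):
        self.variant_sets = variant_sets
        self.allocated = {}  # variant key -> label that was chosen first

    def apply(self, label):
        key = variant_key(label, self.variant_sets)
        holder = self.allocated.get(key)
        if holder is None:
            self.allocated[key] = label
            return "allocated"
        # Same key, different label: an equivalent variant label.
        return "duplicate" if holder == label else "blocked"
```

With the placeholder variant set {"x", "y"}, Registry([{"x", "y"}]) allocates "axb" on first application and afterwards reports "blocked" for "ayb", regardless of which of the two labels was applied first.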

Character Classes

The basic characters in Oriya are classified into eight main categories. They are Consonants, Vowels, Matra, Halant, Nukta, Visarga, Candrabindu and Anusvara.

Consonant: Oriya is written with a syllabic alphabet in which every consonant has an inherent vowel. Diacritics, which can appear above, below, before or after the consonant they belong to, are used to change the inherent vowel. More details in Section "3.5 Structured consonants" and Section "3.6 Unstructured consonants" of the [Proposal].

Matra sign (Dependent Vowel): It is used to represent a vowel sound that is not inherent to the consonant. Dependent vowels are referred to as "matras". They are always depicted in combination with a single consonant, or with a consonant cluster. More details in Section "3.12 Matra sign: (Dependent Vowel)" of the [Proposal].

Halant: A Halant, also known as Virama, is used after a consonant to "strip" it of its inherent vowel. The Halant form of a consonant is the form produced by adding the Halant, encoded as U+0B4D ( ୍ ) ORIYA SIGN VIRAMA to the nominal shape. A Halant follows all but the last consonant in every Oriya syllable. More details in Section 3.7, "The Implicit Vowel Killer Halant" of the [Proposal].

Nukta: The nukta sign ( ଼) is used in Oriya as it is in many other scripts of South Asia. It is commonly used with "ଡ" U+0B21 and "ଢ" U+0B22. More details in Section "3.8 Nukta" of the [Proposal].

Visarga: U+0B03 ORIYA SIGN VISARGA is frequently used in Sanskrit and represents a sound very close to /h/. More details in Section 3.9, "Visarga & Avagraha" of the [Proposal].

Nasalization:

Candrabindu: Candrabindu denotes nasalization of the preceding vowel or consonant, as in ଅଁଳା /ãala/, the name of a seasonal fruit (U+0B05 U+0B01 U+0B33 U+0B3E). Oriya users commonly use it for writing the words and sounds of the Sanskrit language. More details in Section "3.10 Nasalization: Candrabindu" of the [Proposal].

Anusvara: Anusvara replaces a conjunct group of Nasal Consonant + Halant + Consonant in which the Consonant belongs to the same varga (plosive group) as the nasal. In this position the Anusvara represents a homorganic nasal. Before a non-varga (non-plosive) consonant, the Anusvara represents a nasal sound. More details in Section "3.11 Anusvara" of the [Proposal].

Whole Label Evaluation (WLE) rules

Default Whole Label Evaluation Rules

The LGR includes the set of required default WLE rules and actions applicable to the Root Zone and defined in [MSR]. They are marked with ⍟.

Oriya specific Rules

These rules have been formulated so that they can be adopted for LGR specification.

The following symbols are used in the WLE rules:
C → Consonant
M → Matra
V → Vowel
B → Anusvara
H → Halant
N → Nukta
C1 → Consonants used with Nukta
X → Visarga
D → Candrabindu

The rules are:

1. N: must be preceded only by C1
2. B: must be preceded by V, C, N or M
3. X: must be preceded by V, C, N or M
4. D: must be preceded by V, C, N or M
5. H: must be preceded by C or N
6. M: must be preceded by C or N

More details in Section "7 Whole Label Evaluation Rules (WLE)" of the [Proposal].
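The six context rules above can be sketched as a simple predecessor check. The following Python example is illustrative only, not the normative rule set: the class table CLASS_OF is a small assumed subset of code points rather than the LGR's 62-code-point repertoire, the names classes and violations are hypothetical, and the sketch assumes that every C1 consonant also counts as a C.

```python
# Assumed, illustrative subset of the repertoire (NOT the full LGR table).
CLASS_OF = {
    0x0B01: "D",   # ORIYA SIGN CANDRABINDU
    0x0B02: "B",   # ORIYA SIGN ANUSVARA
    0x0B03: "X",   # ORIYA SIGN VISARGA
    0x0B05: "V",   # ORIYA LETTER A
    0x0B15: "C",   # ORIYA LETTER KA
    0x0B21: "C1",  # ORIYA LETTER DDA (takes nukta)
    0x0B22: "C1",  # ORIYA LETTER DDHA (takes nukta)
    0x0B33: "C",   # ORIYA LETTER LLA
    0x0B3C: "N",   # ORIYA SIGN NUKTA
    0x0B3E: "M",   # ORIYA VOWEL SIGN AA
    0x0B4D: "H",   # ORIYA SIGN VIRAMA (halant)
}

# Rules 1 to 6: classes allowed immediately before each symbol.
ALLOWED_BEFORE = {
    "N": {"C1"},
    "B": {"V", "C", "N", "M"},
    "X": {"V", "C", "N", "M"},
    "D": {"V", "C", "N", "M"},
    "H": {"C", "N"},
    "M": {"C", "N"},
}


def classes(code_point):
    """Return the set of class symbols for a code point (empty if unknown)."""
    symbol = CLASS_OF.get(code_point)
    if symbol is None:
        return set()
    # Assumption: a C1 consonant is also an ordinary consonant C.
    return {"C", "C1"} if symbol == "C1" else {symbol}


def violations(label):
    """List (index, symbol) pairs whose predecessor breaks a rule."""
    problems = []
    previous = set()  # nothing precedes the first code point
    for index, ch in enumerate(label):
        current = classes(ord(ch))
        for symbol in current:
            allowed = ALLOWED_BEFORE.get(symbol)
            if allowed is not None and not (previous & allowed):
                problems.append((index, symbol))
        previous = current
    return problems
```

For the word ଅଁଳା cited earlier (U+0B05 U+0B01 U+0B33 U+0B3E, that is V D C M), violations returns an empty list, whereas U+0B15 U+0B3C (a nukta after a consonant outside C1) is reported as [(1, "N")]. The sketch does not check repertoire membership or the default WLE rules; code points missing from the table simply have no class.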

Overall Development Process and Methodology

The Neo-Brahmi Generation Panel (NBGP) has been formed by members having experience in linguistics and computational linguistics. Under the Neo-Brahmi Generation Panel, there are nine scripts belonging to separate Unicode blocks. Each of these scripts has been assigned a separate LGR; however, the Neo-Brahmi GP ensured that the fundamental philosophy behind building these LGRs is in sync across all Brahmi-derived scripts.

NBGP considered all languages with EGIDS levels 1 to 4 and found that the Oriya script is also used to write languages other than Oriya.

References

References [0] to [6] refer to the Unicode Standard versions in which the corresponding code points were initially encoded. References [101] and up correspond to sources given in [Proposal] for justifying the inclusion of the corresponding code points. A single code point or range may have multiple source reference values.

In addition, the following references are cited in this document:

[MSR]: Integration Panel, "Maximal Starting Repertoire — MSR-4 Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf
[NBGP]: Neo-Brahmi Generation Panel
[Proposal]: Neo-Brahmi Generation Panel, "Proposal for a Oriya Script Root Zone Label Generation Rule-set", 6 March 2019, https://www.icann.org/en/system/files/files/proposal-oriya-lgr-06mar19-en.pdf
[RFC 7940]: Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, http://www.rfc-editor.org/info/rfc7940.
[Unicode 6.3]: The Unicode Consortium. The Unicode Standard, Version 6.3.0, (Mountain View, CA: The Unicode Consortium, 2013. ISBN 978-1-936213-08-5) http://www.unicode.org/versions/Unicode6.3.0/

For more details on references [0] and up and [101] and up, refer to the Table of References below.

The Unicode Standard 1.1
The Unicode Standard 4.0
Omniglot, "Oriya" https://www.omniglot.com/writing/oriya.htm
Wikipedia, "Odia (Oriya) alphabet" https://en.wikipedia.org/wiki/Odia_alphabet
Wikipedia, "Odia language" https://en.wikipedia.org/wiki/Odia_language
Wikipedia, "Oriya (Unicode block)" https://en.wikipedia.org/wiki/Oriya_(Unicode_block)
Odisha State Govt. Primary School Grade 1 e-book "HasaKhela", by Odisha Primary Education Programme Authority http://opepa.odisha.gov.in/website/Download/e-Text-Book/CLass%20I/Hasa%20Khela%20Part%20II/Haso%20Khelo-II-Page-112.pdf