Submission

1. Are the individual script proposals created by the Generation Panels (available here) correctly integrated into RZ-LGR-5?

Yes

2. In your view, are there any required technical changes to RZ-LGR-5? Please list them with an explanation.

3. Do you have any additional observations or suggested changes?

For Bangla (Bengali) LGR-5, I would like to address five major issues:

Issue 01: In Bangladesh, at least four languages are written in the Bangla script: Hajong, Koda, Sadri, and Manipuri. Of these, Hajong and Sadri have formal educational materials published by the government.

Issue 02: Strictly speaking, neither অ্যা nor এ্যা is a single glyph, and neither is listed in the Unicode code chart. এ্যা is a conjunct (combination) form that is used to transcribe some words incorrectly, producing misspellings such as এ্যান্ড (and) and এ্যাকাডেমি (academy). অ্যা, on the other hand, is the correct form for such transcriptions. The important point is that অ্যা represents a phoneme, not a single vowel (orthographically), even though it is included in neither the traditional phonological chart nor Unicode. The character combination অ্যা could therefore be included with a note, and এ্যা should be excluded from the repertoire.
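To make the difference concrete, the two combinations differ only in their first code point. The following is a minimal Python sketch using the standard `unicodedata` module; the helper name `describe` is illustrative and is not part of any LGR tooling.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# The two combinations discussed in Issue 02, written as escapes
# so the underlying code points are explicit.
correct = "\u0985\u09CD\u09AF\u09BE"     # অ্যা
misspelled = "\u098F\u09CD\u09AF\u09BE"  # এ্যা

for label, seq in (("correct", correct), ("misspelled", misspelled)):
    print(label)
    for code_point, name in describe(seq):
        print("  ", code_point, name)
```

Both sequences share the trailing virama, YA, and vowel sign AA; only the leading independent vowel differs.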

Issue 03: The tag ‘Matra’ is highly ambiguous to Bengali readers and users, who know it as ‘Kar’. If possible, the word ‘kar’ could be added in parentheses: Matra (kar). Please note that for Bengali users the word ‘matra’ refers to the upper line that maintains the paragraph space.

Issue 04: The code points 09DC, 09DD, and 09DF should also be reclaimed for their corresponding characters.
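As background, and as an assumption about why these three code points come up rather than something stated in this comment: each of 09DC, 09DD, and 09DF has a canonical decomposition into a base letter plus the nukta (09BC), and Unicode excludes them from recomposition, so NFC normalization replaces each with a two-code-point sequence. A short Python sketch with the standard `unicodedata` module shows the behavior; the helper name `nfc_codepoints` is illustrative.

```python
import unicodedata

def nfc_codepoints(ch):
    """Return the code points of ch after NFC normalization."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", ch)]

# The three code points named in Issue 04.
for cp in (0x09DC, 0x09DD, 0x09DF):
    ch = chr(cp)
    print(f"U+{cp:04X}", unicodedata.name(ch), "->", nfc_codepoints(ch))
```

Any rule that requires labels to be in NFC would therefore see the base letter followed by the nukta rather than the precomposed code point.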

Issue 05: In the provided document, some strings are shown as valid, such as এ্যাৎ, র্পৎ, and হ্যাঁংচা. These are non-words, and we should address what kinds of impossible words can be generated from the valid characters. We should not allow words of this kind, which will be treated as noise and are grammatically impossible. It would be useful if a word list could be added as an appendix, serving as an exhaustive list of LGR-supported Bangla words.

Summary of Submission

For Bangla (Bengali) LGR-5, I would like to address five major issues:

Issue 01: In Bangladesh, at least four languages are written in the Bangla script: Hajong, Koda, Sadri, and Manipuri. Of these, Hajong and Sadri have formal educational materials published by the government.

Issue 02: Neither অ্যা nor এ্যা is a glyph, and neither is listed in the Unicode code chart. এ্যা is a character or conjunct (combination) that is used to transcribe some words incorrectly (in misspelled form), such as এ্যান্ড and এ্যাকাডেমি, the transliterated forms of ‘and’ and ‘academy’. অ্যা, on the other hand, is the correct form; it is a phoneme, although it is not included in the traditional phonological chart. The character combination অ্যা could therefore be included with a footnote, and এ্যা should be excluded.

Issue 03: The tag ‘Matra’ is highly ambiguous to Bengali readers and users, who know it as ‘Kar’. If possible, the word ‘kar’ could be added in parentheses: Matra (kar). Please note that for Bengali users the word ‘matra’ refers to the upper line that maintains the paragraph space.

Issue 04: The code points 09DC, 09DD, and 09DF should also be reclaimed for their corresponding characters.

Issue 05: In the provided document, some strings are shown as valid, such as এ্যাৎ, র্পৎ, and হ্যাঁংচা. These are non-words. If we allow these, could we collect and publish an exhaustive list of the Bangla words that the LGR supports?
