This document specifies an element LGR for a specific script that forms part of an integrated set of Label Generation Rules for the Root Zone. For more details on the Root Zone LGR and its development, see "Root Zone Label Generation Rules - LGR-1: Overview and Summary", Integration Panel, 24 February 2016 [LGR-1].
For more details on this element LGR and its development see TF-AIDN, "Proposal for Arabic Script Root Zone LGR", Version 3.4, 2015 November 18 [Proposal].
The repertoire for this element LGR for the Arabic script is based on Section 3.2 in [Proposal] and only includes code points used by languages that are actively written in the Arabic script. It excludes code points for which TF-AIDN was unable to find sufficient evidence of use (see Appendix F in [Proposal]). The repertoire is based on [MSR-2], which is a subset of Unicode 6.3 [Unicode 6.3].
This LGR does not include combining marks or code point sequences. All combining marks have been excluded for these reasons:
First, they can significantly overproduce and would require additional rules to contain them effectively, complicating the design.
Second, even where they are required for some languages, they are optional for others.
Third, excluding them also circumvents the issue raised by [IAB].
As part of the Root Zone, this LGR includes neither digits nor the HYPHEN-MINUS.
For further details, see Section 3.2 "Code point repertoire included", in [Proposal].
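The class-level exclusions described above (combining marks, digits, and the HYPHEN-MINUS) can be illustrated with a minimal Python sketch. The function name `excluded_code_points` is hypothetical and not part of this LGR. The sketch uses general categories from whatever Unicode version ships with the Python runtime rather than Unicode 6.3, and it does not check membership in the actual repertoire, which is defined only by the listing.

```python
import unicodedata

HYPHEN_MINUS = "\u002D"


def excluded_code_points(label):
    """Return one reason string per code point that falls in a class
    this LGR excludes: combining marks, digits, or HYPHEN-MINUS.

    Illustrative only: it does NOT test whether the remaining code
    points are in the repertoire.
    """
    problems = []
    for ch in label:
        cp = "U+{:04X}".format(ord(ch))
        category = unicodedata.category(ch)
        if category.startswith("M"):      # Mn, Mc, Me: combining marks
            problems.append(cp + ": combining mark")
        elif category == "Nd":            # decimal digits of any script
            problems.append(cp + ": digit")
        elif ch == HYPHEN_MINUS:
            problems.append(cp + ": HYPHEN-MINUS")
    return problems
```

For example, a label containing U+064E ARABIC FATHA would be reported as containing a combining mark, while a label made only of base letters yields an empty list.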
Each code point or range is tagged with the script or scripts that the code point is used with, and with one or more references documenting sufficient justification for inclusion in the repertoire; see "References" below.
This LGR includes "blocked" and "allocatable" variants, assigned according to Section 4 "Final recommendation of variants for Top Level Domains (TLDs)" in [Proposal]. These recommendations balance the desire to minimize the number of possible allocatable variants with the need to keep the definition of variants simple. See also the comments given in the listing.
This LGR includes Whole Label Evaluation rules specific to the Arabic script. See Section 5 "Whole Label Evaluation (WLE) rules", in [Proposal]. As specified, the rules serve to prevent the mixing of two variants of the same code point within the same label. This has the effect of reducing overproduction of variant labels. See also the comments given for each rule or action.
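The intent of these rules, rejecting a label that mixes two variants of the same code point, can be sketched in Python as follows. This is a simplified illustration, not the rule set itself: the name `mixes_variants` is hypothetical, and the single variant pair in `VARIANT_SETS` is an assumed example. The authoritative variant sets and rules are those given in the listing and in [Proposal].

```python
# Assumed example data for illustration only; the real variant sets
# are defined in the listing and in Section 4 of [Proposal].
VARIANT_SETS = [
    {"\u0643", "\u06A9"},  # assumed pair: ARABIC LETTER KAF / ARABIC LETTER KEHEH
]


def mixes_variants(label):
    """Return True if the label contains two different members of the
    same variant set, i.e. mixes variants of one code point."""
    used = set(label)
    return any(len(used & variants) > 1 for variants in VARIANT_SETS)
```

Under this sketch, a label that uses only one member of a variant set throughout passes, while a label that uses both members is rejected, which is what limits the number of variant labels generated.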
The LGR includes the set of required default WLE rules and actions applicable to the Root Zone and defined in [MSR-2]. They are marked with ⍟.
The proposal for an Arabic Script Root Zone LGR [Proposal], on which this LGR is based, was developed by the Task Force for Arabic Script IDNs [TF-AIDN] on the basis of multiple open public consultations.
For more information, including methodology and contributors, see [Proposal].
In the listing of the repertoire, references [0] to [12] refer to the Unicode Standard versions in which the corresponding code points were initially encoded. References [100] and above correspond to sources justifying the inclusion of the corresponding code points. A single code point or range may have multiple source reference values.
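The numbering convention for references can be expressed as a small Python helper. This is a sketch under stated assumptions: the function name `reference_kind` and the returned labels are hypothetical, and numbers between 13 and 99 are treated as undefined because the text assigns them no meaning.

```python
def reference_kind(ref):
    """Classify a reference number from the repertoire listing."""
    if 0 <= ref <= 12:
        return "unicode-version"  # version in which the code point was first encoded
    if ref >= 100:
        return "source"           # source justifying inclusion in the repertoire
    return "undefined"            # no meaning is assigned by the text
```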
In addition, the following references are cited in this document:
For more details on references [0] and up and on references [100] and up, refer to the Table of References below.
]]>