This file specifies a reference set of Label Generation Rules for Korean, using a limited repertoire as appropriate for a second-level domain.
All references converge on the 26 basic ASCII Latin letters (a to z) and the 11,172 Hangul syllables contained in Unicode since version 2.0. These Hangul syllables are sometimes called Johab, a name that originates from the standard in which they were first defined, KS C 5601-1992, and from the encoding used to represent them in that standard. One part of that standard also defines a subset (known as Wansung) consisting of 2,350 Hangul syllables. KS C 5601-1992 later became KS X 1001:2004.
The text in [700] recommends using only the 2,350 Wansung code points, but given the large deployment of platforms supporting the full Johab repertoire, this recommendation is considered unnecessary in the context of this LGR.
There is no established practice of allowing Korean ideographs (Hanja), derived from Chinese ideographs (Hanzi), in IDNA labels. Hanja characters are rarely used in Korea (North or South). It therefore does not seem necessary to add them to a second-level reference LGR at this point.
Unlike many other non-Latin second-level reference LGRs, the Korean LGR includes the basic ASCII Latin set (a to z) because it is common practice in Korean text to mix Hangul and ASCII. Including these letters therefore does not create confusability or additional security risks in the context of a second-level LGR for the Korean language. Their inclusion is also supported by current IDNA practice; see [700].
None.
None.
None.
This LGR defines no named character classes.
Default rules:
Hyphen Restrictions (no leading or trailing hyphen, and no "--" in the third and fourth positions). These restrictions are described in section 4.2.3.1 of RFC5891: http://tools.ietf.org/html/rfc5891. They are implemented here as a context rule on U+002D.
Leading Combining Marks (no leading combining mark). This rule is described in section 4.2.3.2 of RFC5891: http://tools.ietf.org/html/rfc5891.
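The two default rules above can be sketched in Python as follows. This is an illustration of the RFC5891 restrictions only, not the context rules or actions as encoded in this LGR; the function names are hypothetical, and a combining mark is assumed to be any character whose Unicode General_Category begins with "M".

```python
import unicodedata


def violates_hyphen_restrictions(label: str) -> bool:
    """Check the hyphen restrictions of RFC5891, section 4.2.3.1.

    A label violates them if it starts or ends with "-" or has "--"
    in the third and fourth character positions.
    """
    if label.startswith("-") or label.endswith("-"):
        return True
    # Slice [2:4] covers the third and fourth characters (1-based).
    return label[2:4] == "--"


def has_leading_combining_mark(label: str) -> bool:
    """Check the restriction of RFC5891, section 4.2.3.2.

    Assumption: a combining mark is a character in General_Category
    Mn, Mc, or Me.
    """
    return bool(label) and unicodedata.category(label[0]).startswith("M")
```

For example, "한-글" passes both checks, whereas "ab--cd" violates the hyphen restriction and a label beginning with U+0301 has a leading combining mark.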
Actions included are the default actions for LGRs as well as those needed to invalidate labels with misplaced combining marks.
This reference LGR for Korean for the 2nd Level has been developed by Michel Suignard and Asmus Freytag, verified in expert reviews by Lu Qin and Wil Tan, and based on multiple open public consultations.
General references for the language:
Wikipedia: "Korean language", https://en.wikipedia.org/wiki/Korean_language
Omniglot: "Korean", http://www.omniglot.com/writing/korean.htm
In the listing of the repertoire by code point, references starting from [0] refer to the version of the Unicode Standard in which the corresponding code point was initially encoded. Other references (starting from [100]) document usage of code points. For more details, see the Table of References below.
]]>