Label Generation Rules Tool

The Label Generation Rules Tool, provided by ICANN, helps with reviewing IDN tables, validating labels against a Label Generation Rule (LGR), and developing and managing an LGR. The tool has three modes, one for each of these purposes.

The IDN Table Review Mode allows generic top-level domain (gTLD) registry operators and registry service providers to review their own IDN tables against reference LGRs published by ICANN org or against IDNA2008 requirements (RFC5890, RFC5891, RFC5892, RFC5893).

The Basic Mode validates labels with a selected reference LGR or Root Zone LGR (RZ-LGR).

The Advanced Mode enables users to create, use, and manage IDN tables in the LGR format. The LGR format (RFC7940) is a machine-readable format that allows for a more precise definition of LGRs, making them easier to compare and reuse.
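As a rough illustration of what "machine-readable" means here, the sketch below reads a tiny hand-written LGR document with Python's standard library. The XML fragment is a made-up example, not a published LGR. The namespace and the `char` and `var` element names are intended to follow RFC7940 and should be checked against the specification. The helper name `read_repertoire` is hypothetical and is not part of the LGR Tool.

```python
import xml.etree.ElementTree as ET

# Namespace intended to match RFC 7940; verify against the specification.
LGR_NS = {"lgr": "urn:ietf:params:xml:ns:lgr-1.0"}

# A made-up, minimal LGR fragment for illustration only (not a published LGR).
SAMPLE_LGR = """<lgr xmlns="urn:ietf:params:xml:ns:lgr-1.0">
  <data>
    <char cp="0061"/>
    <char cp="0062"/>
    <char cp="00E9">
      <var cp="0065" type="blocked"/>
    </char>
  </data>
</lgr>
"""

def parse_cp(text):
    """Turn a space-separated list of hex code points into a tuple of ints."""
    return tuple(int(part, 16) for part in text.split())

def read_repertoire(xml_text):
    """Return {code point sequence: [(variant sequence, variant type), ...]}."""
    root = ET.fromstring(xml_text)
    repertoire = {}
    for char in root.findall("lgr:data/lgr:char", LGR_NS):
        variants = [
            (parse_cp(var.get("cp")), var.get("type"))
            for var in char.findall("lgr:var", LGR_NS)
        ]
        repertoire[parse_cp(char.get("cp"))] = variants
    return repertoire

repertoire = read_repertoire(SAMPLE_LGR)
for cp, variants in repertoire.items():
    # Print code points in U+XXXX notation to stay ASCII-only on output.
    print(" ".join(f"U+{c:04X}" for c in cp), variants)
```

Because the repertoire and its variants end up in an ordinary dictionary, two such tables can be compared with plain set operations on their keys, which is one way to read "easier to compare and reuse".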

The Label Generation Rules Tool is an open source application. Parties interested in integrating this functionality into their own systems can find the source code on GitHub: lgr-core, lgr-django, munidata, picu.

If you have any questions or feedback about the tool, send an email to IDNprogram@icann.org.

Please take note of the Terms of Use provided specifically for the IDN Table Review function of the LGR Tool:

THE IDN TABLE REVIEW FUNCTION OF THE LGR TOOL COMPARES IDN TABLES WITH REFERENCE LABEL GENERATION RULESETS. THE PURPOSE OF THIS REPORT IS TO ASSIST THE USER IN IDENTIFYING POTENTIAL ISSUES EXIST IN IDN TABLES. THIS IDN TABLE REVIEW TOOL REPORT IS FOR INFORMATION ONLY. IT IS NOT A WARRANTY OR GUARANTEE OF ICANN IDN TABLE REVIEW PROCESS.

In addition, please take note of the Terms of Use provided with the LGR Tool more generally:

THIS SOFTWARE IS PROVIDED BY ICANN AND CONTRIBUTORS "AS IS"' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL ICANN OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Background

ICANN conducted a series of studies on potential issues related to the definition and management of IDN variant TLDs, with the assistance of six case study teams representing the Arabic, Chinese, Cyrillic, Devanagari, Greek and Latin scripts. The Integrated Issues Report identified the need for the following:

  1. A formal specification for representing Label Generation Rules, which can be used to determine valid labels and their variants in different scripts
  2. A tool to process such LGRs

To that end, ICANN participated in the development of RFC7940 - Representing Label Generation Rulesets Using XML, an IETF specification that organizes and represents label generation rules in machine-readable (XML) format. ICANN then developed the LGR Tool to assist in the creation, use and management of label generation rules according to RFC7940.

To simplify the process of validating labels, the validation function was designed to check a label or a set of labels against a selected LGR, and also to detect any collisions and possible mixed-script variants.
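The idea of checking labels against a repertoire and looking for collisions can be sketched in a few lines. This is a simplified illustration, not the LGR Tool's implementation: the repertoire, the variant mapping, and the function names `is_valid`, `index_label`, and `find_collisions` are all assumptions made for the example, and mixed-script detection is left out.

```python
# Toy repertoire and variant mapping; purely illustrative, not from any real LGR.
REPERTOIRE = set("abcdefghijklmnopqrstuvwxyz0123456789-") | {"\u00e9", "\u00e8"}
# Each variant code point maps to one representative code point.
VARIANTS = {"\u00e9": "e", "\u00e8": "e"}  # e-acute and e-grave both map to e

def is_valid(label):
    """A label is valid here if it is non-empty and stays within the repertoire."""
    return len(label) > 0 and all(ch in REPERTOIRE for ch in label)

def index_label(label):
    """Replace each code point by its representative so variants share one key."""
    return "".join(VARIANTS.get(ch, ch) for ch in label)

def find_collisions(labels):
    """Group valid labels by index key; a group with several labels collides."""
    groups = {}
    for label in labels:
        if is_valid(label):
            groups.setdefault(index_label(label), []).append(label)
    return [group for group in groups.values() if len(group) > 1]

# "caf\u00e9" is cafe with e-acute; "na\u00efve" contains i-diaeresis.
labels = ["cafe", "caf\u00e9", "tea", "na\u00efve"]
invalid = [label for label in labels if not is_valid(label)]
collisions = find_collisions(labels)

# ascii() keeps the output printable on any terminal encoding.
print("invalid:", ascii(invalid))
print("collisions:", ascii(collisions))
```

In this toy setup the label with i-diaeresis is rejected because that code point is outside the repertoire, while the two spellings of "cafe" collide because they share the same index key.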

ICANN further developed the IDN Table Review function of the LGR Tool to increase the efficiency of IDN table reviews and to promote their consistency and transparency. The function compares an uploaded IDN table in RFC3743, RFC4290, or RFC7940 format with the reference LGRs and generates a review report in HTML format.

Internationalized Domain Name (IDN)

IDNs are domain names that include characters used in the local representation of languages that are not written with the twenty-six letters of the basic Latin alphabet "a-z". An IDN can contain Latin letters with diacritical marks, as required by many European languages, or may consist of characters from non-Latin scripts such as Arabic or Chinese. Many languages also use other types of digits than the European "0-9". The basic Latin alphabet together with the European-Arabic digits are, for the purpose of domain names, termed "ASCII characters" (ASCII = American Standard Code for Information Interchange). These are also included in the broader range of "Unicode characters" that provides the basis for IDNs.

The "hostname rule" requires that all domain names of the type under consideration here are stored in the DNS using only the ASCII characters listed above, with the one further addition of the hyphen "-". The Unicode form of an IDN therefore requires special encoding before it is entered into the DNS.

The following terminology is used when distinguishing between these forms: A domain name consists of a series of "labels" (separated by "dots"). The ASCII form of an IDN label is termed an "A-label". All operations defined in the DNS protocol use A-labels exclusively. The Unicode form, which a user expects to be displayed, is termed a "U-label". The difference may be illustrated with the Hindi word for "test" (परीका), appearing here as a U-label would (in the Devanagari script). A special form of "ASCII compatible encoding" (abbreviated ACE) is applied to this to produce the corresponding A-label: xn--11b5bs1di.

A domain name that only includes ASCII letters, digits, and hyphens is termed an "LDH label". Although the definitions of A-labels and LDH-labels overlap, a name consisting exclusively of LDH labels, such as "icann.org", is not an IDN.
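The relationship between a U-label and its A-label can be sketched with Python's built-in `punycode` codec. This is a minimal sketch under stated assumptions: it applies only the ASCII-compatible encoding with the `xn--` prefix and performs none of the IDNA2008 validity checks, and the function names and the German example label are illustrative choices that do not come from the text above.

```python
import string

# Letters, digits, and hyphen: the characters allowed by the "hostname rule".
LDH = set(string.ascii_letters + string.digits + "-")

def is_ldh_label(label):
    """True if the label consists only of ASCII letters, digits, and hyphens."""
    return len(label) > 0 and all(ch in LDH for ch in label)

def to_a_label(u_label):
    """Encode a U-label as 'xn--' plus its Punycode form (no IDNA validation)."""
    if is_ldh_label(u_label):
        return u_label  # plain LDH label, nothing to encode
    return "xn--" + u_label.encode("punycode").decode("ascii")

def to_u_label(a_label):
    """Decode an 'xn--' label back to its Unicode form."""
    if a_label.lower().startswith("xn--"):
        return a_label[4:].encode("ascii").decode("punycode")
    return a_label

def to_ascii_name(name):
    """Convert a dotted domain name label by label."""
    return ".".join(to_a_label(label) for label in name.split("."))

u_label = "b\u00fccher"            # "bucher" with u-umlaut, a common test label
a_label = to_a_label(u_label)
print(a_label)                      # xn--bcher-kva
print(to_ascii_name("b\u00fccher.example"))
print(to_u_label(a_label) == u_label)
```

A name such as "icann.org" passes through `to_ascii_name` unchanged, since each of its labels is already an LDH label.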