Public Comment
String Similarity Evaluation Data for New gTLD Program: Next Round (Closed)
Category Technical
Requesters ICANN org
Outcome
The ICANN organization appreciates the comments submitted by the community on the String Similarity Evaluation (SSE) Data. ICANN org received 11 submissions. Common themes in the feedback include: (1) support for the transparency and predictability that the SSE Data brings to the string evaluation process, (2) requests that the SSE Tool be made available to the community, and (3) suggestions that the SSE Data have a mechanism for periodic review and updating.
Feedback on the SSE Data itself includes two submissions suggesting potential additional cross-script pairs and two submissions suggesting potential pairs to be removed. No submission requested a specific change to a similarity category, but there was a suggestion to select a more conservative category when applicable. Two submissions suggest editorial updates to the overview document. All comments will be analyzed in consultation with the relevant script experts, and any required updates will be incorporated into the finalized SSE Data.
What We Received Input On
In preparation for the New gTLD Program: Next Round, ICANN org opened a Public Comment proceeding in February 2024 asking for feedback on detailed guidelines for conducting the string similarity review process. The guidelines proposed that a tool and similarity data be used to calculate the potential similarity of applied-for strings (and their variant strings) as a pre-screening step. This pre-screening will assist the SSE Panel in conducting the subsequent independent manual evaluation.
The similarity data is now being published for public comment. Feedback is requested on the proposed similar code points, as well as their proposed degree of similarity. Input is also requested to identify additional code points, if any, that should be included as being similar to each other.
Background
The Generic Names Supporting Organization (GNSO) Final Report on the New gTLD Subsequent Procedures Policy Development Process recommends conducting the String Similarity Evaluation as part of the New gTLD Program: Next Round application evaluation process. The objective of this review is to prevent user confusion and loss of confidence in the DNS resulting from the delegation of similar strings.
To guide the SSE Panel, an initial version of the String Similarity Review Guidelines was developed and published for public comment. Because evaluating applied-for strings and their variant strings could require a large number of comparisons, the guidelines proposed that ICANN develop a pre-screening SSE Tool to assist the independent SSE Panel. The tool would use similarity data across the different scripts, and its output would be made available to the SSE Panel.
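As a rough illustration of what pre-screening with similarity data could look like, the sketch below flags pairs of labels whose code points are position-by-position identical or listed as similar. This is only a minimal sketch under stated assumptions: the names `SIMILAR_PAIRS`, `labels_potentially_similar`, and `prescreen` are hypothetical, the example pairs are illustrative and not taken from the SSE Data, and the actual SSE Tool may use a different comparison method.

```python
from itertools import combinations

# Hypothetical similarity data: unordered pairs of code points treated as similar.
# These pairs are illustrative only and are NOT taken from the published SSE Data.
SIMILAR_PAIRS = {
    frozenset({"\u0061", "\u0430"}),  # Latin small a / Cyrillic small a
    frozenset({"\u006F", "\u03BF"}),  # Latin small o / Greek small omicron
}


def code_points_similar(a: str, b: str) -> bool:
    """Two code points match if they are identical or listed as a similar pair."""
    return a == b or frozenset({a, b}) in SIMILAR_PAIRS


def labels_potentially_similar(x: str, y: str) -> bool:
    """Assumed rule: same length, and every position holds matching code points."""
    return len(x) == len(y) and all(
        code_points_similar(a, b) for a, b in zip(x, y)
    )


def prescreen(labels: list[str]) -> list[tuple[str, str]]:
    """Return every pair of labels flagged for manual review by a panel."""
    return [
        (x, y)
        for x, y in combinations(labels, 2)
        if labels_potentially_similar(x, y)
    ]
```

The number of pairwise comparisons grows quadratically with the number of labels, which is consistent with the guidelines' concern about a large number of comparisons. Flagged pairs are only candidates; in the process described here, the SSE Panel makes the final determination.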
ICANN subsequently gathered SSE Data for the New gTLD Program: Next Round string similarity evaluation. Gathering this data required script-specific knowledge from the respective script experts. The data cover the full repertoire of the Root Zone Label Generation Rules version 6 (RZ-LGR-6).
When two code points could be similar, or are transitively detected as similar based on expert input, they are included in the SSE Data so that potentially visually similar strings are presented to the SSE Panel. The final contention sets will be determined by the SSE Panel after manual review.
Details of the methodology and analysis behind the SSE Data are available in the String Similarity Evaluation Data for the New gTLD: Next Round. The community is requested to read it before reviewing the SSE Data, as it explains how to interpret the data in the XML/HTML files.
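One way to read "transitively detected as similar" is that if A is similar to B and B is similar to C, then A and C are treated as related as well. The sketch below groups code points under that reading using a union-find structure. It is an assumption-laden illustration: the function name `group_similar` is hypothetical, the example pairs in the usage note are not from the SSE Data, and the published methodology document describes how transitivity is actually applied.

```python
def group_similar(pairs):
    """Group code points into sets connected by (possibly indirect) similarity.

    `pairs` is an iterable of (code_point, code_point) tuples, assumed to come
    from expert input. Returns a list of sets, one per connected group.
    """
    parent = {}

    def find(x):
        # Register unseen code points as their own group root.
        parent.setdefault(x, x)
        # Walk up to the root, shortening the path as we go (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for a, b in pairs:
        union(a, b)

    # Collect every code point under its group root.
    groups = {}
    for cp in list(parent):
        groups.setdefault(find(cp), set()).add(cp)
    return list(groups.values())
```

For example, given the illustrative pairs Latin a / Cyrillic а and Cyrillic а / Greek α, the three code points end up in one group even though Latin a and Greek α were never paired directly.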

