Whois Registrant Identification Study, Draft Report

Comment/Reply Periods (*)
Comment Open: 15 February 2013
Comment Close: 9 March 2013
Close Time (UTC): 01:00
Reply Open: 10 March 2013
Reply Close: 31 March 2013
Close Time (UTC): 23:00

Important Information Links
Public Comment Announcement
To Submit Your Comments (Forum Closed)
View Comments Submitted
Report of Public Comments

Brief Overview
Originating Organization: GNSO
Categories/Tags: Policy Processes
Purpose (Brief): This study, conducted by NORC at the University of Chicago, uses Whois to classify entities that register gTLD domain names, including natural persons, legal persons, and Privacy/Proxy service providers. Using associated Internet content, the study then classifies the entities using those domains and any potentially commercial activities observed. Findings will help the community understand how Registrants identify themselves in Whois.
Current Status: This Public Comment solicitation represents an opportunity for the community to consider the study results detailed in this draft report, ask questions, and request clarifications. In parallel, ICANN and NORC will conduct Webinars to facilitate feedback by summarizing this study's purpose, methodology, key findings, and conclusions.
Next Steps: NORC will consider all comments submitted to this Public Comment forum during the comment period, incorporate any needed clarifications, and then publish a final version of this WHOIS Registrant Identification study report. Afterwards, the GNSO Council will use this report to inform future Whois policy-making.
Staff Contact: Barbara Roseman Email:
Detailed Information
Section I: Description, Explanation, and Purpose
The WHOIS Registrant Identification Study uses Whois to classify entities that register gTLD domain names, including natural persons, legal persons, and Privacy/Proxy service providers. Using associated Internet content, it then classifies the entities using those domains and any potentially commercial activities observed. NORC at the University of Chicago was selected to conduct this study and has issued a draft report, which is now available for public comment.
Section II: Background

As part of its effort to develop a comprehensive understanding of the gTLD Whois system, the GNSO Council expressed an interest in conducting an in-depth study of how entities that register and use gTLD domain names identify themselves in Whois. At the GNSO's request, ICANN issued an RFP in October 2009 describing a study to examine the extent to which domains used by legal persons or for commercial purposes (1) are not clearly identified as such in Whois Registrant data and (2) are correlated to use of Privacy and Proxy services.

After considering RFP responses received in late 2009 from researchers willing to undertake that Registrant Identification study, along with significant concerns raised by GNSO Council members regarding the above-stated study hypothesis, the GNSO Council decided to revamp the study's goals and approach.

In May 2010, the GNSO Council approved a revised study and awarded it to NORC at the University of Chicago. This exploratory study, detailed in this draft report, seeks a more foundational understanding of the types of entities and kinds of activities observed in gTLD domains, including (but not exclusively focused on) those registered using Privacy or Proxy services. Accordingly, the categories of entities and activities to be studied were not pre-determined, but rather generated as researchers examined representative samples of active websites and related Whois data.

Study findings are intended to provide the raw data needed to understand how entities that register and use gTLD domain names identify themselves in Whois, including (but not limited to) domains registered using Privacy/Proxy services and domains engaged in potentially commercial activity. This empirical data will not only enable the Council to respond to GAC questions, but will also create a baseline for evaluating potential Whois policy changes.

Section III: Document and Resource Links
Section IV: Additional Information
Additional WHOIS studies are now being conducted at the request of the GNSO Council, as summarized by:

(*) Comments submitted after the posted Close Date/Time are not guaranteed to be considered in any final summary, analysis, reporting, or decision-making that takes place once this period lapses.

Domain Name System
Internationalized Domain Name (IDN): IDNs are domain names that include characters used in the local representation of languages that are not written with the twenty-six letters of the basic Latin alphabet "a-z". An IDN can contain Latin letters with diacritical marks, as required by many European languages, or may consist of characters from non-Latin scripts such as Arabic or Chinese. Many languages also use other types of digits than the European "0-9". The basic Latin alphabet together with the European-Arabic digits are, for the purpose of domain names, termed "ASCII characters" (ASCII = American Standard Code for Information Interchange). These are also included in the broader range of "Unicode characters" that provides the basis for IDNs.

The "hostname rule" requires that all domain names of the type under consideration here are stored in the DNS using only the ASCII characters listed above, with the one further addition of the hyphen "-". The Unicode form of an IDN therefore requires special encoding before it is entered into the DNS.

The following terminology is used when distinguishing between these forms. A domain name consists of a series of "labels" (separated by "dots"). The ASCII form of an IDN label is termed an "A-label". All operations defined in the DNS protocol use A-labels exclusively. The Unicode form, which a user expects to be displayed, is termed a "U-label". The difference may be illustrated with the Hindi word for "test", परीका, appearing here as a U-label would (in the Devanagari script). A special form of "ASCII compatible encoding" (abbreviated ACE) is applied to this to produce the corresponding A-label: xn--11b5bs1di. A domain name that only includes ASCII letters, digits, and hyphens is termed an "LDH label". Although the definitions of A-labels and LDH-labels overlap, a name consisting exclusively of LDH labels is not an IDN.
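
The U-label to A-label conversion described in the glossary entry above can be illustrated with a short Python sketch. It uses only the standard library's "punycode" codec and prepends the "xn--" ACE prefix by hand. The helper names to_a_label and to_u_label are hypothetical, chosen for this illustration. The sketch also assumes the input label is already valid and normalized: it skips the validation and mapping steps that a complete IDNA implementation performs, so treat it as a demonstration of the encoding step only.

```python
# Minimal sketch of the U-label <-> A-label conversion for a single label.
# Assumption: the label is already valid and normalized; full IDNA
# processing (validation, mapping) is deliberately omitted.
# to_a_label / to_u_label are hypothetical helper names.

ACE_PREFIX = "xn--"


def to_a_label(u_label: str) -> str:
    """Return the ASCII (A-label) form of a single label."""
    if u_label.isascii():
        # Pure ASCII input is treated as an LDH label and left unchanged.
        return u_label
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Return the Unicode (U-label) form of a single label."""
    if not a_label.lower().startswith(ACE_PREFIX):
        # No ACE prefix: nothing to decode.
        return a_label
    return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


# The Devanagari string quoted in the glossary entry, written as escapes
# so the exact code points are unambiguous.
hindi_test = "\u092a\u0930\u0940\u0915\u093e"

a_label = to_a_label(hindi_test)
print(a_label)                            # xn--11b5bs1di
print(to_u_label(a_label) == hindi_test)  # True
print(to_a_label("example"))              # example
```

The last call shows the LDH case: a label made up only of ASCII characters passes through unchanged and gains no "xn--" prefix.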