This study, conducted by CyLab at Carnegie Mellon University (CMU), examines the extent to which public Whois contact information for gTLD domain names is misused (i.e. used to carry out harmful actions such as spam, phishing, identity theft or data theft).
The findings from the study provide empirical data needed by the ICANN community to assess community concerns about misused Whois contact information, identify the most common forms of misuse, and highlight the effectiveness of anti-harvesting measures in reducing misuse. The findings will also inform future policy development by ICANN and the GNSO in relation to improvements to the Whois system.
Section I: Description and Explanation
Having concluded that a comprehensive, objective and quantifiable understanding of key factual issues regarding the gTLD Whois system would benefit future GNSO policy development efforts, the GNSO Council in March 2009 requested ICANN staff to research the feasibility and cost of studying several high-priority aspects of Whois. In September 2010, the GNSO Council approved this Whois Misuse study. The purpose of this study was to attempt to prove or disprove the following hypothesis: Public access to WHOIS data leads to a measurable degree of misuse – that is, to actions that cause actual harm, are illegal or illegitimate, or are otherwise contrary to the stated legitimate purpose.
The overall study consisted of two related studies. First, the research team surveyed (1) registrants of a representative sample of domain names registered in the top five gTLDs – .biz, .com, .info, .net and .org; (2) registries and registrars associated with registration of the surveyed domain names, to identify the Whois anti-harvesting mechanisms they employ; and (3) cybercrime researchers and law enforcement organizations, to gather examples and statistics related to harmful acts attributed to Whois misuse. Second, the research team designed and conducted an experiment to measure Whois misuse by registering 400 domains across 16 registrars, associating unique, synthetic Whois contact information with each test domain and monitoring incidents of misuse for six months.
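The experimental design described above relies on each test domain carrying contact details that appear nowhere else, so that a message reaching those details can be traced back to a single Whois record. The Python sketch below illustrates that idea for email addresses only. It is not taken from the CMU study: the names SyntheticContact, make_contacts, build_index and attribute, the example.net mail domain, and the registrar and domain names in the usage note are all hypothetical.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class SyntheticContact:
    """Hypothetical record linking one test domain to one unique address."""
    domain: str
    registrar: str
    email: str


def make_contacts(domains_by_registrar, mail_domain="example.net"):
    """Create one contact per domain, each with an address used nowhere else.

    domains_by_registrar maps a registrar name to a list of domain names.
    mail_domain is an assumed, experimenter-controlled mail domain.
    """
    contacts = []
    used_tokens = set()
    for registrar, domains in domains_by_registrar.items():
        for domain in domains:
            # Draw random tokens until we get one not already issued.
            token = secrets.token_hex(6)
            while token in used_tokens:
                token = secrets.token_hex(6)
            used_tokens.add(token)
            contacts.append(
                SyntheticContact(domain, registrar, f"{token}@{mail_domain}")
            )
    return contacts


def build_index(contacts):
    """Map each lower-cased synthetic address to its contact record."""
    return {c.email.lower(): c for c in contacts}


def attribute(recipient, index):
    """Return the contact a message was sent to, or None if unknown."""
    return index.get(recipient.strip().lower())
```

For example, calling make_contacts({"registrar-a": ["test1.example", "test2.example"]}) yields two contacts with distinct addresses, and attribute(address, build_index(contacts)) returns the matching record for any of those addresses and None for anything else. Because each address is published only in one Whois record, unsolicited mail sent to it can reasonably be attributed to that record.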
This draft report summarizes the various project activities, methodology, sampled data and findings of the research team.
The GNSO Council is now seeking community review and feedback on the draft report. The purpose of this Public Comment period is to ensure that study results have been communicated clearly and to solicit feedback on desired clarifications (if any).
Section II: Background
As part of its effort to develop a comprehensive understanding of the gTLD Whois system, the GNSO Council chartered a number of Working Groups and Drafting Teams to develop possible hypotheses for studies of several key aspects of Whois. These efforts include the Whois Working Group chartered in 2007 and work done in 2008 by the Whois Studies Working Group, the Whois Hypothesis Working Group and the Whois Study Drafting Team. At the GNSO Council's request, ICANN issued a Request for Proposal (RFP) in September 2009 and related Terms of Reference describing a study to analyze different types of Whois misuse reported by registrants (e.g. spam, phishing, identity theft and data theft), to determine which occur most often and have the greatest impact on registrants, and to correlate these findings with anti-harvesting measures that registries and registrars apply to Whois queries (e.g. rate limiting or the use of CAPTCHA phrases). Because of the limitations of particular study methods, the study was to consist of two complementary approaches: a descriptive (survey) study and an experimental study. The descriptive study would document and analyze Whois misuse incidents (i.e. harmful acts) that have already occurred, while the experimental study would simulate and record misuse to measure more reliably the impact of making Whois data public and of measures applied to deter data harvesting.
After considering RFP responses received from researchers willing to undertake this Whois Misuse study, in March 2010 ICANN staff reported [PDF, 488 KB] to the GNSO Council that it was not clear whether it would be possible to either quantitatively or qualitatively assess the extent to which Whois misuse is "significant", although it was possible to measure and categorize many different types of harmful acts often attributed to the use of Whois data. In September 2010, the GNSO Council decided to proceed with the Whois Misuse study in the manner described in ICANN staff's March report. In April 2011, ICANN announced that CMU had been selected to conduct the study.
The findings from this study are intended to provide empirical data needed to assess the ICANN community's concerns over the use of public Whois data to conduct harmful acts. This empirical data is intended to inform ICANN's policy work on the Whois system, including future policy development work by the GNSO.
Section III: Relevant Resources
Whois Misuse Study Draft Report [PDF, 1.15 MB]
Section IV: Additional Information
Whois Misuse Study Terms of Reference [PDF, 167 KB]
ICANN Staff Update on Whois Studies [PDF, 488 KB]
Additional Whois studies have also been conducted at the request of the GNSO Council, as summarized at: http://gnso.icann.org/issues/whois/