Community Experiences with the InterNIC Whois Data Problem Reports System
31 March 2005
Contents
Executive Summary
I. Applicable Provisions of the ICANN RAA
II. Implementation of the WDPRS
III. Statistics from Operation of the WDPRS
IV. Impact of WDPRS on Improved Whois Data Accuracy
Executive Summary
This Report summarizes ICANN's experience with the operation of the Whois Data Problem Report system (WDPRS) during a 12-month reporting period that ended 28 February 2005. ICANN developed this system to receive and track complaints about inaccurate or incomplete Whois data entries. Individuals who encounter such entries may notify ICANN by completing an online form, which is then forwarded to the registrar of record for appropriate action. The WDPRS is one of the tools that ICANN uses to improve the accuracy of Whois data. Last year ICANN streamlined the system to provide for both greater automation and expanded functionality. The new system includes all gTLDs; the replaced system addressed .com, .net and .org only.
Under the WDPRS, ICANN is able to track how many reports are filed and how many of these are "confirmed" by the reporter so they may be sent to the registrar of record. After a prescribed period, ICANN asks the person filing the report to check the Whois data again, and indicate whether (i) the data was fixed; (ii) the domain name was deleted; (iii) the data was unchanged; or (iv) there is some other disposition.
On average, there were 2,865 reports confirmed each full month during the reporting period. (There were 31,533 confirmed reports in total. While the report covers 12 months, only partial data is available for the first month of the reporting period.)
A total of 16,941 unique domain names were the subject of the 31,533 Whois Data Problem Reports. Multiple reports for the same domain name accounted for 14,592 of the reports (one domain name received 61 reports).
Reports were submitted by 3,122 different individuals; the top 20 contributing individuals accounted for over 58% of those reports.
The analysis performed on the data indicates that more than 63% of the names reported were corrected, suspended, or are no longer registered.
The differences in methodology and reporting mechanisms from the previous reporting period make it difficult to draw significant, specific conclusions about the impact of the WDPRS on Whois data accuracy. While the number of reports to the WDPRS is higher than in the previous period, it is believed that this is a function of greater awareness and a simplification of the reporting process. This reflects enhanced attention paid to Whois data accuracy. To further improve our understanding of the potential impact, ICANN has added a component to this analysis by conducting a study of a random sampling of Whois data from across gTLDs.
Introduction
The following is a report summarizing ICANN's experience with the operation of the Whois Data Problem Report system ("Report") at InterNIC.net (http://wdprs.internic.net/) since publication of the previous report on 31 March 2004 (http://www.icann.org/Whois/WDPRS-report-final.pdf). These reports are published pursuant to Section II.C.10.a of Amendment 6 to the ICANN/DOC Memorandum of Understanding, which provides that:
ICANN shall publish a report no later than March 31, 2004, and annually thereafter, providing statistical and narrative information on community experiences with the InterNIC WHOIS Data Problem Reports system. The report shall include statistics on the number of WHOIS data inaccuracies reported to date, the number of unique domain names with reported inaccuracies, and registrar handling of the submitted reports. The narrative information shall include an evaluation of the impact of the WHOIS Data Problem Reports system on improved accuracy of WHOIS data (http://www.icann.org/general/amend6-jpamou-17sep03.htm).
Another report required by the same section of the MOU, entitled Implementation of the Whois Data Reminder Policy, was published on 30 November 2004 (http://www.icann.org/Whois/WDRP-Implementation-30Nov04.pdf).
Whois data for generic Top Level Domains (gTLDs) includes information about the registrant, administrative contact, technical contact, and name servers associated with each domain name. This information is used for a variety of important purposes, including resolution of technical network issues, identification and verification of online merchants, investigations by consumer protection and law enforcement authorities, enforcement of intellectual property rights, identification of sources of spam e-mail, and determinations of whether a domain name is available for registration. Whois services have been available on the Internet since the early 1980s and continue to be broadly used. According to an online survey of over 3000 participants (representing businesses, governments, ISPs, registrars, individuals, and non-commercial organizations) conducted by the ICANN Domain Name Supporting Organization in 2001, Internet users broadly consider accurate Whois data to be important and support measures to improve its accuracy http://www.dnso.org/dnso/notes/WhoisTF/20020625.TFWhois-report.htm.
I. Applicable Provisions of the ICANN Registrar Accreditation Agreement
ICANN's contracts with accredited registrars require them to obtain contact information from registrants, to provide it publicly by a Whois service, and to investigate and correct any reported inaccuracies in the contact information for names they sponsor. Several provisions of the ICANN Registrar Accreditation Agreement (RAA) (http://www.icann.org/registrars/ra-agreement-17may01.htm) relate to Whois data, including:
3.3.1 At its expense, Registrar shall provide an interactive web page and a port 43 Whois service providing free public query-based access to up-to-date (i.e., updated at least daily) data concerning all active Registered Names sponsored by Registrar for each TLD in which it is accredited. The data accessible shall consist of elements that are designated from time to time according to an ICANN adopted specification or policy. Until ICANN otherwise specifies by means of an ICANN adopted specification or policy, this data shall consist of the following elements as contained in Registrar's database:
3.3.1.1 The name of the Registered Name;
3.3.1.2 The names of the primary name server and secondary name server(s) for the Registered Name;
3.3.1.3 The identity of Registrar (which may be provided through Registrar's website);
3.3.1.4 The original creation date of the registration;
3.3.1.5 The expiration date of the registration;
3.3.1.6 The name and postal address of the Registered Name Holder;
3.3.1.7 The name, postal address, e-mail address, voice telephone number, and (where available) fax number of the technical contact for the Registered Name; and
3.3.1.8 The name, postal address, e-mail address, voice telephone number, and (where available) fax number of the administrative contact for the Registered Name.
3.7.7 Registrar shall require all Registered Name Holders to enter into an electronic or paper registration agreement with Registrar including at least the following provisions:
3.7.7.1 The Registered Name Holder shall provide to Registrar accurate and reliable contact details and promptly correct and update them during the term of the Registered Name registration, including: the full name, postal address, e-mail address, voice telephone number, and fax number if available of the Registered Name Holder; name of authorized person for contact purposes in the case of an Registered Name Holder that is an organization, association, or corporation; and the data elements listed in Subsections 3.3.1.2, 3.3.1.7 and 3.3.1.8.
3.7.7.2 A Registered Name Holder's willful provision of inaccurate or unreliable information, its willful failure promptly to update information provided to Registrar, or its failure to respond for over fifteen calendar days to inquiries by Registrar concerning the accuracy of contact details associated with the Registered Name Holder's registration shall constitute a material breach of the Registered Name Holder-registrar contract and be a basis for cancellation of the Registered Name registration.
3.7.7.3 Any Registered Name Holder that intends to license use of a domain name to a third party is nonetheless the Registered Name Holder of record and is responsible for providing its own full contact information and for providing and updating accurate technical and administrative contact information adequate to facilitate timely resolution of any problems that arise in connection with the Registered Name. A Registered Name Holder licensing use of a Registered Name according to this provision shall accept liability for harm caused by wrongful use of the Registered Name, unless it promptly discloses the identity of the licensee to a party providing the Registered Name Holder reasonable evidence of actionable harm.
3.7.8 Registrar shall abide by any specifications or policies established according to Section 4 requiring reasonable and commercially practicable (a) verification, at the time of registration, of contact information associated with a Registered Name sponsored by Registrar or (b) periodic re-verification of such information (emphasis added). Registrar shall, upon notification by any person of an inaccuracy in the contact information associated with a Registered Name sponsored by Registrar, take reasonable steps to investigate that claimed inaccuracy. In the event Registrar learns of inaccurate contact information associated with a Registered Name it sponsors, it shall take reasonable steps to correct that inaccuracy.
Based on the above provisions of the RAA, a registrar must:
ICANN has taken several steps to improve the accuracy of Whois data. These include:
II. Implementation of the Whois Data Problem Report System (WDPRS)
In order to assist registrars in complying with the contract obligations outlined above, ICANN implemented the Whois Data Problem Report System (WDPRS) on 3 September 2002. The goal of the WDPRS is to streamline the process for receiving and tracking complaints about inaccurate and incomplete Whois data, and thereby help improve the accuracy of Whois data.
Reports of inaccurate Whois data under the WDPRS are submitted through the InterNIC website, operated by ICANN as a public resource containing information relating to domain registration services. The centerpiece of the WDPRS is a centralized online form, available at http://wdprs.internic.net, for submitting reports about Whois data inaccuracies. The form asks Internet users (called "reporters" in this context) to specify the domain name whose Whois data they believe is inaccurate, along with their name and email address. After selecting "submit", the reporter is shown the Whois record for that domain name and asked to indicate the inaccuracy. The system then sends the reporter an email request for confirmation of the report. The reporter has up to five days to acknowledge the request; otherwise the report is deleted.
Once the report is confirmed by the reporter, it is automatically forwarded to the registrar of record for handling. As of 31 March 2005, there are 468 ICANN-accredited registrars. A complete list of accredited registrars is available on the ICANN website at http://www.icann.org/registrars/accredited-list.html, and on the InterNIC website at http://www.internic.net/regist.html. (The InterNIC registrar listing can be sorted by location of registrar or by languages supported.)
Several enhancements have been made to the WDPRS since it was first implemented in 2002. A full description of the system's functionality at launch may be found in the 31 March 2004 report on "Community Experiences with the InterNIC Whois Data Problem Reports System" (http://www.icann.org/Whois/WDPRS-report-final.pdf).
III. Statistics from Operation of the WDPRS
The following sections provide a statistical summary of operation of the Whois Data Problem Report System. These statistics cover the operation of the system from the last report's cut-off date of 29 February 2004 until this year's cut-off date of 28 February 2005. They include information concerning: (A) the number of Whois data inaccuracies reported; (B) the number of unique domain names with reported inaccuracies; and (C) registrar handling of the submitted reports. Although the type of data that is being collected has changed since the 2004 Report, and the reporting period considered in last year's report was 50% longer (18 months instead of 12), this Report will make comparisons when appropriate.
A. Reported Data Inaccuracies
A total of 31,533 Whois Data Problem Reports were confirmed by their senders during the 12-month reporting period. (The 2004 Report indicated that 24,148 submissions had been confirmed during that 18-month reporting period.) The following table indicates the number of reports confirmed per month (the numbers for March 2004 reflect a partial month due to system changeover):
| Date | Reports Confirmed |
| Mar-04 | 16 |
| Apr-04 | 2003 |
| May-04 | 2513 |
| Jun-04 | 2978 |
| Jul-04 | 3462 |
| Aug-04 | 2998 |
| Sep-04 | 2120 |
| Oct-04 | 2289 |
| Nov-04 | 3018 |
| Dec-04 | 3594 |
| Jan-05 | 3806 |
| Feb-05 | 2736 |
| Total | 31533 |
On average, there were 2,865 reports confirmed each full month (excluding the partial month of March 2004). During the previous reporting period, there were, on average, 1,342 reports confirmed every month.
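The monthly average can be reproduced from the table above. The following is a minimal sketch in Python, not ICANN's own calculation: the variable names are illustrative, and it assumes that the partial month of March 2004 is left out of the average, as the Executive Summary's reference to "each full month" suggests.

```python
# Confirmed reports per month, copied from the table above.
confirmed = {
    "Mar-04": 16,
    "Apr-04": 2003,
    "May-04": 2513,
    "Jun-04": 2978,
    "Jul-04": 3462,
    "Aug-04": 2998,
    "Sep-04": 2120,
    "Oct-04": 2289,
    "Nov-04": 3018,
    "Dec-04": 3594,
    "Jan-05": 3806,
    "Feb-05": 2736,
}

total = sum(confirmed.values())

# Assumption: March 2004 is excluded because only partial data exists for it.
full_months = {month: n for month, n in confirmed.items() if month != "Mar-04"}
average = sum(full_months.values()) / len(full_months)

print(f"Total confirmed reports: {total}")
print(f"Average per full month:  {average:.0f}")
```

Under that assumption the sketch yields the total and the average stated in this Report.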
On a per TLD basis, .com represented 62% of confirmed reports, with .info and .biz constituting 14% and 12% respectively. The statistics for those and other gTLDs are included in the following table:
| TLD | Reports # | Reports % | Reports per 10000 registrations* |
| .com | 19705 | 62.49% | 5.80 |
| .info | 4332 | 13.74% | 12.92 |
| .biz | 3715 | 11.78% | 32.37 |
| .net | 2504 | 7.94% | 4.59 |
| .org | 1268 | 4.02% | 3.73 |
| .name | 6 | 0.02% | 0.42 |
*Based on registrations as of 31 December 2004
A total of 3122 different individuals submitted reports. On average, each reporter submitted approximately 10 reports, while some individuals submitted significantly more reports. Out of a total of 31533 confirmed reports, the numbers of reports per individual, for the top 20 reporters are as follows:
| Top 20 Reporters | # Reports Submitted |
| 1 | 4035 |
| 2 | 2186 |
| 3 | 1197 |
| 4 | 1183 |
| 5 | 1058 |
| 6 | 891 |
| 7 | 881 |
| 8 | 770 |
| 9 | 715 |
| 10 | 592 |
| 11 | 572 |
| 12 | 555 |
| 13 | 532 |
| 14 | 513 |
| 15 | 505 |
| 16 | 482 |
| 17 | 482 |
| 18 | 415 |
| 19 | 414 |
| 20 | 339 |
| Total | 18317 |
As this table shows, fewer than 1% of all those who filed reports (20 people) were responsible for over 58% (18,317 out of 31,533) of all Whois inaccuracy reports submitted to ICANN during the reporting period. The 2004 Report indicated that the top 20 reporters (0.3%) were responsible for over 40% (9,938 out of 24,148) of Whois inaccuracy reports.
There is evidence that individuals also are reporting single domains when they discover a problem -- there were 2,363 individuals who submitted exactly one report. Of these, 2,116 were later sent a follow-up message, and 922 of those did follow up on their report, a return rate of about 44%. (The overall average return rate was about 51%.)
From both anecdotal information received by ICANN and text accompanying the body of these reports, we conclude that most, if not all, of the high-volume reporters are driven by a concern about abuses involving spam. In well over 80% of the reports filed, the reporter indicated "spam" as a factor in the body of the report.
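The concentration figures follow directly from the table. A short Python sketch of the arithmetic, with illustrative variable names, is given here as a check on the stated percentages:

```python
# Reports submitted by each of the top 20 reporters, copied from the table.
top20 = [
    4035, 2186, 1197, 1183, 1058, 891, 881, 770, 715, 592,
    572, 555, 532, 513, 505, 482, 482, 415, 414, 339,
]

total_reports = 31533   # confirmed reports in the reporting period
total_reporters = 3122  # distinct individuals who submitted reports

top20_total = sum(top20)
share_of_reports = top20_total / total_reports
share_of_reporters = len(top20) / total_reporters

print(f"Top 20 reports:     {top20_total}")
print(f"Share of reports:   {share_of_reports:.1%}")
print(f"Share of reporters: {share_of_reporters:.2%}")
```

The top 20 reporters make up well under 1% of all reporters while accounting for just over 58% of confirmed reports.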
B. Unique Domain Names
A total of 16,941 unique domain names were the subject of Whois Data Problem Reports. As reported above, there were a total of 31,533 reports confirmed. Accordingly, 14,592 of the reports (just over 46%) were "duplicates", meaning that the domain name they referred to was already the subject of a report made during the reporting period. The following table shows the number of reports received for each of the 20 most frequently reported domain names:
| Top 20 Domain Names Reported | Reports per Domain Name |
| 1 | 61 |
| 2 | 50 |
| 3 | 47 |
| 4 | 42 |
| 5 | 41 |
| 6 | 38 |
| 7 | 37 |
| 8 | 36 |
| 9 | 34 |
| 10 | 34 |
| 11 | 33 |
| 12 | 33 |
| 13 | 32 |
| 14 | 31 |
| 15 | 30 |
| 16 | 30 |
| 17 | 29 |
| 18 | 28 |
| 19 | 27 |
| 20 | 27 |
In the previous reporting period, a total of 16,045 unique domain names were the subject of Whois Data Problem Reports, and just over one-third of the reports were "duplicates." The following discussion of Whois accuracy focuses on the number of individual domain names reported, not the total number of raw reports.
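For the current reporting period, the duplicate count is simply the difference between confirmed reports and unique domain names. A minimal Python sketch of that arithmetic (variable names are illustrative):

```python
confirmed_reports = 31533  # total confirmed reports in the period
unique_domains = 16941     # distinct domain names reported

# A "duplicate" is any report beyond the first for a given domain name.
duplicates = confirmed_reports - unique_domains
duplicate_share = duplicates / confirmed_reports

print(f"Duplicate reports: {duplicates}")
print(f"Share of reports:  {duplicate_share:.1%}")
```

This gives a duplicate share of roughly 46%, compared with just over one-third in the previous period.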
C. Registrar Handling
Under the streamlined WDPRS in effect during the reporting period, the reporter is asked to inform ICANN of the status of the data after the reported inaccuracy is forwarded to the registrar and a suitable period for action has elapsed (“follow-up report”). Of the 16,941 unique domain names reported, we have follow-up reports for 9,770, or 58%. The following table characterizes the state of the reported Whois records, as indicated by the follow-up reports that reporters provided to ICANN:
| Status | Domain Names | % |
| Inaccuracy Corrected | 760 | 7.8% |
| Domain Deleted | 1658 | 17.0% |
| Other | 1553 | 15.9% |
| Data Unchanged | 5799 | 59.4% |
| Total | 9770 | 100.0% |
According to self-reporting by the person originating the report, a total of 2,418 Whois records were corrected or deleted as the result of a WDPRS report, or 24.7%. A substantial 75.3% were categorized as "Other" or "Data Unchanged".
In order to better understand the nature of the domain names marked "Other" or "Data Unchanged" (7,352 total), ICANN staff individually reviewed 5,842 (about 80%) of them and made the following observations: more than half (51.6%) had in fact been deleted or suspended. Another third of them (34.9%) had Whois data that appeared to be accurate (note, however, that it is quite possible to supply Whois information that looks completely plausible, but is in fact bad). About 14% appeared incomplete or clearly inaccurate.
Combining the suspended or deleted domain names noted by ICANN staff with the user reports of corrected, suspended, or deleted domain names, we estimate that 64% of reported domain names had bad data and have since been corrected, suspended, or are no longer registered. An additional 10% of domains with clearly bad information were not changed. This leaves approximately 26% of reported domains with Whois data that shows no obvious errors. Note the implication that at least 74% of reports are actually about bad Whois data, while 26% are for domains with plausible, but not confirmed as accurate, data.
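The Report does not spell out how these combined percentages were derived. One plausible reconstruction, sketched below in Python, assumes that the proportions found in the staff review sample apply to the whole "Other" and "Data Unchanged" group, and that the "about 14%" figure is the remainder after the 51.6% and 34.9% shares. Both assumptions, and all variable names, are illustrative rather than taken from ICANN's methodology.

```python
# Follow-up dispositions, copied from the registrar handling table.
followups = {"corrected": 760, "deleted": 1658, "other": 1553, "unchanged": 5799}
total = sum(followups.values())

# Share that reporters themselves marked as corrected or deleted.
user_resolved = (followups["corrected"] + followups["deleted"]) / total
# Share marked "Other" or "Data Unchanged", which staff then sampled.
residual = (followups["other"] + followups["unchanged"]) / total

# Staff review proportions. Assumption: they extrapolate to the whole group.
staff_deleted = 0.516    # found deleted or suspended
staff_plausible = 0.349  # Whois data appeared accurate
staff_bad = 1 - staff_deleted - staff_plausible  # remainder, "about 14%"

resolved = user_resolved + residual * staff_deleted
still_bad = residual * staff_bad
plausible = residual * staff_plausible

print(f"Corrected, suspended, or gone: {resolved:.0%}")
print(f"Clearly bad, unchanged:        {still_bad:.0%}")
print(f"Plausible data:                {plausible:.0%}")
print(f"Reports about bad data:        {resolved + still_bad:.0%}")
```

Under these assumptions the sketch lands on the rounded figures given above: 64%, 10%, 26%, and 74%.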
Some registrars were particularly diligent in making changes as a result of problem reports. A close review of the registrars with 500 or more reports (18) revealed one standout: this registrar in particular had only 8% of its follow-up reports marked unchanged. This compares with the next closest registrar at 24%. Among all registrars with over 500 reports, the average was 28% and ranged as high as 37% unchanged.
There are, of course, a number of possible explanations for the relatively high number of "unchanged" dispositions. First, as the above analysis illustrates, the reporter may not be correctly interpreting the Whois data. Second, a reporter might be motivated to report "unchanged" status for various reasons. Third, the domain names in question may have been put on Registrar Hold due to inaccurate Whois data, with the data being left unchanged -- even though the registrar took appropriate action. Fourth, the format chosen by the registrar for display of Whois data may be difficult to understand.
IV. Impact of WDPRS on Improved Accuracy of Whois Data
There are several conclusions that can be drawn concerning the impact of the WDPRS.
First, it is clear that ICANN's Whois Data Problem Reports System continues to have a measurable impact on the accuracy of Whois data. Of the 16,941 domains for which there are reports, based on the statistics above we can estimate that 74% (or approximately 12,500) of those domains had incorrect Whois information, and that 64% (nearly 11,000) were corrected, or are no longer active.
Second, the substantial increase in average number of reports per month is clear evidence that the visibility and usage of the system is growing.
Third, there are a number of "power users" of the system. Given that they account for more than 50% of the reports, and that at least 74% of the reports are for legitimately bad Whois information, it is reasonable to assume that these industrious individuals are indeed finding many domains with incorrect Whois information. It might be reasonable to offer features in the interface to help these users.
Fourth, while the above statistics seem to indicate that the WDPRS is fairly successful, there is no reliable baseline data for how much inaccuracy actually exists in the Whois system. ICANN has studied a small random sample of Whois data; the results are too imprecise to cite, but the effort has clearly indicated that a statistical study of the database should be undertaken.
Finally, the 16,941 reported names are a small fraction of the 49+ million gTLD registrations. Therefore, while the WDPRS tool is effective where used, there is no statistical certainty that the entire database is significantly affected. This points out the usefulness of the broader sampling indicated above and the need to promote the use of the WDPRS tool through outreach and education.
In conclusion, the WDPRS has focused attention on statistics that were not previously available. As ICANN's experience with the system and the data grows, we will be in a better position to determine the impact on the much larger issue of Whois data accuracy. As a measure of ICANN's concern about the accuracy of Whois data, the organization is developing a proactive compliance program for gTLD registrars and registries. As a part of this compliance effort, ICANN is planning to actively sample and test registrar Whois data to develop a statistical model for Whois data accuracy investigations. Additional staff resources will be utilized to obtain other accurate and useful statistical data, and to monitor registrar and registry compliance with Whois service, privacy and accuracy obligations, the Whois Data Reminder Policy and the WDPRS.