WHOIS Accuracy Study Preliminary Findings Reflect Improvements in Accuracy

In the pilot introduction of the WHOIS Accuracy Reporting System (ARS), ICANN has collaborated with inter-governmental and private sector specialists to deliver the most recent assessment of WHOIS accuracy. For the first time, ICANN has brought together experts in the fields of study design and validation services to produce an accuracy study that uses the latest technology available.

The Preliminary Findings paper [PDF, 668 KB] published today describes the result of this unique collaboration. This paper is being released by ICANN in advance of its October Los Angeles Meetings to illustrate the features of the ARS under development, as well as to solicit feedback on the approach, methodology and reporting formats. This feedback will help shape the final design of the ARS. After the Los Angeles Meeting, ICANN will publish the full Pilot Report, which provides detailed findings and explanations of the methodology deployed in the Study.

When fully developed, the ARS will produce ongoing reports capable of tracking trends in accuracy rates, and reporting on the specific factors that affect accuracy. This data may be useful to assess the effectiveness of recent efforts at improving accuracy rates, and to support ongoing policy development activities related to WHOIS.

Preliminary Findings

The Pilot study examined accuracy rates from multiple perspectives to give a realistic picture of today's WHOIS. While the study results are still being confirmed, the preliminary findings reveal that:

  • Registrars under the 2013 RAA have higher operational accuracy rates for email addresses than Registrars under the 2009 RAA;
  • New gTLDs have slightly better operational email accuracy rates than prior gTLDs;
  • Prior gTLDs have higher operational accuracy rates for telephone numbers, while the two groups are equal on operational postal address accuracy.

These preliminary findings are being published to facilitate discussion of the methodology, sample sizes, and approach at the Los Angeles Meeting. The statistics published today are subject to further analysis and confirmation; the confirmed results will be included in the Full Pilot Study, to be published for Public Comment after the Los Angeles Meeting.

Global Expertise

The findings in the Paper are the product of an in-depth examination of postal addresses, email addresses and telephone numbers. Postal address statistics were developed with guidance from the Universal Postal Union (Switzerland), a specialized agency of the UN that coordinates postal policies worldwide for its member countries. Other validation expertise was provided by leading commercial firms, including StrikeIron (USA), which used its proprietary email validation systems, and DigiCert (USA), a provider of digital certificates and telephone validation services. The work was aided by a unique data parsing service from Whibse (USA).

Study Design

ICANN turned to NORC to design the study, work with validation providers, and conduct the analysis needed to produce these findings. NORC's past accuracy studies influenced the WHOIS Review Team. The Review Team's Final Report called for ICANN to publish ongoing statistics on WHOIS accuracy.

The Pilot study examines accuracy levels by applying syntactic validation and operational validation tests to a Registrant's postal address, email address, and telephone number listed in a WHOIS record. Although the study did not attempt to apply identity validation techniques, ICANN is exploring the feasibility of including identity validation in subsequent development phases of the ARS.
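To make the distinction concrete, the following is a minimal sketch of what a syntactic check might look like. The patterns and function names are assumptions chosen for illustration; they are not the rules applied by NORC or the validation providers, and the "+<country code>.<number>" telephone shape is a hypothetical format.

```python
import re

# Illustrative patterns only (assumptions), not the study's actual rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")
# Hypothetical "+<country code>.<number>" telephone shape.
PHONE_RE = re.compile(r"\+[0-9]{1,3}\.[0-9]{4,14}")


def is_syntactically_valid_email(value: str) -> bool:
    """Return True if the value has the general shape of an email address."""
    return EMAIL_RE.fullmatch(value.strip()) is not None


def is_syntactically_valid_phone(value: str) -> bool:
    """Return True if the value matches the assumed telephone format."""
    return PHONE_RE.fullmatch(value.strip()) is not None
```

A syntactic test of this kind only confirms that a value is well formed. Operational validation asks a different question, namely whether the address or number actually works, and that cannot be answered by pattern matching alone.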

The sample sizes used in the Pilot Study are described in the table below:

Data Element        Syntactic Validation    Operational Validation
Postal Address      10,000                  1,000
Telephone Number    10,000                  1,000
Email Address       98,821                  98,821
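As a rough aid for discussing these sample sizes, the sketch below computes the worst-case margin of error for an estimated accuracy rate. It assumes simple random sampling and a 95% normal approximation, which may not match the sampling design actually used in the Pilot Study; the function name is hypothetical.

```python
import math


def worst_case_margin_of_error(n: int, z: float = 1.96) -> float:
    """Half-width of an approximate 95% interval for a proportion at p = 0.5.

    Assumes simple random sampling (an assumption, not the study's design).
    """
    return z * math.sqrt(0.25 / n)


# Sample sizes taken from the table above.
for label, n in [("1,000", 1000), ("10,000", 10000), ("98,821", 98821)]:
    moe = worst_case_margin_of_error(n) * 100
    print(f"n = {label}: +/- {moe:.1f} percentage points")
```

Under these assumptions, a sample of 1,000 records gives a margin of about 3 percentage points, 10,000 records about 1 point, and 98,821 records about 0.3 points.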

More information on the methodology used to classify WHOIS Records is available here [PDF, 668 KB].

Engagement with Registrars and Others

A key function of the ARS will be to forward records identified as potentially inaccurate to registrars for follow-up to confirm their accuracy. The ARS is being designed to track and report on the progress of these records.

Engagement with registrars and other interested stakeholders is necessary to define an efficient process for transmitting, reviewing, and updating, as appropriate, the identified WHOIS records. ICANN plans to work with registrars and the broader Community in the months ahead in order to develop this process.

Next Steps

The purpose of the Pilot Study is to test assumptions using real data. The methodology can be adjusted based upon the public comments received on the Pilot Report. The consultations in Los Angeles provide an opportunity to highlight areas where the methodology can be improved.

ICANN invites feedback on any aspect of the Preliminary Findings paper, the Pilot Study, the methodology deployed, and the reporting perspectives, as well as on the next steps for developing the ARS. A full study report will be posted for public comment after the Los Angeles Meeting, to gather feedback on the Pilot Study and the findings. In addition, the All Things WHOIS Session will be held in Los Angeles to discuss the Pilot, along with other WHOIS developments.

Additional Information

  • Additional information about the latest WHOIS developments will be provided in the All Things WHOIS Session in Los Angeles.
  • For the status of the improvement efforts, please see the latest Implementation Chart.
  • For more details on the services sought by ICANN, please see the WHOIS Accuracy Reporting System RFP Announcement.
  • For additional background information, please see the Draft Implementation Plan for the WHOIS Accuracy Reporting System Announcement.
  • To learn about WHOIS, visit the WHOIS website.