SAC 023 | Is the WHOIS Service a Source for email Addresses for Spammers?

This SSAC study on WHOIS considers whether the WHOIS service is a source of email addresses for spammers. For the study, SSAC registered randomly composed strings as second-level labels in four Top Level Domains (COM, DE, INFO, and ORG) and monitored email delivery to them. The domain names were registered in February 2007. The recipient name used in the registrant email address for each registration record was also chosen randomly. These addresses were neither used in correspondence nor published electronically in any form (web, IM user, online service...). Thus, apart from brute force derivation (or guessing), the only practical vectors for obtaining these specific email addresses were a WHOIS service or the registrar or reseller in whose database(s) the email addresses were stored.

SSAC collected and analyzed all email messages delivered to these addresses for a period of approximately three months. Based on the data collected, the Committee finds that the appearance of email addresses in responses to WHOIS queries is indeed a contributor to the receipt of spam. The data SSAC analyzed illustrate that the appearance of email addresses in responses to WHOIS queries virtually assures that spam will be delivered to these email addresses. The Committee members involved in the WHOIS study do not believe, however, that the WHOIS service is the dominant source of spam.

SSAC concludes from its study that registries and registrars that implement anti-abuse measures such as rate limiting, CAPTCHA, non-publication of zone file data, and similar measures can protect WHOIS data from automated collection. Further, SSAC noted that anti-spam measures provided with domain name registration services are effective in protecting from spam those email addresses that are not published anywhere other than the WHOIS.
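As an illustration of the rate limiting mentioned above, the following is a minimal sketch of a per-client sliding-window limiter. It is not taken from the SSAC report: the class name `SlidingWindowLimiter`, the limit of 3 queries per 60 seconds, and the use of an IP address as the client identifier are all assumptions chosen for the example.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical limiter: at most `max_queries` per client in `window_seconds`."""

    def __init__(self, max_queries, window_seconds):
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        # client id -> timestamps (in seconds) of queries that were answered
        self._history = defaultdict(deque)

    def allow(self, client_id, now):
        """Return True if a query arriving at time `now` may be answered."""
        timestamps = self._history[client_id]
        # Forget queries that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_queries:
            return False  # over the limit: refuse this query
        timestamps.append(now)
        return True


# Assumed policy for the example: 3 queries per client per 60 seconds.
limiter = SlidingWindowLimiter(max_queries=3, window_seconds=60)
results = [limiter.allow("192.0.2.1", t) for t in (0, 1, 2, 3, 61)]
print(results)
```

In this sketch the fourth query inside the window is refused, and the client is served again once earlier queries fall outside the window. A real WHOIS deployment would likely combine such a limit with the other measures listed above.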

Domain Name System
Internationalized Domain Name (IDN)

IDNs are domain names that include characters used in the local representation of languages that are not written with the twenty-six letters of the basic Latin alphabet "a-z". An IDN can contain Latin letters with diacritical marks, as required by many European languages, or may consist of characters from non-Latin scripts such as Arabic or Chinese. Many languages also use other types of digits than the European "0-9". The basic Latin alphabet together with the European-Arabic digits are, for the purpose of domain names, termed "ASCII characters" (ASCII = American Standard Code for Information Interchange). These are also included in the broader range of "Unicode characters" that provides the basis for IDNs.

The "hostname rule" requires that all domain names of the type under consideration here are stored in the DNS using only the ASCII characters listed above, with the one further addition of the hyphen "-". The Unicode form of an IDN therefore requires special encoding before it is entered into the DNS. The following terminology is used when distinguishing between these forms: A domain name consists of a series of "labels" (separated by "dots"). The ASCII form of an IDN label is termed an "A-label". All operations defined in the DNS protocol use A-labels exclusively. The Unicode form, which a user expects to be displayed, is termed a "U-label". The difference may be illustrated with the Hindi word for "test", परीका, appearing here as a U-label would (in the Devanagari script). A special form of "ASCII compatible encoding" (abbreviated ACE) is applied to this to produce the corresponding A-label: xn--11b5bs1di. A label that only includes ASCII letters, digits, and hyphens is termed an "LDH label". Although the definitions of A-labels and LDH labels overlap, a name consisting exclusively of LDH labels is not an IDN.
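The relationship between U-labels, A-labels, and LDH labels can be sketched in Python using the standard library's `punycode` codec. This is a simplified illustration, not a full IDNA implementation: it assumes the input is a single, already normalized label and skips the validation and mapping steps that the IDNA standards require. The helper names `is_ldh_label`, `to_a_label`, and `to_u_label` are hypothetical.

```python
import re

ACE_PREFIX = "xn--"  # prefix that marks an ASCII-compatible encoded label
LDH_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def is_ldh_label(label):
    """True if the label uses only ASCII letters, digits, and hyphens."""
    return LDH_PATTERN.fullmatch(label) is not None


def to_a_label(u_label):
    """Convert a U-label to its A-label (simplified: no IDNA validation)."""
    if is_ldh_label(u_label):
        return u_label  # already ASCII-only, nothing to encode
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def to_u_label(a_label):
    """Convert an A-label back to its Unicode form."""
    if a_label.lower().startswith(ACE_PREFIX):
        return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return a_label  # plain LDH label, no decoding needed


a_label = to_a_label("परीका")
print(a_label)                # ASCII form stored in the DNS
print(to_u_label(a_label))    # Unicode form shown to the user
print(is_ldh_label(a_label))  # an A-label is itself made of LDH characters
```

Run on the Hindi example above, the sketch should reproduce the A-label quoted in the glossary entry and convert it back to the original U-label. The final check illustrates the overlap noted in the definition: every A-label consists of LDH characters, but only labels carrying the ACE prefix encode Unicode content.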