ICANN Stockholm Meeting
Topic: Report of the Internationalized Domain Names Working Group: Responses
to Survey A
Posted: 29 May 2001
|
Appendix: Responses
to Survey A: Technical Questions
1. What different
technologies currently enable the use of non-Latin scripts as
domain names?
| WALID |
WALID, Inc. (WALID) considers that the
deployed or proposed approaches to Internationalized Domain Names
(IDNs) have focused mainly on four approaches:
- Upgrading the DNS protocol in certain
ways to support the tagging of UTF-8 or codepage data in DNS
packets, typically using some of the as-yet unused bits in the
DNS packet header. This has been proposed in a number of IETF
Internet-Drafts;
- Sending UTF-8 or codepage data (sometimes
unmarked) on the wire using the existing protocol, and upgrading
the authoritative DNS servers to store and process that data;
- Sending UTF-8 or codepage data (sometimes
unmarked) to a DNS proxy server or other network agent, which
performs an ACE transformation on the data and then presents
the encoded name to the DNS for resolution;
- Performing ACE transformations directly
on the DNS client node, in the resolver and/or the application
layer. WALID is in favor of this approach to IDNs, and the approach
is embodied in WALID's WorldConnect technology. Recently,
other technology providers have begun to produce similar products.
|
| Verisign |
Standards to enable the use of non-Latin
scripts as domain names have not been finalized. Broadly speaking,
three different methods are currently in use:
1. Domain names are sent in a local encoding
(such as GB, BIG5, SJIS, etc.)
2. Domain names are sent in a Unicode Transformation
Format, such as UTF-8.
3. Domain names are converted to a "safe"
representation using only the subset of ASCII characters currently
supported by the Internet's infrastructure before sending.
VeriSign Global Registry Services (VeriSign
GRS) will conform its testbed to whatever standard emerges from
the IETF process.
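As a rough illustration of the three methods, the sketch below shows how one label might appear on the wire under each of them. It relies only on Python's built-in codecs. The sample label is an arbitrary choice, and the `idna` codec implements the Punycode-based ACE that the IETF standardized later (RFC 3490), so it is only a stand-in for whichever ACE a given testbed uses.

```python
# Illustrative only: one label under the three methods listed above.
SAMPLE_LABEL = "中文"  # arbitrary sample, not taken from the survey


def wire_forms(label: str) -> dict:
    """Return the octets a client might send for `label` under each method."""
    return {
        "local (Big5)": label.encode("big5"),      # method 1
        "local (GB2312)": label.encode("gb2312"),  # method 1, another scheme
        "UTF-8": label.encode("utf-8"),            # method 2
        "ACE": label.encode("idna"),               # method 3 (stand-in ACE)
    }


if __name__ == "__main__":
    for method, data in wire_forms(SAMPLE_LABEL).items():
        print(f"{method:15} {data!r}")
```

Only the ACE form stays within 7-bit ASCII; the other three contain octets above 0x7F.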
|
| Neteka |
There are in general three approaches to
multilingualize the domain name system:
- Brute Force Approach - the DNS was designed to transport domain name
characters in unsigned octets. Therefore, the protocol itself
is actually capable of carrying 8bit information. The reason
for restricting it to the US-ASCII scheme is simply for backward
compatibility issues at the time it was devised. While the DNS
has been arbitrarily constrained to English only alphanumeric
names, implementers were not advised to reject names outside
of the constraints, because the DNS will ultimately determine
the existence of a domain name through its hierarchical search.
It is possible therefore to force 8bit character information
(such as UTF-8 or local encoding schemes: Big5, GB, JIS, etc.)
through the DNS, and existing implementation experiments indicate
that root servers and middleware are usually unaffected.
- Protocol Extension Approach - An approach to solve the multilingual DNS problem
is to introduce additional flagging or tags within the DNS packet
to alert servers of the encoding scheme used by the request.
Whether it is an encoding tag or simply a multilingual flag,
the protocol approach utilizes the bit format within the existing
DNS packet to notify the receiving end of the context of the
domain name in question. A number of drafts have been proposed
at the Internet Engineering Task Force (IETF) discussion on multilingual
domain names, including the DNSII mechanism [DNSII-MDNP], utilization
of EDNS to signify multilingual labels [IDNE], use of the reserved
header bits [UDNS] as well as the introduction of a new DNS class
for universal characters [ClassUC]. While there are relative
advantages for the different proposals, DNSII supports the use
of multiple encoding schemes, IDNE utilizes the newly standardized
EDNS mechanism for DNS extensions, UDNS makes it possible for
information to be tunneled all the way to the authoritative server
and the introduction of Class UC could effectively create a coherent
but entirely new namespace.
- ASCII Conversion Approach - The allocation of pain, in other words the question of where
most of the effort for multilingual domain names should be
put, drives the discussion towards an ASCII conversion approach,
whereby the legacy servers and transport protocol do not
need to change and all multilingual character information
is transformed into ASCII Compatible Encoding (ACE) formatted
DNS requests. The assumption behind an ASCII conversion approach
is that all existing users on the Internet would upgrade to a
multilingual-aware DNS resolver, which would apply a standardized
transformation to turn multilingual characters
into alphanumeric strings that fit the original DNS
specifications. In addition to transforming the multilingual
characters into ASCII strings, an in-label identifier is usually
added to the domain name. For example, the Row-based ASCII
Compatible Encoding [RACE] scheme calls for the use of a "bq--"
prefix. The common objective of all ACE schemes is to represent
multilingual characters in alphanumeric form, fitting names within
the current character constraints of the DNS. Each has a slightly
different transformation mechanism, ranging from a simple hex dump such
as TRACE [DNSII-TRACE] to a multiple-compression-scheme approach
such as RACE.
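To make the ASCII conversion idea concrete, here is a toy hex-dump encoder. It is not the RACE or TRACE algorithm: the "hx--" prefix and the UTF-16 hex layout are assumptions chosen for illustration. It only shows the general shape of an ACE, namely an in-label identifier followed by an alphanumeric rendering of the original characters.

```python
# Toy hex-dump ACE. NOT RACE or TRACE; the prefix and layout are assumptions.
PREFIX = "hx--"   # hypothetical in-label identifier
MAX_LABEL = 63    # DNS limit on label length in octets (RFC 1035)


def to_toy_ace(label: str) -> str:
    """Encode one label as PREFIX + hex digits of its UTF-16BE octets."""
    if label.isascii():
        return label  # ASCII labels pass through unchanged
    encoded = PREFIX + label.encode("utf-16-be").hex()
    if len(encoded) > MAX_LABEL:
        raise ValueError("encoded label exceeds 63 octets")
    return encoded


def from_toy_ace(label: str) -> str:
    """Reverse to_toy_ace for labels that carry the toy prefix."""
    if not label.startswith(PREFIX):
        return label
    return bytes.fromhex(label[len(PREFIX):]).decode("utf-16-be")
```

With this layout each character costs four octets, so at most 14 such characters fit in a label. That overhead is one reason schemes such as RACE add compression.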
Besides the three main approaches, it is
also possible to have a hybrid approach that mixes and matches
the different technologies. It is also Neteka's opinion that
the best strategy forward to deploy multilingual domain names
consists of a hybrid of all three approaches and contemplates
a phased transition:
- Short-term:
Brute force approach - with the brute force approach, registries
could immediately offer functional multilingual domain names
to satisfy user demand. Neteka's technology allows the use of
UTF-8 as well as any other local encoding schemes (Big5, GB, JIS,
KUC, etc.) to be resolved at the server without requiring any
client side reconfiguration or plug-in.
- Mid-term:
ASCII Conversion approach - a server end ASCII conversion approach
is best used as a consolidation strategy for the different IDN
solutions. It offers a common platform for the convergence of
the technologies and provides a smooth transition and migration
from the existing system (including with brute force multilingual),
to a longer-term solution. Neteka's technology utilizes both
RACE and TRACE as a platform for administration for multilingual
names in brute force format, ASCII converted format as well as
protocol extension (mode bit flagging) format.
- Long-term:
Protocol Extension approach - for a longer-term solution, a protocol
extension mechanism is generally considered the best approach
because it eliminates ambiguity by clearly identifying multilingual
names and does not compromise the efficiency of the domain system.
Neteka's technology employs the DNSII bit flagging approach as
well as the EDNS approach to transport and identify multilingual
requests. The DNSII approach also allows the tagging of the encoding
of the requested string, making it more precise and effective.
With all three approaches pre-installed
into Neteka's NeDNS, it is immediately deployable as a server
end solution for registries to offer multilingual names, and
prepared for the migration towards a longer-term solution, whatever
stream it might turn out to be.
|
| Register.com |
Two general types of technologies attempt
to enable the use of non-Latin scripts as domain names. The first
of these approaches involves the transmission of native encodings
as part of the DNS labels used within queries and/or responses.
Native encodings are character encodings, such as ISO-foo or
Shift-JIS used to represent non-Latin scripts. These encodings
are generally 8-bit and always involve the use of characters
outside those permitted within DNS labels by RFC 1034 [RFC1034].
The second type of approach is the conversion
of IDNs into a domain name that conforms with RFC 1034. These
approaches involve the use of ASCII-compatible encodings (ACEs)
of non-Latin scripts. Generally, ACE-based proposals involve
both the compression of non-ASCII data as well as a transformation
into an RFC 1034 compliant string.
For both of these approaches, various parties
have proposed a variety of specific implementations. Internet
Drafts currently describe several ACEs, as well as various approaches
that describe the use of ACEs within various parts of the DNS.
Different approaches using 8-bit character transmissions within
the DNS have also been described, including on-the-wire transmission
of native encodings as well as common formats such as UTF-8.
Finally, some more radical approaches,
such as the creation of a new DNS class or the use of a new directory
layer to replace traditional DNS functionality, have been suggested.
|
| JPNIC |
We follow IETF
IDN WG discussion. The application solution complies with the IDNA architecture,
with NAMEPREP and ACE. |
| TWNIC |
(1) Interim case:
a. Using NAMEPREP to convert IDN into English
domain name (ACE encoding) for IDN resolving.
b. Setting up a DNS (web) proxy to support
IDN resolving. The DNS (web) proxy converts the IDN into an English domain
name.
c. Supporting various zone file encoding
in server side.
(2) Test bed case:
a. Modifying BIND software to support clean
8 bits (native encoding) and UTF-8 encoding.
b. Modifying related software: Apache,
Squid, etc., to support clean 8 bits (native encoding) and UTF-8
encoding.
|
2. What are the
strengths and weaknesses of the technologies referenced in Question
1? Please give concrete examples.
| WALID |
A number of the proposed approaches described
above treat the problem of IDN as if it were a 'DNS protocol
problem', instead of a 'domain name problem'. That is to say,
if the DNS protocol or infrastructure can be changed to support
non-Latin scripts, then the problem would be solved. The rough
consensus of the technical community, however, is that this approach
is fundamentally incorrect, and perhaps the approach stems from
a view that the only applications running on the Internet that
need to be considered are web browsers and web servers (and sometimes
only particular web browsers or servers).
WALID would suggest that any approach to
IDNs must take into account the entire deployed base of protocols,
applications, and implementations that run on the Internet today,
many of which are crucial for ensuring the stability, security,
and operation of the network. Many of these protocols and implementations
will not support characters outside of the LDH (letters, digits,
and hyphen) set, either in forward or reverse resolution contexts.
These fundamental issues aside, approaches
that focus on changes to the infrastructure, whether by deploying
a new protocol, new servers, or new types of server applications,
face the inertia associated with the deployed nameservice infrastructure.
The DNS is everywhere, and attempting to make significant changes
to the DNS as a whole would likely take at least a decade for
complete deployment, risking the creation of islands of dis-connectivity
in the process. Infrastructure-based approaches also suffer from
the problem that updates are difficult to deploy. In this respect,
one need only consider the large numbers of very old BIND distributions
still in operation with serious known security vulnerabilities.
The approaches above that would send Unicode
data directly (typically in the UTF-8 encoding) also ignore the
issues relating to name equivalence, and ultimately would create
a serious security problem, given that many applications and
protocols rely on the DNS for performing authentication and authorization
checks. Many Unicode codepoint sequences, which are visually
identical, can be different at a binary level, creating the opportunity
for a malicious user to fool someone into connecting to a different
host than the one they think they are connecting to. At a more
basic level, without some sort of canonicalization step during
resolution, many users will have a difficult time making IDNs
work reliably. Within the IETF, this requirement has been called
the 'business card test'.
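A small sketch of the equivalence problem described above: two strings that render identically can differ at the code point and octet level, and a canonicalization step makes them compare equal. The sample word is an arbitrary choice, and NFC is used here as an assumption for illustration; the IETF nameprep work specifies NFKC together with case mapping.

```python
import unicodedata

COMPOSED = "caf\u00e9"     # "é" as a single code point (U+00E9)
DECOMPOSED = "cafe\u0301"  # "e" followed by a combining acute accent (U+0301)


def same_name(a: str, b: str) -> bool:
    """Compare two names after canonical normalization (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(COMPOSED == DECOMPOSED)           # raw comparison
    print(same_name(COMPOSED, DECOMPOSED))  # comparison after normalization
```

Normalization of this kind addresses equivalent sequences within one script. It does not, by itself, address look-alike characters drawn from different scripts.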
WALID's approach to IDNs, currently in
use as part of the VeriSign GRS multilingual testbed, is to perform
a canonicalization and transformation process of the IDNs on
the end user's system. IDNs are normalized to address the equivalence
problem described above, and encoded using a transformation algorithm
from Unicode into the subset of ASCII permitted in DNS host labels.
IDNs that are presented to the DNS for resolution thus use the
same range of characters as standard domain names. The significant
advantage to this approach is that no changes are made to the
deployed base of infrastructure systems, and the operational
stability of the network is not compromised. Our experience in
working with ccTLD registries has shown that infrastructure-based
approaches, by contrast, are quite unworkable, both because of
the inertia associated with the DNS resolution infrastructure
and the large numbers of web proxy servers that are on the network.
To deploy the ACE-based approach completely,
applications which process DNS hostnames will need to be upgraded
to handle IDNs. In the short-term, a mechanism must be widely
deployed to enable immediate resolution of IDNs in the applications
that end-users use most often, such as web browsers and e-mail
applications. WALID is addressing these needs by making freely
available for download its WALID WorldConnect technology,
to enable immediate resolution of IDNs, and its WALID WorldApp
to enable application developers to incorporate standard IDN
transformation capabilities into their applications.
|
| Verisign |
Methods one and two above involve an application
sending binary (i.e., non-ASCII) data through an infrastructure
not designed to handle it, which certainly has the potential
to cause problems. Application protocols, such as SMTP, call
for domain names to be encoded in ASCII. Not all DNS resolvers
and name servers are "8-bit clean" (i.e., able to handle
binary data without issues). The deployed base is huge, with
endless combinations of components, and it is impossible to test
every scenario for its ability to handle binary data. We do not
know of any completed studies, although MINC is planning such
testing. (Please see http://www.minc.org/WG/testing/interop/.)
For this reason, the IETF Internationalized
Domain Name (IDN) Working Group has focused on the Internationalized
Domain Names in Applications (IDNA) solution, which involves
transforming internationalized domain names (as described in
method three above) at the application level, so that they can
be sent in application protocols and through the Internet's DNS
infrastructure in a known safe format.
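The "known safe format" can be checked mechanically. The sketch below is a minimal test of whether a name uses only the letters, digits, and hyphens that deployed application protocols expect. The function name is hypothetical, and the rules are a simplified reading of the traditional hostname syntax rather than a complete validator.

```python
import string

LDH = set(string.ascii_letters + string.digits + "-")


def is_ldh_name(name: str) -> bool:
    """True if every label uses only letters, digits, and hyphens,
    is 1 to 63 characters long, and neither starts nor ends with a hyphen."""
    labels = name.rstrip(".").split(".")
    return all(
        0 < len(label) <= 63
        and set(label) <= LDH
        and not label.startswith("-")
        and not label.endswith("-")
        for label in labels
    )
```

A name already converted to an ACE form passes this check, while the same name in a local encoding or UTF-8 does not.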
|
| Neteka |
Brute Force Approach - The advantage of this approach is that multilingual
names could be deployed immediately at the server end to parse
the multilingual name information and be reachable by a good
percentage of users over the Internet. However, it also carries
a considerable number of disadvantages that could cause
inconsistent responses. These include character encoding conflicts
as well as proxy and application blockages. Character encoding
conflict is particularly prominent. The same encoding
value could represent entirely different characters if a different
encoding scheme is used. Conversely, the same characters
might be represented with different encoding values under different
schemes. Both of these issues lead to problems for the DNS, where
names must be unique and the user expects to be taken
to the same domain regardless of input mechanism.
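The two-way conflict described above can be reproduced with Python's built-in codecs. The octet pair and the sample character below are arbitrary choices made for illustration.

```python
RAW = b"\xa4\xa4"  # two octets as they might arrive in an untagged query

# Same octets, different characters, depending on the assumed scheme.
AS_BIG5 = RAW.decode("big5")
AS_GB2312 = RAW.decode("gb2312")

# Same character, different octets, depending on the scheme used to send it.
CHAR = "中"
ENCODINGS = {name: CHAR.encode(name) for name in ("big5", "gb2312", "utf-8")}

if __name__ == "__main__":
    print(ascii(AS_BIG5), ascii(AS_GB2312))
    for name, data in ENCODINGS.items():
        print(f"{name:7} {data!r}")
```

Without a tag that names the encoding, a server receiving these octets cannot tell which character was intended.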
Protocol Extension Approach - The common advantage of using a protocol approach
is that the efficiency of the DNS is not compromised at all and
that there will be no ambiguity as to the exact characters a
domain name query is referring to. Also, with the introduction
of an extension, versioning and future extensions could also
be built in. In essence, a protocol extension approach is generally
considered a better long-term solution for multilingual domain
names. The common disadvantage of the protocol approach however
is that it requires changes and upgrades from both the server-end
as well as the client/application-end. This may result in the
slower adoption of the system.
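As a sketch of what "flagging" within the DNS packet could look like, the code below packs a 12-octet query header and sets one reserved bit of the flags word as a multilingual marker. The choice of bit and the name `ML_FLAG` are assumptions made for illustration; they are not taken from any of the proposals discussed in this appendix.

```python
import struct

RD_FLAG = 0x0100  # recursion desired (RFC 1035)
ML_FLAG = 0x0040  # a bit of the RFC 1035 "Z" field; its use here is hypothetical


def build_header(query_id: int, multilingual: bool) -> bytes:
    """Pack a DNS query header, optionally setting the illustrative flag."""
    flags = RD_FLAG
    if multilingual:
        flags |= ML_FLAG
    # Fields: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    return struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)


def is_flagged(header: bytes) -> bool:
    """Report whether the illustrative multilingual flag is set."""
    _, flags = struct.unpack("!HH", header[:4])
    return bool(flags & ML_FLAG)
```

The disadvantage noted above is visible here: both the sender and every receiver must agree on the meaning of the bit before it is of any use.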
ASCII Conversion Approach - The most prominent advantage of using an ASCII
conversion scheme is that no change is necessary at the server
end, because servers will continue to expect and react to requests
that are formatted within the specifications of the original
DNS standards. Conversely, the major disadvantage is that users
that wish to use multilingual domain names must consciously upgrade
their software to be able to reach the multilingual domains.
The average user however is not likely to be technically sophisticated
and would expect multilingual domain names to function the same
as English only ones. Also, it introduces an additional procedure
in domain resolution and takes away the feature of the DNS to
keep the transportation format consistent with the presentation
format of domain names.
Hybrid Solution -
It makes the most economic sense for implementers to tackle the
issue with an all-inclusive hybrid approach, because the effort
put into developing the solution will then not be wasted.
On one hand, the inclusion of a brute force approach ensures
that once multilingual domains are deployed, a good number of
users could immediately be able to access and utilize these names.
On the other hand, more alert users or early adopters would likely
embrace the ACE technology and already have converters installed;
therefore, to take care of these requests, the database should
include an ACE formatted record. Finally, the system should be
made aware of the eventual protocol approach, where the incoming packet
would effectively announce the exact encoding scheme and format
of the multilingual name. By developing a three-fold strategy,
the implementer may be able to assure that it will be prepared
for any situation that might transpire out of the dynamic standardization
process now underway.
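A minimal sketch of the server-side idea behind such a hybrid: an incoming label in ACE form, in UTF-8, or in a local encoding is mapped to a single canonical key before lookup. Everything here is an assumption made for illustration. Python's `idna` codec stands in for the ACE, Big5 stands in for the registry's configured local encoding, and the zone data is invented.

```python
from typing import Optional

LOCAL_ENCODING = "big5"   # assumed local encoding for this registry
ACE_PREFIX = b"xn--"      # prefix of the stand-in ACE (Python's idna codec)

ZONE = {"中文": "192.0.2.1"}  # invented record; documentation address range


def canonical_label(raw: bytes) -> str:
    """Map an incoming label in any of the three forms to one Unicode key."""
    if raw.startswith(ACE_PREFIX):
        return raw.decode("idna")          # ACE-formatted request
    try:
        return raw.decode("utf-8")         # brute-force UTF-8 request
    except UnicodeDecodeError:
        return raw.decode(LOCAL_ENCODING)  # brute-force local encoding


def resolve(raw: bytes) -> Optional[str]:
    """Look up the record for a label, whatever form it arrived in."""
    return ZONE.get(canonical_label(raw))
```

The fallback order is a guess: octets that happen to be valid in more than one encoding are read as UTF-8. That ambiguity is what an explicit encoding tag in a protocol extension would remove.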
As the Internet matures, it should no longer
be a purely technology push mechanism for implementing new features,
but should also consider the customer pull factor. In the hybrid
deployment model, first the brute force approach is used so that
registries could begin allowing registrants to obtain functional
multilingual domains and use them immediately without any client-end
reconfiguration. Only the registry name servers and the registrant's
hosting server need to be upgraded. As the need for accessing
multilingual domains increases, users will become more aware and knowledgeable
about the ACE approach, which provides a good consolidation
towards a common protocol and makes administration much easier.
Eventually, this would encourage middleware and other applications
to upgrade to the protocol approach to make the entire process
much more efficient and truly multilingual aware.
|
| Register.com |
The advantage of 8-bit character transmission
is that these approaches seem to be the most simple and elegant
solutions. These approaches allow fairly direct representations
of IDNs and may allow DNS data to be human-readable for those
with terminals capable of recognizing and displaying the relevant
character encodings. Unfortunately, although the DNS protocol
itself allows for the transmission of 8-bit domain name data,
many of the application protocols that rely on the DNS were not
designed to handle such domain name data, and these protocols
would likely need to be individually re-designed in order to
provide IDN capability.
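The point that the DNS protocol itself can carry 8-bit data follows from its wire format, in which each label is a length octet followed by raw octets (RFC 1035). The sketch below encodes a name that way; the function name is hypothetical and the sample labels are arbitrary.

```python
def encode_qname(labels: list) -> bytes:
    """Encode labels in DNS wire format: a length octet, then the raw octets."""
    out = bytearray()
    for label in labels:
        if not 0 < len(label) <= 63:
            raise ValueError("each label must be 1 to 63 octets")
        out.append(len(label))
        out += label
    out.append(0)  # the zero-length root label terminates the name
    if len(out) > 255:
        raise ValueError("encoded name exceeds 255 octets")
    return bytes(out)


if __name__ == "__main__":
    print(encode_qname([b"www", b"example", b"com"]))
    print(encode_qname(["中文".encode("utf-8"), b"com"]))
```

Nothing in this framing restricts octet values; the restriction to letters, digits, and hyphens comes from hostname conventions and from the application protocols built on top of the DNS.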
ACE-based approaches generally provide
a high degree of compatibility, because they continue to use
RFC 1034 compliant labels to represent all DNS data. Some ACE-based
approaches have been designed which move all IDN work to the
application or local resolver, and as a result require no modification
of the name servers which are currently running. Such an approach
allows individual users to essentially "opt-in" to
the use of IDNs by installing updated software on their computers
without impacting other users or affecting the stability of the
network at large.
The more radical approaches mentioned above
offer the potential for significant elegance and potentially
large amounts of innovation, but the time to implement such solutions
is likely to be unacceptably long.
|
| JPNIC |
Strengths: It
can be realized with current protocols.
Weakness: It reduces the string size of each label. It requires
character set / encoding conversion. |
| TWNIC |
Test bed case has the following strengths
and weaknesses.
Strengths:
(1) Users do not need to download client
software.
(2) No ACE conversion needs to be performed
on the client side.
(3) The zone file is readable by the administrator,
so it is easy to maintain.
Weaknesses:
(1) Some Chinese characters contain the '\' or
'@' codes, which confuse Internet applications.
(2) Applications (such as BIND, Apache, firewalls, etc.)
do not support clean 8-bit DNS data.
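The first weakness can be demonstrated with a short check that lists characters whose second octet in a double-byte encoding falls in the ASCII range. The function name and the sample characters in the usage note are choices made for illustration.

```python
def ascii_trail_bytes(text: str, encoding: str = "big5") -> list:
    """Return (character, ASCII trail byte) pairs for two-octet characters
    whose second octet is below 0x80 in the given encoding."""
    hits = []
    for ch in text:
        data = ch.encode(encoding)
        if len(data) == 2 and data[1] < 0x80:
            hits.append((ch, chr(data[1])))
    return hits
```

For example, in Big5 the character 功 ends in the octet for '\' and 一 ends in the octet for '@', so software that treats those octets as an escape character or a mailbox separator will misread the name.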
|
3. Are there
more problems relating to particular scripts? Why?
| WALID |
The IETF IDN
Working Group and the Unicode Consortium have been investigating
the complexities associated with introducing non-Latin scripts
into the context of DNS hostnames, and attempting to ensure that
end-user expectations are met. We fully support the work of these
two expert organizations in this area. |
| Verisign |
Experts in these
languages and scripts are in the best position to answer this
question. |
| Neteka |
In general, scripts
with more local encoding schemes are more problematic initially
for quick deployment of multilingual domain names. Other language
issues are local script dependent. For example, there is the
traditional and simplified Chinese issue. Part of the debate
is whether a folding or mapping should occur automatically and
built into the IDN protocol. This coupled with conflicting local
character encoding schemes also makes the deployment of Chinese,
Japanese and Korean scripts more difficult. Neteka's perspective
on the Chinese character folding issue is that it should be a
policy matter and controlled during registration and be dependent
on the registry policy. ICANN should however provide guidelines
as to what the issues are and suggest a number of alternatives
to solve the problem. Other languages also have their own
issues, such as Arabic, where spaces within phrases change the
meaning and the form of a character, and Hebrew, where characters
could be omitted, etc. |
| Register.com |
Essentially,
the more different scripts are from traditional Latin scripts,
the more likely problems are to occur. Languages such as Chinese
and others that use the Han ideographs can be problematic due
to the sheer size of the character repertoire. Some languages
have a large number of encodings to represent essentially the
same character set, which can make it problematic to identify
and transform raw data into a common, universally understood
format. |
| JPNIC |
ACE requires
Unicode as its base character set, but many PCs use a local character
set such as JIS. This causes normalization problems, because character
set conversion can be a 1-to-N mapping. |
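One concrete flavor of this conversion problem can be shown with Python's built-in codecs: the same Shift-JIS octets map to different Unicode code points depending on which vendor mapping table is used. This is offered as an illustration only; it may not be the exact case JPNIC had in mind, and the sample octets are an arbitrary choice.

```python
SAMPLE = b"\x81\x60"  # one double-byte Shift-JIS symbol (a wave-like dash)


def decode_variants(raw: bytes) -> dict:
    """Decode the same octets with two Shift-JIS mapping tables."""
    return {name: raw.decode(name) for name in ("shift_jis", "cp932")}


if __name__ == "__main__":
    for name, text in decode_variants(SAMPLE).items():
        print(f"{name:10} {ascii(text)}")
```

If two clients convert the same local input to different code points, their ACE forms differ as well, unless a normalization step maps both to one form.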
| TWNIC |
The second byte
of Big5-encoded characters can fall within the ASCII range, which
may cause the DNS to return erroneous data (DNS software treats
ASCII letters case-insensitively, so those bytes may be altered). |
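The effect TWNIC describes can be sketched as follows: if software lowercases ASCII letters octet by octet without knowing the data is Big5, a trail byte in the range 'A' to 'Z' changes and the character becomes a different one. The function names are hypothetical.

```python
def fold_ascii_case(raw: bytes) -> bytes:
    """Lowercase ASCII letters octet by octet, as case-insensitive
    software might, with no knowledge of the underlying encoding."""
    return raw.lower()


def survives_folding(ch: str, encoding: str = "big5") -> bool:
    """True if the character's octets are unchanged by ASCII case folding."""
    raw = ch.encode(encoding)
    return fold_ascii_case(raw) == raw


if __name__ == "__main__":
    original = b"\xa4\x41".decode("big5")  # trail byte is ASCII 'A'
    folded = fold_ascii_case(b"\xa4\x41").decode("big5")
    print(ascii(original), ascii(folded), original == folded)
```

Two distinct Big5 characters can therefore collapse into one name, or a registered name can become unreachable, depending on where in the path the folding happens.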
4. To the extent
there are weaknesses in the technologies, what groups are working
to develop solutions?
| WALID |
The group most active in addressing the
need for technical standards to support IDNs is the Internet
Engineering Task Force (IETF). The IETF IDN Working Group has
made considerable progress in the past year in defining an overall
set of technical and operational requirements for internationalized
domain names, has vetted a broad set of technical proposals,
and has chosen an approach consistent with those requirements.
Many IETF participants are also active in the Unicode Consortium,
the W3C, and other standards bodies, and the IETF IDN Working
Group and IDN community as a whole benefit from their experience,
coordination, and support.
The Multilingual Internet Names Consortium
(MINC) has also been active in developing policy in the IDNs
area, as well as providing a forum for performing interoperability
testing. While this has been somewhat less successful, MINC provides
a good forum for representing the interests and needs of its
broad constituency and can support the efforts of the IETF and
other technical standards bodies. As MINC moves forward with
its mandate, we expect to see MINC play an important role in
supporting the deployment of internationalized domain names and
in promoting cooperation, compliance, and interoperability between
the systems that are deployed today.
|
| Verisign |
We share the
opinion of others in the IETF IDN Working Group that the issue
that should be tackled is internationalizing domain names, not
internationalizing the DNS protocol. Thus the issue is broader
than some "quick fixes" or partial solutions advocated
by some technology providers, such as simply upgrading DNS clients
and servers. Any complete IDN solution must involve end-user
applications, such as web browsers, as well. The IDN Working
Group is developing standards for IDN and is, we believe, the
primary focal point for a complete solution. |
| Neteka |
Neteka's DNSII (www.dnsii.org) and OpenIDN
(www.openidn.org) initiatives encourage and allow more people
to be involved in this important transition on one of the core
technologies of the Internet. DNSII is a forum for discussing
different multilingual approaches and currently archives Neteka's
proposals. OpenIDN is an open source multilingual DNS, allowing
interested parties to try out multilingual names and to use
the source code to enhance the features on their own.
IETF is mainly concerned with the protocol
and tries to determine which approach to use and what the eventual
format should look like.
MINC is a quasi-iDNS initiative started
by iDNS advocates. The discussion includes both protocol issues
as well as language or policy issues. In Neteka's view,
both these functions are already carried out by the IETF and ICANN,
and the responsibility should really go back to these two bodies,
which can take a more comprehensive view of the problems and therefore
provide better results.
|
| Register.com |
The IETF continues
to work on a variety of issues surrounding the IDN problem space. |
| JPNIC |
JPNIC IDN Taskforce,
JP-CN-KR-TW NIC's Joint Engineering Team and IETF IDN WG. |
| TWNIC |
TWNIC Chinese
technology task force, CDNC, JET, IETF, etc. |
5. What are the
different solutions under consideration? Which are the most promising?
How much longer will it take to develop a solution that works?
| WALID |
The current proposal before the IETF IDN
working group is "Internationalizing Domain Names in Applications
(IDNA)." From a technical standpoint, we understand that
the WG has established rough consensus around the core concepts
of normalization and transformation taking place within the application.
Assuming that certain non-technical issues are resolved, the
IETF could have a standard ready by the end of 2001.
The consensus in the IETF IDN working group
is not complete, however, and some have suggested that the working
group is failing to consider questions relating to language and
language use, and the expectations of end-users of the DNS. While
these questions are certainly important, we are not convinced
that they concern issues that can or should be solved by the
DNS. Many participants feel that these questions are outside
of the scope of the charter of the IETF IDN working group, which
is focused on enabling use of non-Latin scripts in the DNS, and
thus should be addressed separately.
|
| Verisign |
The work of the
IDN Working Group is public; more information is available at
http://www.i-d-n.net/. The most promising proposal is called
IDNA (Internationalized Domain Names in Applications), which
calls for applications to convert IDNs to an ASCII-only "safe"
format using an ACE (ASCII Compatible Encoding). More details
are available in the IDNA Internet-Draft at http://www.i-d-n.net/draft/draft-ietf-idn-idna-01.txt. |
| Neteka |
IETF - while a good number of proposals
have been presented to the IETF, discussions have until recently
centered on the IDNA (Internationalized Domain Names in Applications)
approach. This, however, collides
with a patent issued to WALID. Recent discussions have included
ways to work around the patent as well as hybrid approaches.
Neteka - Neteka is a proponent of a hybrid
approach to ensure that the migration is transparent to the end
user and smooth for the operators. We believe this is the most
promising approach in that it already works for the majority
of the people on the Internet immediately. It also provides a
clear path towards the longer-term approach where the entire
Internet will become fully multilingual aware. Neteka's system
is also compatible with email addressing systems, and Neteka already
has the technology to introduce multilingual email addresses.
iDNS - the iDNS Proxy solution assumes
that multilingual domain names are redirected to the iDNS servers
for resolution. This creates a bottleneck for the system and
introduces unnecessary complications.
WorldNames - as far as Neteka understands,
WorldNames' NUBIND, currently implemented at the dotNU registry,
is essentially a redirector technology, and multilingual names
registered using this system cannot be used for email
addresses.
|
| Register.com |
As indicated
above, there are a number of solutions currently under consideration.
Currently, the [IDNA] solution proposed within the IETF's IDN
working group seems extremely promising; recently, however, intellectual
property concerns have slowed the development of that particular
approach. More generally, ACE-based solutions seem to
have the greatest traction and operational experience to date,
and the advantages that they yield in backwards compatibility
are probably a strong argument in their favor. A final solution
to this problem space still seems to be at least six months away. |
| JPNIC |
We know of none other than
the above. |
| TWNIC |
(1) UNAME:
http://www.ietf.org/internet-drafts/draft-ietf-idn-uname-00.txt
Common Name Resolution Protocol + DNS solution: http://www.ietf.org/internet-drafts/draft-ietf-cnrp-09.txt
(2) It depends on when the IETF finalizes the
RFC; after that, it would take one or two years.
|
6. Currently
there are no accepted standards for IDN. Is this because there
are competing technologies, or because the underlying problem
is sufficiently difficult that a "best" solution has
not yet emerged?
| WALID |
WALID believes
that competition is a healthy and necessary part of the development
of any emerging industry, and a useful tool for providing real-world
experience concerning the viability of various approaches to
solving a given technical problem. The 'IDN Subject' is certainly
a complex one, and some have characterized it as one of the most
difficult challenges that the Internet technical standards community
has faced. The IETF and other standards bodies have made extremely
good progress in addressing it within a relatively short time. |
| Verisign |
The IETF IDN
Working Group is moving relatively quickly to produce an IDN
standard. |
| Neteka |
While competing technologies imply that
there is no de facto standard, it is because some initial attempts
are not satisfactory that competing technologies arise. This
is therefore a multifold issue: first of all, opinions differ
on what the "best" solution should be. The
underlying problem is sufficiently difficult that there have
to be compromises, and a decision can only be made by
giving more consideration to some key issues and focusing less
on others. Unfortunately, it is very difficult to build a consensus
on which of the many issues these "key issues" should
be. There are really three main camps:
a. System administrators/operators - this
group generally has the view that the allocation of pain should
be on the user and that it is absolutely important that the servers
are not threatened by multilingual requests even though they
might not break down. They also view that server-side migration
would be lengthy.
b. End users - there are two groups within
this sector: the registrants and the users of domain names. Both
of these groups are eager to have functional multilingual domain
names without requiring client reconfiguration. They expect multilingual
names to work exactly like English only names and will be confused
and frustrated if they are not. They are also technically less
sophisticated and may not understand why and what needs to be
done to get multilingual names working. Therefore they also believe
that the allocation of pain should be on the server end where
the technical expertise is.
c. Technologists - these are the design
engineers and architects who believe that a long-term solution
should be made extensible and cater not only to the operators but
also the end users. They have the view that eventually both servers
and clients should be upgraded to enable a fully multilingual
Internet. The servers should be first as that is where the technical
expertise is, while the client end would slowly migrate as new
applications are introduced. Meanwhile existing applications
should also be able to access multilingual names.
|
| Register.com |
The real problem
is that there is no ideal solution. All proposed solutions to
date have drawbacks, and it has been difficult to develop consensus
about which of these drawbacks is the most tolerable. The underlying
problem is indeed an extremely difficult one, and even if a "best"
solution has emerged, it will take time and careful study in
order to recognize and adopt it. Also, because of the critical
nature of the DNS to the Internet community, it is important
to develop an in-depth understanding of the pros and cons of
all possible solutions, and to move towards adoption in a manner
that does not jeopardize the stability of the Internet. |
| JPNIC |
The IETF IDN WG has
reached consensus, as described in our answer to Q1, so the WG is finalizing
the proposed technologies and will move the result onto the standards
track. The most worrying hurdle in the process is WALID's patent
issue. |
| TWNIC |
Both of them. |
7. Do the existing
"testbeds" and pre-registrations help or hinder the
resolution of the technical issues relating to IDN? In what manner?
Would the testing impact the ongoing operation of the Internet?
| WALID |
Testbeds are
an important mechanism for gathering useful operational experience
in this area, and help to gauge demand and user expectations
for IDNs. Some of the testbed projects underway have been very
careful to not disturb the use of the existing DNS, while others
have not been as focused on the operational stability requirements
of the network. |
| Verisign |
A testbed that supports the IDN standard
development process, such as VeriSign's testbed, is helpful.
For example, the VeriSign GRS IDN testbed has offered technical
feedback to the IDN Working Group on the complexity of the Row
Based ASCII Compatible Encoding (RACE) algorithm (one possible
ACE). Partially based on this feedback, the IDN Working Group
has decided that RACE is not suitable for use in the eventual
IDN standard.
In addition, the VeriSign testbed has been
conducted in a progressive, phased approach. This allows for
the completion of predefined milestones before moving to subsequent
phases and thereby reduces the possibility of creating DNS stability
problems.
It is difficult to imagine how a testbed
could interfere with the operation of the Internet. It is highly
unlikely that even a testbed that uses domain names in a binary
format (unlike VeriSign's testbed) would negatively impact the
Internet's DNS infrastructure (including the root and gTLD name
servers). Because so many applications already send DNS queries
in one binary format or another, the root and gTLD name servers
are already deluged with such queries as part of the normal DNS
resolution process, all with no impact aside from the additional
volume.
|
| Neteka |
Depending on how the domain resolution
strategy is eventually deployed, pre-registrations should not
hurt the introduction of multilingual names. Other so called
functional "testbeds" may hinder the progress, especially
the establishment of alternative namespace beyond that recognized
by ICANN. This is a very serious issue as these "testbeds"
would redirect all multilingual requests to their own alternative
namespace meaning that even if later on the existing namespace
introduces multilingual names, the requests under the "testbed"
system will be routed to the alternative namespace causing confusion.
Pre-registration however is safer as it essentially means that
the multilingual name is only stored in a database and not being
used. Any technical solution could be deployed later for domain
resolution. It also serves to be an indicator of user demand.
Even when users know that these names do not work, a lot of people
are registering for them in the hope that they will be able to
use them soon.
Beyond the "testbeds" and pre-registrations,
Neteka in fact views the faulty implementations in existing
browsers and unnecessary blockages at proxies, cache servers
and firewalls as an even larger hindrance to the implementation
of multilingual domain names. Please refer to section A:16 for
more information.
|
| Register.com |
The operational experience gained from
legitimate testbeds can be extremely helpful in moving solutions
from theory into practice. Due to the large number of commercial
interests in play, however, some of these testbeds might be seen
as attempts to force the Internet community to accept certain
technologies regardless of their appropriateness or quality. Generally,
policy considerations have lagged behind technology in the IDN
space, and as a result there have sometimes been inadequate assurances
that testbeds serve the internet community by providing valuable
operational experience as opposed to benefiting certain commercial
interests at the expense of technology.
Generally, testbeds should not affect the
ongoing operation of the Internet. It is important, however, that end users'
expectations of these testbeds be managed carefully: these
users may be under the impression that the testbeds may be an
operational portion of the Internet, and may view technical failures
within the testbeds as operational problems rather than a normal
part of the testing process.
|
| JPNIC |
They provide
a lot of 'real samples' for evaluating proposed technologies such
as ACEs, which are useful for listing issues. The impact on
operation is that DNS or Web server operators must learn how
to convert IDNs to ACE. Testing provides a good opportunity to learn
this. Testing also provides much information about IDN to end-users,
engineers, developers, and service providers. |
| TWNIC |
Commercial promotion
on a test bed product is not good. It is better to provide service
until the standard of IDN is ready. But if the local testbed
does not affect Internet stability, it would help
IDN development. |
8. Are natural
languages so complex, rich and varied that a true IDN system
that responds completely to user expectations is beyond current
technological capability? Can the problem be solved incrementally
in a manner that does not interfere with the operation of the
entire domain name system?
| WALID |
The IDN problem
in our view is not one of natural language, but rather one of
adding support for a wider range of scripts to be used as identifiers
in the DNS. As such, issues involving natural language and the
often context-sensitive expectations of users are outside of
the scope of the IDN-related efforts currently underway. Some
within the community have proposed creating directory service
layers above the DNS to meet the expanding needs, and we strongly
encourage and support any work in this direction. Natural language
issues are language- and locale-specific, and any proposals to
address them should be developed based on participation by native
members of the locale as well as general linguistics expertise. |
| Verisign |
The IDN Working Group is already developing
a technical solution to support a true IDN system.
As noted above, we support the introduction
of IDNs in a phased manner that does not risk interference with
the operation of the DNS.
|
| Neteka |
This depends on the perspective of what
constitutes a "domain name". Some technical persons
maintain that a domain name is nothing more than a string of
characters for the identification of a resource over the Internet.
Neteka however believes that domain names have evolved from their
origins to represent the identity of a person or a corporation
on the Internet, whether used as part of an email
address or simply a web address. Natural language rules can definitely
be introduced to the DNS; Neteka's technology has shown that
the use of phrases, punctuation and even spaces is possible.
Therefore a fully natural language domain name is possible.
However, it is important to also understand
that the domain name system is useful because of unique names
and this rule should not be violated or confusion would occur.
The same phrase must result in the same resource regardless of
which locale or platform it is accessed from. This means that
some user education is required so that users understand that Mikeshoes.tld
may not be the same Mikeshoes as the one in your local mall.
|
| Register.com |
The domain name
system was never intended to serve as a directory service with
the capability to consistently find the appropriate result to
a natural language query. Although the original design of the
DNS includes certain characteristics which are designed to reduce
language-related errors (for example, case folding, or even the
original limitation of domain names to use only ASCII characters),
it still is not capable of distinguishing between variants of
words (e.g. "color" versus "colour") or appreciating
the other subtle nuances of language. Regardless of the IDN solution
that eventually emerges, it will be important to educate users
regarding the use of the Internet. A good IDN solution will not
solve natural language problems, but will allow many more users
to take advantage of the Internet using their native language
and their native character sets. |
| JPNIC |
We believe that
IDN doesn't introduce 'languages' to DNS, but introduces non-alphanumerical
scripts or characters. |
| TWNIC |
Natural language
will not usually be used as a domain name; users may use natural language
in a search engine to find data. But proper normalization
in the DNS is required, even though it is very difficult. |
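Register.com's point about case folding in Question 8 can be illustrated with a short sketch. This is an illustration only and not part of any respondent's submission: the function name `dns_names_equal` is hypothetical, and the comparison is deliberately limited to traditional ASCII names, which the DNS matches without regard to letter case.

```python
def dns_names_equal(a: str, b: str) -> bool:
    """Compare two ASCII domain names label by label, ignoring letter case."""
    def labels(name: str):
        # encode("ascii") raises UnicodeEncodeError for non-ASCII input,
        # which keeps this sketch limited to traditional domain names.
        return name.rstrip(".").encode("ascii").lower().split(b".")
    return labels(a) == labels(b)


# Case differences are folded away ...
print(dns_names_equal("Example.COM", "example.com"))
# ... but spelling variants remain distinct names.
print(dns_names_equal("color.example", "colour.example"))
```

Case folding is a fixed, mechanical rule, which is why it fits in the DNS; deciding that "color" and "colour" mean the same thing would require language knowledge that an identifier system does not have.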
9. How do different
technologies affect the size limitation of domain names? What,
if any, are the possible solutions?
| WALID |
Domain name segments are limited to 63
octets per segment, and an overall domain name length of 255
octets. In the context of the ACE-based proposals, Unicode codepoints
can expand to multiple octets, thus reducing the number of actual
non-Latin characters that can be used in a domain name. Even
in non-ACE proposals (particularly those that rely on UTF-8)
this same issue exists.
There are a number of proposals under consideration
by the IETF IDN working group to address this issue through efficient
encoding of Unicode sequences. The challenge in this area is
to find an encoding algorithm that is both very efficient yet
relatively simple to describe and implement. The current draft
before the IETF IDN Working Group ACE design team comes very
close to meeting these requirements.
|
| Verisign |
Domain names
are limited to 255 octets in length and individual labels (i.e.,
between periods) are limited to 63 octets. This is a fundamental
limitation of the DNS protocol and cannot be changed without
altering the DNS protocol. Different representations of different
character sets require more or fewer octets depending on their
design. For example, UTF-8 is a variable length encoding of the
Unicode character set. In a given number of octets, some scripts
require more space than others. The IDN Working Group has been
sensitive to this issue during the design of the various ACE
algorithms that are candidates for inclusion in the final IDN
standard. A requirement of the final ACE algorithm is a roughly
equal treatment of all scripts in Unicode. |
| Neteka |
- Brute Force Approach: uses the existing packet format and therefore
allows a maximum of 63 bytes per label. Depending on the byte length
of the character encoding scheme used, the number of characters
possible ranges from 63 down to 15.
- Protocol Extension Approach: a new size limit could be introduced, so length
becomes a non-issue.
- ASCII Conversion Approach: uses the existing packet format. Depending on the compression
scheme, the length per label ranges from 15 to 20 characters.
|
| Register.com |
Because they
transform eight bit characters into what is approximately a five
bit (37 possible values) storage format, ACE-based solutions
generally reduce the number of native characters that may be
present within a single DNS label. Most of the existing ACE proposals
contain compression mechanisms in order to increase the size
of the native domain name as much as possible. |
| JPNIC |
As answered in
Q2, ACE reduces the size of each label. Therefore ACE must involve
effective compression algorithm. JPNIC is evaluating many ACEs
and contributing to IDN WG ACE team. |
| TWNIC |
ACE encoding imposes
a tighter length limit. Native encoding (a local encoding
such as Big5) imposes a looser length limit on domain names. |
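The figures quoted in the responses to Question 9 can be reproduced with some simple arithmetic. The sketch below is a rough estimate under stated assumptions, not any respondent's algorithm: the function names are hypothetical, the ACE estimate assumes a 4-character prefix, 16-bit code units and no compression, and the 37-symbol alphabet is the one Register.com mentions.

```python
import math

MAX_LABEL_OCTETS = 63   # per-label limit cited in the responses
MAX_NAME_OCTETS = 255   # whole-name limit cited in the responses


def max_chars_raw(bytes_per_char: int) -> int:
    """Characters that fit in one label when each takes a fixed number of octets."""
    return MAX_LABEL_OCTETS // bytes_per_char


def max_chars_ace(alphabet_size: int = 37, bits_per_char: int = 16,
                  prefix_len: int = 4) -> int:
    """Rough upper bound on native characters in one uncompressed ACE label.

    Assumptions (not from the survey): a fixed-length ACE prefix, fixed-width
    code units, and no compression step.
    """
    payload_symbols = MAX_LABEL_OCTETS - prefix_len
    payload_bits = payload_symbols * math.log2(alphabet_size)
    return int(payload_bits // bits_per_char)


def utf8_label_fits(label: str) -> bool:
    """True if the label's UTF-8 form stays within the 63-octet limit."""
    return len(label.encode("utf-8")) <= MAX_LABEL_OCTETS


for width in (1, 2, 3, 4):
    print(width, "octet(s) per character:", max_chars_raw(width), "characters")
print("uncompressed ACE estimate:", max_chars_ace(), "characters")
```

With one to four octets per character the raw limit falls from 63 to 15 characters, matching Neteka's range, and the uncompressed ACE estimate lands inside the 15 to 20 character range quoted for the ASCII conversion approach. Compression in a real ACE would move the figure for a given script.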
10. Do IDNs
pose special problems for the technical operation of WHOIS databases?
If so, what problems? What are the possible solutions?
| WALID |
Access to the
WHOIS public registration databases tends to be provided in two
ways: via web-based interfaces, and through the TCP port 43 whois/nicname
service. One of the challenges for operating a WHOIS database
will be in ensuring that queries arrive in a form that can be
accurately matched against the database contents. WALID considers
that a positive solution would be to use the IDNA approach and
upgrade the deployed 'WHOIS' clients. These upgraded applications
would need to normalize and transcode IDNs into their ACE equivalents,
and then use the transformed name as the query to the WHOIS server.
This is a strength of the IDNA approach, in that it addresses
not only the question of IDNs in the DNS, but also in all of
the applications, such as WHOIS, which use domain names as application
protocol elements. |
| Verisign |
No, although WHOIS services must be internationalized
if the domain names they hold are internationalized. One possibility
is internationalizing the WHOIS protocol itself, along with clients
and servers. Another is adopting the IDNA approach: IDNs would
be stored in an ACE format and WHOIS clients would be required
to convert internationalized user input into ACE format before
querying a WHOIS server.
VeriSign GRS is presently developing an
IDN Whois service. In the interim, an IDN conversion tool is
provided.
|
| Neteka |
Multilingual
domain names should not present special problems beyond those encountered
by the domain name server. Depending on the approach used, however, WHOIS
databases may need to be upgraded to handle multilingual
requests. For example, if a protocol extension approach is used,
the WHOIS side should determine whether the mode bit is required
or whether it should force all requests into a standardized format. |
| Register.com |
Generally, IDN
problems should not significantly affect the operation of the
WHOIS database. It may be necessary to display WHOIS data in
non-Latin scripts, but this problem can largely be viewed independently
of the IDN effort. |
| JPNIC |
The problems
with WHOIS are how names are expressed in queries and in display. The short-term solution
is to update WHOIS clients to be IDN-aware. The long-term solution is to
improve the WHOIS protocol. |
| TWNIC |
Some WHOIS databases
cannot accept clean 8-bit data or queries. The problem could be
solved if the IETF finalizes a standard for IDN support in WHOIS
databases as soon as possible. |
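The client-side approach that WALID and VeriSign describe for Question 10 (normalize the name, convert it to ACE, then query the WHOIS server on TCP port 43) can be sketched as follows. This is a minimal illustration under assumptions, not any respondent's implementation: the function names are hypothetical, the server address is supplied by the caller, and the conversion relies on Python's built-in `idna` codec, which implements the ACE that was standardized after this survey rather than any of the candidates discussed here.

```python
import socket


def to_ace(name: str) -> str:
    """Normalize an internationalized name and convert each label to ACE."""
    # The built-in "idna" codec applies normalization to non-ASCII labels
    # and leaves plain ASCII labels unchanged.
    return name.encode("idna").decode("ascii")


def build_whois_query(name: str) -> bytes:
    """Build the single query line a port-43 WHOIS server expects."""
    return to_ace(name).encode("ascii") + b"\r\n"


def whois_lookup(name: str, server: str, port: int = 43,
                 timeout: float = 10.0) -> str:
    """Send the ACE form of the name to a WHOIS server and return its reply."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(build_whois_query(name))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:          # server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


print(build_whois_query("bücher.example"))
```

Because the conversion happens before the query is sent, the server only ever sees ASCII, which is why this approach needs upgraded clients but no change to the WHOIS protocol itself.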
11. Are any
IDN-related technologies covered by patents or other intellectual
property rights? If so, will this have an effect on the implementation
of IDNs?
| WALID |
We understand that there are a number of
granted patents and patent applications that cover various areas
relating to internationalized domain names, including U.S. Patent
No. 6,182,148, which was issued to WALID, Inc. on January 30,
2001, a related PCT application by WALID, and at least one pending
patent application by i-DNS.Net. We consider that intellectual
property rights need not impede implementation of IDNs, and may
even encourage a more rapid adoption of a single and optimal
technical standard.
Regarding WALID's patent and PCT application,
we have supplied the following IPR Statement to the IETF on November
3rd, 2000. We understand that this statement is in accordance
with many such statements that have been filed with the IETF
by numerous companies in the past:
Pursuant to the requirements of RFC 2026,
Section 10 ("INTELLECTUAL PROPERTY RIGHTS"), WALID,
Inc. ("WALID") gives notification to the IETF Secretariat
that one or more patent applications relating to a METHOD AND
SYSTEM FOR INTERNATIONALIZING DOMAIN NAMES have been filed. Should
the implementation and practice of any part of an IETF standard
relating to the above subject matter require the use of technology
disclosed in any granted WALID patent, WALID is prepared to make
available, upon written request, a non-exclusive license under
reasonable and non-discriminatory terms and conditions, based
on the principle of reciprocity, consistent with established
practice.
For any questions regarding WALID intellectual
property and license, please contact:
J. Douglas Hawkins
WALID, Inc.
State Technology Park
2245 S. State St.
Ann Arbor, MI 48104
|
| Verisign |
Several companies
have patents surrounding the IDN space. WALID, Inc. has notified
the IETF of a patent that may cover the work of the IDN Working
Group. The working group is currently taking this patent into
account as it decides whether or not to proceed with the IDNA
solution. |
| Neteka |
Neteka understands that there are at least
the following three patented approaches:
Neteka - Parts of Neteka's multilingual
technologies are patent pending and are submitted as Internet
drafts to the IETF and archived both at the IETF site as well
as at http://www.DNSII.org. Neteka's technology however is available
as open source and is freely available at http://www.OpenIDN.org.
This ensures that even if Neteka's technology is used, the Internet
community is guaranteed to have a freely available source of
the technology for their utilization.
Walid - In essence, Walid's technology
is a client-side or pre-DNS-server ASCII conversion approach.
Neteka's understanding is that the patent surrounds a technology
that intercepts multilingual requests sent from the client and
performs a conversion of the multilingual characters into an
alphanumeric form acceptable to the existing DNS, reformulating
the request to carry this alphanumeric string before sending it
to existing DNS servers for domain resolution. Servers therefore
do not need to be upgraded as requests remain in ASCII format.
iDNS - To Neteka's knowledge, iDNS
utilizes a proxy solution that performs similar interception
of multilingual domain names to that prescribed by Walid. However,
the conversion and detection are done in a proxy server beside
the domain name system. All requests must first go through this
proxy before going through a DNS resolution process.
|
| Register.com |
Several companies
claim intellectual property rights over various portions of the
IDN solution space. These claims could affect the implementation
of IDN if groups such as the IETF make decisions regarding whether
or not to use a technology based on its IPR encumbrances, or
if the holder of intellectual property rights regarding a particular
solution seeks to prevent others from using the technology. |
| JPNIC |
JPNIC doesn't
have any patent to IDN related technologies. |
| TWNIC |
ACE, covered by
Walid's patent, is an obvious example. It will have an effect on
the implementation of IDNs, but TWNIC does not use an ACE solution
at the current stage. |
12. Are you
participating (or have you participated in) the IETF standards
process for IDN?
| WALID |
WALID has been
an active participant in the IETF IDN Working Group, and has
submitted Internet-Drafts supporting the Working Group's efforts.
However, in conformity with RFC 2026 Section 10, WALID has not
proposed any of its proprietary technology to the IETF for inclusion
in a standard, and WALID participants in the IETF were vigilant
to avoid making any contribution related to our patent application
to the IDN Working Group before filing our IPR Statement on November
3, 2000. |
| Verisign |
Verisign GRS
is an active participant in the IETF standards process, including
the IDN working group. |
| Neteka |
Yes, Neteka is
actively participating in the IETF IDN Working Group and has submitted
three Internet drafts as proposed solutions for multilingual
domain names. These are also archived at the DNSII site. |
| Register.com |
We are participants
within the IETF standards process for IDN. |
| JPNIC |
Yes, we are.
We have been participating in the IETF IDN WG from its very
beginning. |
| TWNIC |
Yes, we have attended
IETF IDN WG meetings several times, and there is an IETF IDN WG
status update at every JET meeting. |
13. Once IETF
adopts an IDN standard, how quickly will it be incorporated into
applications such as browsers? Are any problems with this incorporation
anticipated? What can the IETF and ICANN do to facilitate the
incorporation process?
| WALID |
Should an ACE-based approach to IDNs be
chosen by the IETF and accepted by the Internet community, we
would expect that major application suites could be upgraded
within a few months of the adoption of the standard. In order
to ensure rapid adoption, ICANN could move swiftly to endorse
and support the standard with a policy focused on encouraging
consensus and interoperability in this area. In the short-term,
end-users are going to demand enabling software to resolve IDNs
immediately. ICANN can reduce the potential for fragmentation
during the period before the final standard is issued by encouraging
the distribution and adoption of these enabling technologies.
If a non-ACE-based solution were to be
chosen, we would expect to see a much slower deployment and adoption
cycle. Many experts within the IETF believe that an infrastructure-based
solution could take as long as eight to ten years to fully deploy,
and we would expect to see a significant amount of fragmentation
and non-interoperability in the area of IDNs as a result.
|
| Verisign |
Only application
developers can answer the first two parts of this question. The
IETF can facilitate the process by developing an IDN standard
in a timely manner. ICANN can facilitate the process by supporting
the IETF's efforts and the eventual standard. |
| Neteka |
The speed of adoption will be dependent
on the solution chosen and intellectual property rights (IPR)
issues surrounding it. Existing browsers have already implemented
some measures for multilingual domain names, albeit often faulty
and problematic; it is therefore likely that a patent-protected
approach might not be embraced by the browser community.
Furthermore, Neteka believes that regardless
of the standard adopted, there needs to be a transition period
and registries will have to embrace a solution for them to be
able to immediately deploy multilingual domains that can be used
by most of the people on the Internet. This would very likely
mean a hybrid solution more or less like that described in section
A:1.
|
| Register.com |
The speed at
which IDN is adopted into applications may depend on the particular
IDN solution that is adopted by the IETF. Some approaches are
easier than others to implement at the application layer, and
as a result would likely see faster uptake by application developers. |
| JPNIC |
Deployment of
IDN-aware applications depends heavily on two things: an IDN toolkit
and a definition of IDN handling in application protocols. Once a toolkit is
available, applications such as telnet or ftp that handle hostnames
will be easy to develop. But applications such as browsers or
mailers that handle domain names within an application protocol will not.
The IETF or another organization such as the W3C should define how IDNs
are treated in application protocols. ICANN should elaborate criteria
for whether each accredited registry properly adopts IDN technology.
ICANN should also support funding for fundamental software such as
BIND. |
| TWNIC |
(1) Perhaps
within one or two years.
(2) Once the IETF finalizes the IDN standard, vendors will adopt it
as soon as possible. |
14. Will the
IETF standard be interoperable with other IDN standards? What
can be done to eliminate interoperability problems (assuming
not all ccTLDs adopt the IETF standard)?
| WALID |
Given the diverse
range of approaches currently deployed to support IDNs, it is
impossible for the IETF to issue a standard that provides for
complete interoperability with all existing deployments, nor
is such an expectation reasonable. Adoption of any technical
standard is of course voluntary, and we would expect user and
market demands to promote standardization and uniformity in this
area. To ensure interoperability during the transition period,
WALID is adding support to our WorldConnect system enabler to
enable end-users to continue to resolve IDNs that may have been
registered using different standards. With a client-based approach
such as WALID's, it is possible to support de-facto or national
standards in addition to the final standard the IETF recommends. |
| Verisign |
There are no
IDN standards at this time with which an eventual IETF standard
could interoperate. There are various IDN experiments, none of
which can be expected to interoperate with an IDN standard. We
believe compliance with an IETF IDN standard should be a requirement
for all ccTLD and gTLD operators now offering IDNs. |
| Neteka |
Regardless of
the solution embraced by the IETF, Neteka's hybrid solution should
be able to make sure that interoperability would not be a concern.
It is already interoperable with some of the ccTLDs' solutions
as well as the IDNA solution currently contemplated by the IETF.
Should a protocol extension approach be adopted, Neteka's solution
is also prepared for it and could consolidate different approaches.
In short, there are few interoperability concerns so
long as alternative namespaces and unnecessary name checks are
not created to complicate this problem. |
| Register.com |
Due to the wide
variety of IDN approaches, it is likely that the IETF standard
will not be interoperable with various other IDN approaches.
For this reason, it is extremely important that all interested
parties be active participants within the IETF process and that
registries and registrars do not make irrevocable technology
decisions prior to the adoption of a formal standard. |
| JPNIC |
There is no IDN
standard yet. The IETF will standardize only one, so the interoperability
of concern is between IDNs and current domain names. |
| TWNIC |
I think all the
ccTLDs will follow the IETF standard. It would be better to have a dialogue with
IDN users while the IETF IDN WG defines the standard. Encouraging
IDN users to participate in the IETF IDN WG would help
push the formation of IDN standards forward. |
15. Are there
other end user needs concerning IDN that need to be addressed?
| WALID |
One question
that has not been discussed sufficiently concerning IDNs is the
use of IDNs in document contexts, such as URLs embedded in HTML
or XML documents. End users are going to expect to be able to
generate URLs containing domain names in native characters, so
the IDNA approach (in its current form) needs to address these
issues before it can be considered complete. |
| Verisign |
This survey appears
to address key user needs. |
| Neteka |
Neteka believes that it is very important
for multilingual domain names to be immediately usable by most
client systems on the Internet today without requiring client
side modifications or plug-ins. This is a very strong demand
from all of Neteka's clients and represents the major concern
for multilingual domain name registrants and users. The average
user is usually not technically sophisticated enough to understand
the complexities of multilingual domain names and will be frustrated
and confused if multilingual names do not work as expected and
the same as English names.
Beyond providing multilingual characters,
symbols and punctuation are also very important as components
of language. The introduction of multilingual characters opens
up the opportunity to introduce some symbols as well, and they
should not be excluded.
|
| JPNIC |
The left-hand side
of e-mail addresses, the path part of URLs, electronic signatures,
and so on. RFC 2825 addresses this clearly. The domain name is a fundamental
component of communication on the Internet. End users require
not only that an IDN resolve as a hostname, but also that it
identify a particular entity on the Internet. |
| TWNIC |
Backward compatibility
and general Internet application adoption. |
16. Are there
any other technical issues we should know about?
| Verisign |
This survey appears
to cover the major technical issues. |
| Neteka |
No matter how multilingual names are deployed,
a set of problematic glitches would arise as the transition takes
place and as users learn to understand more about these issues.
The main reason is that the average user will not immediately
understand why they might not be able to access multilingual
domain names using their existing system. The causes could range from
client-side software settings to ISP settings or even
handling at the authoritative hosting end. More technically comprehensive
documentation on these known issues can be found at http://www.OpenIDN.org.
Browser & DNS Client Application Issues
- some browsers simply block all entry of domain names, others
try to implement some form of transformation of the name causing
loss of character information, which is sometimes irrecoverable.
There are four main types of behaviors among the browsers and
client side applications when encountering a multilingual domain
name:
- Send as is without interfering - while
it is positive that the request is being sent without faulty
alterations, because character encoding information is not provided,
it is very difficult to determine precisely the intended domain
name;
- Attempt to convert to UTF-8 - most implementations
to date are problematic due to complex application (browser)
and operating system kernel intricacies. On some occasions,
double conversion occurs (UTF-8 on UTF-8 bytes); other implementations drop
ending bytes; still others perform unnecessary case folding, causing
character information loss that may be irrecoverable;
- Attempt to convert to some form of ASCII
string - similar to the UTF-8 issues, these implementations sometimes
create inconsistent results. Notable is the different behavior
of the application depending on whether it goes through a proxy or not;
- Refuse to send the request.
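Two of the behaviors listed above can be reproduced in a few lines. The sketch below is illustrative only and is not taken from Neteka's documentation: the function names are hypothetical, Latin-1 is assumed as the mistaken intermediate interpretation in the double-conversion case, and Big5 and GB2312 are used as example local encodings.

```python
def double_utf8(name: str) -> bytes:
    """Reproduce the 'UTF-8 on UTF-8 bytes' fault."""
    once = name.encode("utf-8")
    # Fault: the UTF-8 octets are mistaken for one-octet characters
    # (Latin-1 assumed here) and converted to UTF-8 a second time.
    return once.decode("latin-1").encode("utf-8")


def decode_candidates(raw: bytes, encodings=("big5", "gb2312")) -> dict:
    """Show how untagged octets read differently under each local encoding."""
    results = {}
    for enc in encodings:
        try:
            results[enc] = raw.decode(enc)
        except UnicodeDecodeError:
            results[enc] = None   # not valid in this encoding
    return results


correct = "é".encode("utf-8")
broken = double_utf8("é")
print(len(correct), "octets when encoded once,", len(broken), "when encoded twice")

# The same two octets, sent "as is" with no encoding information:
print(decode_candidates(b"\xa4\xa4"))
```

The second function shows why a request sent "as is" is hard to interpret: without a tag naming the encoding, a server receiving these octets cannot tell which of the candidate names the user intended.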
DNS Resolver Issues - in general, the DNS
resolver resides at the ISP level. There are three areas of trouble
for multilingual domain names at this level: 1) the ability to
match multilingual requests with cached records; 2) the ability
to refer the request accurately to its nearest match (TLD/root)
server; and 3) the ability to cache the results of multilingual
requests. It is very important that these "messengers"
in the DNS do not choke on multilingual requests. Because the
original DNS protocol itself is 8-bit capable, this middleman
level usually just passes requests along the DNS path; however,
proxy and caching issues could complicate matters (Section 4).
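The first of these three areas, matching requests against cached records, can be illustrated with a small Python sketch. It assumes a hypothetical cache keyed on the raw bytes of the query name; the label, the two encodings, and the address (taken from the 192.0.2.0/24 documentation range) are chosen only for illustration.

```python
# Hypothetical resolver cache keyed on the raw bytes received on the wire.
label = "中文"

utf8_query = label.encode("utf-8")  # client that sends UTF-8
big5_query = label.encode("big5")   # client that sends a local encoding

cache = {utf8_query: "192.0.2.1"}   # record cached from the UTF-8 request

# The same name in another encoding is a different byte string, so it misses.
print(cache.get(utf8_query))
print(cache.get(big5_query))


def cache_key(raw: bytes, encoding: str) -> str:
    """Normalize a query to text; only possible if the encoding is known."""
    return raw.decode(encoding)


# With the encoding known, both requests map to the same key.
print(cache_key(utf8_query, "utf-8") == cache_key(big5_query, "big5"))
```

The sketch also shows why unmarked requests are hard to handle: the normalization step works only when the encoding accompanies the request.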
Authoritative DNS Databases - authoritative
DNS databases range from root servers and top-level domain (TLD) registry
servers to individual domain hosts. While they are critical to
the functioning of the Internet, especially the root servers
and TLD servers, their tolerance of multilingual requests is
higher because they seldom perform caching and will implement
multilingual names only when they have prepared for it. Multilingual
requests to root servers will either be authoritatively dropped
because the particular TLD does not exist or will be referred
to existing ASCII TLDs.
Beyond the direct implications of multilingual
domain names on the registration system and the domain resolution
system, a handful of other peripheral issues arise as multilingual
names are being introduced to the Internet:
Proxy Servers & Cache Servers - first
and foremost, proxy servers and cache servers will be affected
because they depend on URLs and domain names to function properly.
They also contribute to the blocking of multilingual names and
thus present a huge barrier for multilingual names to be transparently
deployed. A multilingual-aware, patched version of Squid is currently
available from Neteka.
Web Servers & Digital Signatures -
web servers are next in line to require some work in order
to perform accurate virtual hosting,
as this also depends on domain names. Digital signatures and
certificates are also an area of concern, as they also use domain
names as a key identifier. As DNS security is being deployed,
this becomes even more important. A multilingual-aware web server
based on Apache is also available from Neteka.
Other Applications & Databases - besides
the immediate critical transportation nodes, other applications
such as databases that hold domain names and email addresses
will have to be taken into consideration. These include customer
databases, mailing lists and other directory, search and
storage applications. Neteka's API solution offering a quick fix
for these applications is the NeMate library, which uses an
ASCII transformation engine to force multilingual names into
unique ASCII identifiers without losing character information.
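What an ASCII transformation of this kind involves can be sketched with Python's built-in punycode codec. This is a stand-in chosen for illustration: it is not NeMate, and not necessarily the encoding that Neteka's engine uses. The sample name is likewise an assumption.

```python
# Stand-in for an ASCII-compatible transformation; not the NeMate library.
name = "bücher"

# Forward transformation: the result contains only ASCII bytes.
ascii_form = name.encode("punycode")

# Reverse transformation: the original characters are recovered exactly.
restored = ascii_form.decode("punycode")

print(ascii_form)
print(all(byte < 128 for byte in ascii_form))
print(restored == name)
```

Because the transformation is reversible, an application or database that accepts only ASCII can store the transformed form and still recover the original name.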
|
| JPNIC |
IANA should define
ACE prefix (ACE identifier) as soon as possible. JPNIC proposed
a determination process in draft-ietf-idn-aceid-01.txt. |
| TWNIC |
1. Consider modifying BIND and Internet applications
to support a clean 8-bit (native encoding) and UTF-8 encoding
environment, in order to accept IDNs.
2. The technology for converting between Traditional
and Simplified Chinese encodings.
|
| Klensin |
The majority of the issues raised here
are either protocol-design (or interpretation) or market behavior
and analysis ones. They are important issues. But, they, especially
the protocol ones, are not going to be settled properly by counting
heads or otherwise determining a majority opinion from the community.
More generally, I believe that this issue
is, with the exception of one area that has, IMO, been persistently
dodged, out of ICANN's scope:
- Design, evaluation, and approval of protocols
fall into a different space. Nothing gives ICANN any authority
or responsibility in this area until the point at which parameter
assignment is involved, and ICANN has little discretion about
most parameter assignment issues.
- The IETF process in this area will take
as long as it takes to get things right. There is enough pressure
on the area that I do not believe it is likely to take one week
longer than that. But pressure from various interests, including
ICANN, is unlikely to produce quicker results of high quality
and may impede the final schedule. For example, I had intended
to spend this morning wrapping up the next draft of a set of
documents that lay the foundation for a "search environment"
system clearly enough that we might start thinking about working
groups and area allocations. Instead, I'm attempting to respond
to your "survey" note.
- As most of you know, there has been a
gradual shift in the technical community --driven by increased
understanding of user needs, requirements, and expectations--
away from the belief that a DNS-only solution will be adequate.
The revised opinion is that additional mechanisms, which support
"search"-type operations rather than only the DNS's
exact-match lookups, will surely be needed and that "the
IDN problem" will not be solved or protocol work completed
until they are. I make no prediction as to whether IETF will
agree on a partial/temporary/interim in-DNS approach while
those other scenarios work themselves out.
- Any common/standardized approach, whether
layered on the DNS or part of it, that moves outside the traditional
DNS, hostname, and Class=IN rules, is going to raise important
strategic issues for ICANN and the community. There are no approaches
of this type that I consider plausible (e.g., not fragmenting
the Internet) that do not have at least some aspects of a
"unique root" situation or other way to ensure uniqueness
of names. But any of them --whether a new class, a directory-like
structure, or something else-- imply, technically, the opportunity
to go back and revisit the governance and authority questions
and to do so without any significant claims of US Government
ownership, authority, or oversight responsibility. I would personally prefer
to see ICANN take on the necessary roles, if only because I don't
want to revisit the battles and traumas of the last four or five
years. But I think your ability to gain acceptance in that role
will be significantly enhanced if you demonstrate to the community
that you are able to resist efforts to drive you toward expansion
of your role beyond your natural charter. And IDN surveys and
evaluation at this point are expansionist.
- The one area where I believe you clearly
do have scope --by virtue of inheritance of IANA's role under
RFC 1591-- is to protect the Internet against abuses of the DNS
that create the risk of damage to existing, conforming and deployed
software, or of ambiguous or non-unique naming. The risks in
those areas of ill-defined testbeds, "just send 8"
strategies, encouragement of multilingual cybersquatting, etc.,
are considerable and have been identified repeatedly to ICANN.
The solution is to start warning the relevant domains of the
impact, with the potential of starting a redelegation process
--clearly contemplated by 1591-- if they continue to encourage
these efforts. If, as I suspect is the case, ICANN is effectively
powerless to do this, then admit that and get out of this area
until the various issues sort themselves out in the marketplace.
|
| Probst |
1. Naming
I wonder why you call this "Internationalized
Domains". Is a domain in an American Indian script an
"international domain", or rather a "multilingual
domain" (MLD)?
2. Verisign's "Testbed"
Verisign started its "testbed"
amid mixed appreciation of its usefulness. ISOC discouraged it,
but Verisign went ahead, and by indicating that they would later transfer
testbed registrations to the live gTLD zones without additional
charge, they put registrars in a difficult situation:
comply with ISOC's request and hold off on MLD registrations,
or accept MLD registrations in order not to lose customers.
Registrants, on the other hand, much
as they might have wanted to honour ISOC's request, had to register
their rightful names in the testbed to be sure not
to lose out once MLDs are accepted in the live gTLDs, i.e. once
existing testbed registrations are transferred to the live
zones.
To everybody it must look as though Verisign
is dictating the conditions, not ICANN.
3. Verisign and NSI
Verisign had published a timetable stating when
they would accept registrations for which script. UNICODE was,
after a while, scheduled for the 5th of April. At that time, the Network
Solutions webpage for testbed registrations was well out of date.
I think it said UNICODE registrations would be available by early
March, i.e. the page was written in early February and had not been
updated by the 5th of April. The wording of the page
did not suggest that NSI was waiting for Verisign, but rather that NSI
itself would be ready with its setup by the given dates. Without
further explanation, Verisign then delayed UNICODE to the 19th
of April. On that date the NSI page changed and they accepted
registrations.
One cannot help but wonder whether Verisign
delayed the process because NSI was not yet ready, and starting
earlier would have meant much lost revenue for NSI (other registrars
were already ready).
I am aware that this is a vague suspicion,
but if it were true, who could prove it?
4. Register.com
Register.com was one of the few registrars
that accepted "pre-registrations" for UNICODE domains
even before the 5th of April, claiming that they would try to
register them as soon as possible. On the 19th, and even until
the 23rd of April, none of Register.com's pre-registered
domains showed up in whois, and it was even possible to register
them (again) with NSI. Some days later, Register.com informed
registrants that their domains had been accepted and charged
for them. However, to this day those domains show up in whois
only as "registered by Register.com" and do not show
the registrant (whereas the NSI registrations do). This leaves
registrants no chance either to check the "first come, first
served" principle or to fight cybersquatting at an early
stage.
5. Client Applications
As far as I know, none of the current client
applets (to do the foreign script to *ACE conversion) supports
UNICODE. Customers in "UNICODE countries" therefore
cannot participate in Verisign's "Phase 3.2" (current)
and "Phase 3.3" (which should start soon). A "testbed"
where the testing cannot be done is rather useless.
6. Blocking of MLDs
I didn't find any stated policy on what would
happen to domains that are registered directly in their *ACE
form before "official" registrations (or the transfer
of testbed domains into the live zones) occur.
7. Ease of use of Whois.
Checking whois info on MLDs (in the testbed)
is right now a cumbersome multistep procedure: transform the
MLD via an online tool into its RACE version, then copy
and paste this RACE version into a whois form on some other website.
There need to be tools to make this easier
for non-techies.
8. Open Source
I strongly suggest adopting only technology for which,
and "going live" only when, the required tools (like the
applets mentioned below) are available under an Open Source license, so
that they can be easily adapted to local languages and to different
computing platforms.
9. MLDs and "alternative TLDs".
During the introduction of the MLDs, every
Internet user who wishes to access those MLD domains has to install
a small applet to do the conversion to a DNS-compatible *ACE
string. This will make it very easy for companies like New.net
to offer those applets with "double functionality":
new MLDs and at the same time a new root (e.g. the New.net root).
Looking at the latest published numbers from New.net, it seems
to me that ICANN is well on the way to losing the battle.
If it obviously cannot win on its own (anymore),
then it might make the most sense to look for allies, and the
group around the ORSC/Superroot seems to me the best option.
By peering the ICANN root with their root, lots of new TLDs would
immediately be available to everybody on the Net (without
the need for plug-ins), and New.net with their conflicting TLDs
would have to fight against lots of TLD holders. The ORSC looks
very reasonable, obviously has the most "historical legitimacy",
and seems to be willing to co-operate with ICANN.
|
Page Updated 29-May-2001
(c) 2001 The Internet
Corporation for Assigned Names and Numbers.
All rights reserved.
|