Supporting Linguistic Diversity of Africa for the Internet’s Top-Level Domain Names
Thirty different languages – that was the result of a quick poll asking the twenty participants attending the IDN Workshop to list the languages they speak. ICANN organized the workshop at the Africa Internet Summit in Nairobi on 28 May 2017. This response exemplifies the enormous linguistic diversity in Africa, where the use of several languages – or multilingualism – is the norm. There are at least 2,144 languages spoken across the continent, with individual countries such as Nigeria having as many as 520 languages. By way of comparison, 287 languages are spoken in Europe.
Historically, Africa is among the places where written communication was first established, with Egyptian hieroglyphs among the oldest writing systems discovered. But the majority of the African languages used today are only spoken, with no written form. Still, estimates show that more than 500 languages have a written form. Not surprisingly, the diversity of the writing systems created by Africans mirrors the diversity of the spoken languages: up to 29 scripts were created in Africa, spanning nearly all known script types, including abjads, abugidas, alphabets, syllabaries, and logo-syllabaries. Of these scripts, 21 may still be in use, and new scripts continue to be created, some defying current linguistic classifications, such as the colorful Oracle Rainbow Script created as recently as 1999. One of the more widely used scripts is Tifinagh, an ancient script in use since the 3rd century Before Common Era (BCE), which was revitalized in the 20th century and is now used in a standardized form to teach Berber languages such as Amazigh to pupils in primary schools in Morocco. For an example, see the primer in Amazigh developed by the Institut Royal de la Culture Amazighe.
Further examples include the Ethiopic script used for many languages in Ethiopia and Eritrea, the Vai syllabary used for the Vai language of Liberia, and N’ko, an alphabet used for a family of languages called Manding in West Africa. Several scripts are now historic and have fallen out of use, while others such as N’ko have viable user communities and can be represented digitally today. However, many scripts lack resources such as fonts or input methods and are not officially supported or recognized.
The most widely used scripts of Africa are foreign scripts introduced historically, namely the Arabic script (referred to as Ajami in some language communities) and the Latin script. These scripts have been extended to represent the additional sounds in local languages of Africa. Examples include the click sounds used by languages of Southern and Eastern Africa, such as lateral clicks (listen to a pronunciation), which are written with symbols not considered letters in other languages (such as the double pipe ǁ) or with very complex sequences of letters (such as gǁx’, [ᶢǁʢ] in the International Phonetic Alphabet, in Juǀʼhoansi, a language of Namibia and Botswana). The same has been done for the Arabic script, with new letters created to represent local sounds such as the prenasalized stop /mb/ or /ᵐbʷ/ (listen to a pronunciation) in Chimiini, a language of Somalia. As there is limited font support for this letter, see U+08B6, encoded in the Unicode Standard, to view its written form.
Furthermore, the use of multiple scripts by the same language community – called multiscripturalism – is very common in Africa. For example, two versions of the Alphabet National du Tchad (ANT) have been created, one based on the Latin script and the other on the Arabic script. Communities using the Sar language may write it in either script. For example, the word for lion is written as “ɓəl” in ANT Latin and ٻّلْ in ANT Arabic, as shown here.
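Because several of the letters mentioned above may not display in every font, it can help to look at their Unicode code points directly. The following is a minimal sketch in Python using only the standard `unicodedata` module. The helper name `describe` and the choice of samples are illustrative assumptions, and the Latin spelling of the Sar word is assumed to consist of U+0253, U+0259, and the letter l.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name, general category) for each character."""
    return [
        (f"U+{ord(ch):04X}",
         unicodedata.name(ch, "<name not in this Unicode version>"),
         unicodedata.category(ch))
        for ch in text
    ]

# Illustrative samples taken from the examples in the text above.
samples = {
    "lateral click (double pipe)": "\u01C1",
    "Sar word for lion, ANT Latin (assumed spelling)": "\u0253\u0259l",
    "Arabic letter cited as U+08B6": "\u08B6",
}

for label, text in samples.items():
    print(label)
    for code_point, name, category in describe(text):
        print(f"  {code_point}  {category}  {name}")
```

A general category beginning with "L" means Unicode treats the character as a letter, even where, as with the double pipe, it looks like punctuation to readers of other languages.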
ICANN is currently undertaking a program to support Internationalized Domain Names (IDNs) as top-level domains (TLDs). It is developing Label Generation Rules for the Root Zone (RZ-LGR) to support the different scripts. This work is led by community-based panels (called Generation Panels, GPs) which document the use of the script based on the procedure finalized by the community. The Arabic script GP has already finalized its work and supports the major African languages that are written in the Arabic script. More recently, the Ethiopic script GP has also finalized its proposal for integration into the RZ-LGR.
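For readers unfamiliar with how IDN labels are carried in the DNS: a label containing non-ASCII letters is stored in an ASCII-compatible form, made up of the prefix "xn--" followed by the Punycode encoding of the label. The sketch below illustrates only that encoding step, using Python's built-in `punycode` codec. The function names are hypothetical, the sample label is purely illustrative, and the code does not check whether a label is permitted under the RZ-LGR or any other IDN rules.

```python
def to_a_label(u_label: str) -> str:
    """Encode a Unicode label in ASCII-compatible form ("xn--" + Punycode).

    Illustration only: no IDNA or RZ-LGR validity checks are performed.
    """
    if u_label.isascii():
        return u_label  # plain ASCII labels are left unchanged
    return "xn--" + u_label.encode("punycode").decode("ascii")

def to_u_label(a_label: str) -> str:
    """Reverse of to_a_label for labels carrying the "xn--" prefix."""
    if a_label.lower().startswith("xn--"):
        return a_label[4:].encode("ascii").decode("punycode")
    return a_label

# Illustrative label only (assumed spelling: U+0253, U+0259, l).
label = "\u0253\u0259l"
encoded = to_a_label(label)
print(encoded)                       # ASCII-only form of the label
print(to_u_label(encoded) == label)  # round trip recovers the original
```

Whether a given label may actually be delegated as a TLD depends on the code point repertoire and rules defined for its script, which is the work the Generation Panels contribute to.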
The Latin script GP has also started its work and is investigating the use of the script in Africa, in addition to other continents. Determining how the Latin script has been extended to cater to African languages is challenging because documentation is limited. Therefore, ICANN has been reaching out to the communities in Africa to get them involved in this effort. ICANN has been holding annual IDN workshops in Africa for this purpose – Congo in 2015, Addis Ababa in 2016, and Nairobi in 2017.
While ICANN has received some expressions of interest, more volunteers are needed from Africa for the Latin GP to advance this important work. Please email IDNProgram@icann.org if you are interested in participating or have any queries.
The RZ-LGR project currently includes Arabic, Ethiopic, and Latin scripts in the context of Africa. ICANN will support other scripts in Africa for IDN TLDs, if they are actively being used by the relevant communities, and if the communities can gather sufficient interest to form GPs and develop proposals for the RZ-LGR.
Please visit www.icann.org/idn for more details about the IDN Program at ICANN.