I believe that multilingual or internationalized domain names (IDNs) are key to a truly global Internet. That is why I was an active member of the Arabic script IDN community before taking up my current role at ICANN.
In the past, the rules for creating new top-level domain labels such as .sg, .info and .org were simple: a label had to (i) consist only of letters (a-z) of the English alphabet, and (ii) be two to 63 characters long.
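To make those two historical rules concrete, here is a minimal sketch in Python. The function name is my own invention for illustration, and the check encodes only the two rules stated above; it is not ICANN's actual validation logic.

```python
import re

# Assumption: this pattern captures only the two rules described in the text
# (English letters only, length 2 to 63). Upper case is accepted because
# DNS labels are compared case-insensitively.
_LEGACY_TLD = re.compile(r"[A-Za-z]{2,63}")


def is_legacy_tld_label(label: str) -> bool:
    """Return True if the label satisfies both historical rules (hypothetical helper)."""
    return _LEGACY_TLD.fullmatch(label) is not None


if __name__ == "__main__":
    for candidate in ["sg", "info", "a", "рф"]:
        print(candidate, is_legacy_tld_label(candidate))
```

Under these rules a label written in any script other than basic Latin letters is rejected, which is exactly the limitation that IDNs are meant to remove.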
All of that is changing now. The future holds a multilingual Internet, where a user anywhere in the world can use domain names and navigate entirely in his or her native language, increasing participation and promoting diversity.
And you can play a part by getting involved for your language and script. ICANN is calling for volunteers to serve on one of several panels that will define the rules for generating new top-level domain labels in their community's script or writing system.
The goal of these panels is to support the use of Internationalized Domain Names (IDNs) by determining what is a valid top-level domain label in each script or writing system. This involves answering three questions:
• Which subset of characters from the various scripts can be used to form a label?
• Which of these characters (if any) may be considered confusable or variants by end users?
• What additional constraints apply to these labels?
This work will define variants of existing IDN top-level domain labels and the validity of any future top-level domain label for the Root Zone.
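One way to picture the answers to the three questions above is as a small data structure: a set of permitted characters, a map of variants, and a list of extra checks. The sketch below is only an illustration. The class name, its fields and the toy data are all hypothetical and are not taken from any actual panel proposal or from the LGR format.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import Callable, Dict, List, Set


@dataclass
class ScriptRules:
    """Hypothetical container for one script's label generation rules."""

    # Question 1: which characters may appear in a label.
    repertoire: Set[str]
    # Question 2: which characters are treated as variants of one another.
    variants: Dict[str, Set[str]] = field(default_factory=dict)
    # Question 3: additional whole-label constraints, each a yes/no check.
    constraints: List[Callable[[str], bool]] = field(default_factory=list)

    def is_valid(self, label: str) -> bool:
        """A label is valid if it is non-empty, uses only permitted
        characters, and passes every additional constraint."""
        if not label:
            return False
        if any(ch not in self.repertoire for ch in label):
            return False
        return all(check(label) for check in self.constraints)

    def variant_labels(self, label: str) -> Set[str]:
        """All other labels reachable by swapping characters for their variants."""
        choices = [{ch} | self.variants.get(ch, set()) for ch in label]
        return {"".join(combo) for combo in product(*choices)} - {label}


# Toy data for illustration only: treat "c" and "k" as variants of each other.
toy = ScriptRules(
    repertoire={"a", "b", "c", "k"},
    variants={"c": {"k"}, "k": {"c"}},
    constraints=[lambda label: len(label) <= 63],
)

if __name__ == "__main__":
    print(toy.is_valid("cab"))
    print(toy.variant_labels("cab"))
```

In this toy model, a label and its variant labels would be considered "the same" from the end user's point of view, which is why the panels need to identify them before new labels are delegated.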
Label Generation Rule-set
As there is a single Root Zone, all such label generation rules for all the scripts must be merged into a single reference, which is called the Label Generation Rule-set (LGR). The ICANN community has established a procedure to develop the LGR for the Root Zone. This procedure is divided into three steps:
- Step 1: The basis is a subset of Unicode code points that may be appropriate for the Root Zone, called the Maximal Starting Repertoire (MSR).
- Step 2: Communities representing various scripts (e.g., Arabic, Cyrillic, Devanagari, Greek, Chinese, Latin, Thai, etc.) are invited to organize into Generation Panels, which start from the MSR and propose the label generation rules (answering the three questions listed above) for their respective scripts.
- Step 3: A panel of experts called the Integration Panel reviews the script-based proposals developed by the communities. Proposals that meet the criteria in the procedure are then integrated into the LGR by the Integration Panel. The LGR is built up incrementally until it contains all the necessary scripts.
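The three steps above can be sketched as a simple incremental merge. This is a deliberately simplified model: the function name is hypothetical, the code points are toy values rather than the real MSR, and the only criterion modeled is my assumption that a proposal must stay within the MSR. The actual review criteria are defined in the procedure and are not reproduced here.

```python
from typing import Dict, Set


def integrate(msr: Set[int], lgr: Dict[str, Set[int]],
              script: str, proposal: Set[int]) -> bool:
    """Add a script's proposed repertoire to the LGR (hypothetical helper).

    Assumption: a proposal is accepted only if it is non-empty and every
    code point in it already belongs to the MSR.
    """
    if not proposal or not proposal <= msr:
        return False
    lgr[script] = set(proposal)
    return True


# Step 1: a toy starting repertoire (three real Unicode code points,
# chosen only for illustration).
toy_msr = {0x0627, 0x0628, 0x03B1}  # Arabic alef, Arabic beh, Greek alpha

# Steps 2 and 3: panels submit proposals, and the LGR grows one script at a time.
toy_lgr: Dict[str, Set[int]] = {}
accepted = integrate(toy_msr, toy_lgr, "Arabic", {0x0627, 0x0628})
rejected = integrate(toy_msr, toy_lgr, "Greek", {0x03B1, 0x03B2})  # 0x03B2 is outside the toy MSR

if __name__ == "__main__":
    print(accepted, rejected, sorted(toy_lgr))
```

The point of the sketch is the shape of the process: one shared starting set, independent per-script proposals, and a single reference that is extended only when a proposal passes review.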
For Step 1, ICANN recently released the Maximal Starting Repertoire (MSR-1), covering 22 scripts (Arabic, Bengali, Cyrillic, Devanagari, Georgian, Greek, Gujarati, Gurmukhi, Han, Hangul, Hebrew, Hiragana, Kannada, Katakana, Lao, Latin, Malayalam, Oriya, Sinhala, Tamil, Telugu and Thai) and containing 32,790 code points short-listed from 97,973 allowable code points from Unicode version 6.3. Work on additional scripts will be completed soon.
Step 2 is now underway. ICANN needs your help in developing proposals to extend the Root Zone LGR to cover each of these scripts. There is a role for everyone: general script community representation as well as volunteers with knowledge of scripts, linguistics, Unicode, IDNA/DNS or policy. Generation Panels are already formed or forming for Arabic, Chinese, Japanese, Korean and Neo-Brahmi scripts.
Volunteering for existing or new script-based Generation Panels is easy. Join a Generation Panel today by emailing firstname.lastname@example.org. Make sure to tell us the language you speak and the script or writing system you want to get involved with. Please help to inform and motivate others to join as well. Help ICANN support top-level domain names in your language!
I will end by sharing a video, which I contributed to along with many others, about enabling a multilingual Internet:
Sarmad Hussain is IDN Program Senior Manager at ICANN.
The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC.