Public Comment
Reference Label Generation Rulesets (LGRs) for the Second Level
Open Date
24 August 2020 23:59 UTC
Close Date
15 October 2020 23:59 UTC
Staff Report Due
5 November 2020 23:59 UTC
Brief Overview
Purpose: To improve the transparency and consistency of the Internationalized Domain Name (IDN) table review process to facilitate the registry operations of new gTLDs, ICANN has developed additional reference IDN tables in machine-readable format, called reference Label Generation Rulesets (LGRs) for the second level. The reference IDN tables are based on the Guidelines for Developing Reference Label Generation Rules (LGRs), which were finalized after community review. These reference LGRs will be used in reviewing IDN tables submitted by the gTLD registries, e.g. through the Registry Service Evaluation Policy (RSEP) process.
Current Status: ICANN org has previously published reference second-level LGRs for multiple languages. Additional reference LGRs have been developed based on the detailed analysis and finalized solutions by the script communities for the Root Zone Label Generation Rules (RZ-LGRs). Seventeen (17) LGRs are being released for Public Comment: Bangla, Devanagari, Ethiopic, Georgian, Gujarati, Gurmukhi, Kannada, Khmer, Lao, Malayalam, Oriya, Tamil, and Telugu script-based LGRs, and Arabic, Chinese, Hindi, and Thai language-based LGRs.
Next Steps: Based on the community input, these reference LGRs will be finalized and published for gTLD registry operators to consult while they design their IDN tables. These reference LGRs will also be used in reviewing the IDN tables submitted by the gTLD registries.
Section I: Description and Explanation
The reference LGRs are developed in the context of either a language or a script. The script-based reference LGRs are developed based on the detailed analysis and finalized solutions by the community in the Root Zone Label Generation Rules (RZ-LGRs). Each language-based LGR is likewise developed from the solution available for its script in the RZ-LGRs. These include seventeen (17) LGRs: Bangla, Devanagari, Ethiopic, Georgian, Gujarati, Gurmukhi, Kannada, Khmer, Lao, Malayalam, Oriya, Tamil, and Telugu script-based LGRs, and Arabic, Chinese, Hindi, and Thai language-based LGRs. Additional languages and scripts will be added later, as available and needed. The relevant script communities were consulted in finalizing these reference LGRs.
The gTLD registry operators may consult these reference LGRs while they design their IDN tables to promote consistency. A registry would choose the set of code points, associated variant code points, and rules that best serves its end users. An IDN table can deviate from the reference LGR, because registries may wish to remain competitive by offering innovative solutions that address various end user needs. These reference LGRs will also be used in reviewing the IDN tables submitted by the gTLD registries, contributing to the transparency of the review process.
Section II: Background
The registries are generally encouraged to collaborate in defining common language-based or script-based IDN tables to allow for consistency for end users. There are multiple formats for developing IDN tables. The IDN tables used by each gTLD and some ccTLDs are posted at the IANA Repository for IDN Practices. During the New gTLD Program's Pre-Delegation Testing (PDT), ICANN org noted a large number of IDN table submissions. The machine-readable format defined in RFC 7940, Representing Label Generation Rulesets Using XML, allows the repertoire, the variant definitions, and the rules to be processed by machine, which would improve the consistency of the IDN table review in Pre-Delegation Testing (PDT) and the Registry Service Evaluation Policy (RSEP) process.
The process to develop these reference LGRs, as detailed in the guidelines, ensures that both linguistic and technical expert input is incorporated into the reference LGRs, which will be finalized after the Public Comment process. These reference LGRs are intended to be comprehensive enough that they do not require further additions to be useful. At the same time, they should be relatively conservative. This should enable registries either to adopt these LGRs as they are or to take them as the basis for further modifications.
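To illustrate what machine processing of an RFC 7940 document can look like, the following minimal Python sketch reads the repertoire and variant definitions from an LGR in XML. The sample document is a hypothetical three-entry table written for this illustration and is not taken from any of the reference LGRs. The helper names `parse_cp` and `load_repertoire` are likewise assumptions, and rules, tags, and conditional attributes are deliberately ignored.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by RFC 7940 for LGR documents.
LGR_NS = {"lgr": "urn:ietf:params:xml:ns:lgr-1.0"}

# Hypothetical sample for illustration only; NOT an ICANN reference LGR.
SAMPLE_LGR = """
<lgr xmlns="urn:ietf:params:xml:ns:lgr-1.0">
  <meta>
    <version>1</version>
    <language>und-Latn</language>
  </meta>
  <data>
    <char cp="0061"/>
    <char cp="00E1">
      <var cp="0061" type="blocked"/>
    </char>
    <range first-cp="0030" last-cp="0039"/>
  </data>
</lgr>
"""


def parse_cp(text):
    """Convert a space-separated hex sequence such as '0061 0301' to a tuple of ints."""
    return tuple(int(part, 16) for part in text.split())


def load_repertoire(xml_text):
    """Return (repertoire, variants) read from the <data> section of an LGR.

    repertoire: set of code point sequences, each a tuple of ints
    variants:   dict mapping a sequence to a list of (variant sequence, type)
    """
    root = ET.fromstring(xml_text)
    data = root.find("lgr:data", LGR_NS)
    repertoire = set()
    variants = {}

    # Each <char> contributes one code point (or sequence) and its variants.
    for char in data.findall("lgr:char", LGR_NS):
        cp = parse_cp(char.get("cp"))
        repertoire.add(cp)
        for var in char.findall("lgr:var", LGR_NS):
            entry = (parse_cp(var.get("cp")), var.get("type"))
            variants.setdefault(cp, []).append(entry)

    # Each <range> contributes every code point from first-cp to last-cp inclusive.
    for rng in data.findall("lgr:range", LGR_NS):
        first = int(rng.get("first-cp"), 16)
        last = int(rng.get("last-cp"), 16)
        repertoire.update((c,) for c in range(first, last + 1))

    return repertoire, variants


if __name__ == "__main__":
    repertoire, variants = load_repertoire(SAMPLE_LGR)
    print(len(repertoire), "code points in repertoire")
    for source, targets in variants.items():
        print(source, "->", targets)
```

Once two tables are loaded this way, a reviewer could compare them with ordinary set operations, for example listing the code points that a submitted table contains but a reference LGR does not. That comparison step is an assumption about how such a sketch might be used, not a description of ICANN's review tooling.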
Section III: Relevant Resources
The following reference LGRs for the second level are published for Public Comment.
- Overview and Summary
- Arabic Language Reference LGR (XML, HTML)
- Bangla Script Reference LGR (XML, HTML)
- Chinese Language Reference LGR (XML, HTML)
- Devanagari Script Reference LGR (XML, HTML)
- Ethiopic Script Reference LGR (XML, HTML)
- Georgian Script Reference LGR (XML, HTML)
- Gujarati Script Reference LGR (XML, HTML)
- Gurmukhi Script Reference LGR (XML, HTML)
- Hindi Language Reference LGR (XML, HTML)
- Kannada Script Reference LGR (XML, HTML)
- Khmer Script Reference LGR (XML, HTML)
- Lao Script Reference LGR (XML, HTML)
- Malayalam Script Reference LGR (XML, HTML)
- Oriya Script Reference LGR (XML, HTML)
- Tamil Script Reference LGR (XML, HTML)
- Telugu Script Reference LGR (XML, HTML)
- Thai Language Reference LGR (XML, HTML)
These files can be collectively downloaded with this package.
Section IV: Additional Information
- RFC 7940, Representing Label Generation Rulesets Using XML (Label Generation Rules, LGR): https://tools.ietf.org/html/rfc7940
- Guidelines for Developing Reference LGRs for the Second Level (version 27 May 2020): https://www.icann.org/en/system/files/files/lgr-guidelines-second-level-27may20-en.pdf
- Finalized Proposals for Root Zone Label Generation Ruleset (RZ-LGR) by the Generation Panels: https://www.icann.org/resources/pages/lgr-proposals-2015-12-01-en
- LGR Toolset: https://www.icann.org/resources/pages/lgr-toolset-2015-06-21-en
Comments Closed
Report of Public Comments