Reference LGR for script: Lao (Laoo) | lgr-second-level-lao-script-24aug20-en |
---|
This document is mechanically formatted from the above XML file for the LGR. It provides additional summary data and explanatory text. The XML file remains the sole normative specification of the LGR.
Date | 2020-08-24 |
---|---|
LGR Version | 3 |
Language | und-Laoo |
Unicode Version | 6.3.0 |
This document specifies a reference set of Label Generation Rules (LGR) for the Lao script for the second level. The starting point for the development of this LGR can be found in the related Root Zone LGR [RZ-LGR-3-Laoo]. For details and additional background on the script, see "Proposal for a Lao Script Root Zone LGR [Proposal-Lao]". The format of this file follows [RFC 7940].
This is a DRAFT document released for public comments and not final. Please see the announcement on the ICANN website for public comments on the Second Level Reference LGRs for details on how to submit comments.
The repertoire contains 51 code points for letters; in addition, the sequence 0EB2 0EB0 has been defined to facilitate implementation of WLE rule follows-vbefore-consonant-cluster as a context rule. The repertoire only includes code points used by languages that are actively written in the Lao script. The repertoire is a subset of [Unicode 6.3]. For details, see Section 5 “Repertoire” in [Proposal-Lao]. (The proposal cited has been adopted for the Lao script portion of the Root Zone LGR.)
For the second level, the repertoire has been augmented with the ASCII digits, U+0030 (0) to U+0039 (9), Lao digits, U+0ED0 (໐) to U+0ED9 (໙), and U+002D (-) HYPHEN-MINUS for a total of 73 repertoire elements.
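The element count above can be checked mechanically. The following is a minimal sketch that rebuilds the repertoire from the code points listed in the repertoire table of this document; the helper `expand` and the variable names are illustrative and not part of the LGR.

```python
# Illustrative sketch only: helper and variable names are hypothetical.
# The code point data is copied from the repertoire table in this document.

def expand(spec):
    """Expand a spec such as '0E81-0E82 0E84' into a list of code points."""
    out = []
    for part in spec.split():
        lo, _, hi = part.partition("-")
        out.extend(range(int(lo, 16), int(hi or lo, 16) + 1))
    return out

# Lao letters, vowel signs, tone marks and signs (no digits).
LAO_LETTERS = expand(
    "0E81-0E82 0E84 0E87-0E88 0E8A 0E8D 0E94-0E97 0E99-0E9F 0EA1-0EA3 "
    "0EA5 0EA7 0EAA-0EAB 0EAD-0EAE 0EB0-0EB2 0EB4-0EB9 0EBB-0EBD "
    "0EC0-0EC4 0EC6 0EC8-0ECD"
)
LAO_DIGITS = expand("0ED0-0ED9")
ASCII_DIGITS = expand("0030-0039")
HYPHEN = expand("002D")
SEQUENCES = [(0x0EB2, 0x0EB0)]  # the single code point sequence

code_points = LAO_LETTERS + LAO_DIGITS + ASCII_DIGITS + HYPHEN
total_elements = len(code_points) + len(SEQUENCES)

print(len(LAO_LETTERS), len(code_points), total_elements)
```

Under these assumptions the three printed values correspond to the number of letter code points, the number of single code points, and the total number of repertoire elements.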
Each code point or range is tagged with the script or scripts that the code point is used with, one or more categories, and one or more references documenting sufficient justification for inclusion in the repertoire; see "References" below.
This LGR defines no variants for letters. See Section 6, "Variants" in [Proposal-Lao].
Digit Variants: All Lao digits are treated as semantic variants of the corresponding common (ASCII) digits. By transitivity, they are also semantic variants of any native digits in scripts that also include the common digits. Such transitive relations are deemed to exist implicitly but are not listed explicitly in each reference LGR. (Omitting the listing of these other cross script digit variants does not affect index variant calculation, as the ASCII digit variant being smallest would always be the index variant.) There is a strong resemblance between Thai and Khmer digits, and certain Lao digits. In addition, Lao digit ZERO is a cross-script homoglyph or near homoglyph of digit ZERO in many other scripts; all of these are already implicit semantic variants by transitivity and therefore not listed here. To keep digit variant sets manageable in zones where multiple scripts are present, no attempt has been made at identifying cross-script variants among digits of different numeric value or between a digit in one script and a letter in another, such as between digit zero and Latin letter 'o'. Other mechanisms may be required to prevent homograph labels.
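To illustrate the digit variant sets and the remark on index variants, here is a minimal sketch. It assumes, as stated above, that the index variant of a set is the member with the smallest code point; the names `DIGIT_VARIANT_SETS`, `INDEX_VARIANT`, and `index_label` are hypothetical and do not appear in the LGR.

```python
# Illustrative sketch only: names are hypothetical.
# Each variant set pairs an ASCII digit with the Lao digit of the same value.
DIGIT_VARIANT_SETS = [{0x0030 + i, 0x0ED0 + i} for i in range(10)]

# Assumption: the index variant is the smallest code point in the set,
# which here is always the ASCII digit.
INDEX_VARIANT = {cp: min(s) for s in DIGIT_VARIANT_SETS for cp in s}

def index_label(label: str) -> str:
    """Replace every code point by its index variant (identity if it has none)."""
    return "".join(chr(INDEX_VARIANT.get(ord(ch), ord(ch))) for ch in label)

# Two labels that differ only in digit style share one index label.
a = "\u0E81" + "12"              # LAO LETTER KO + ASCII digits 1, 2
b = "\u0E81" + "\u0ED1\u0ED2"    # LAO LETTER KO + Lao digits one, two
print(index_label(a) == index_label(b))
```

Because the mappings are of type blocked, two labels with the same index label would not both be allocatable in the same zone.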
Consonants: In regular syllables, consonants occur in limited combinations. However, arbitrary combinations are used for acronyms. The LGR therefore considers the restriction on syllabic combinations a matter of spelling and does not enforce them. Consonants may be followed by a semi-consonant mark. Some consonants have been given the tag "Cf", which indicates final consonants. See Section 5, "Consonants" in [Proposal-Lao].
Vowels: Vowels are divided into vowel-above, vowel-before, vowel-below and vowel-after so as to enforce some of the syllable structure using context rules. However, many details have been considered spelling issues and, for simplification, are not modeled in this LGR. See Section 5 in [Proposal-Lao].
Semi-consonant: The character U+0EBC (ຼ) LAO SEMIVOWEL SIGN LO follows consonants (see Section 5 in [Proposal-Lao]).
Tone-mark: Any of four tone marks can follow a consonant or vowel-above or vowel-below (see Section 5 in [Proposal-Lao]).
Signs: The character U+0ECC (໌) LAO CANCELLATION MARK follows a final consonant (Cf). The character U+0EC6 (ໆ) LAO KO LA is a repetition mark that can only occur up to 3 times at the end of the label (see Section 5 in [Proposal-Lao]).
Lao Digits: U+0ED0 (໐) to U+0ED9 (໙) are a set of Lao-specific digits. They are used in alternation with the European (common) digits.
Common Digits: U+0030 (0) to U+0039 (9) are the set of digits from the ASCII range.
Actions include the default actions for LGRs as well as those needed to invalidate labels with misplaced combining marks. They are marked with ⍟. For a description see [RFC 7940].
Rules provided in the LGR as described in Section 7 of [Proposal-Lao] reasonably restrict labels so that they conform to Lao syllable structure. These constraints are presented exclusively as context rules.
The rules are:
No context rules apply to “consonant” code points. For discussion, see Section 5.1, “Consonants” in [Proposal-Lao].
This reference LGR for Lao for the 2nd Level has been developed by Michel Suignard and Asmus Freytag, based on the Root Zone LGR for Lao and information contained or referenced therein; see [RZ-LGR-3-Laoo]. Suitable extensions for the second level have been applied according to the [Guidelines]. The original proposal for a Root Zone LGR for the Lao script, on which this reference LGR is based, was developed by the Lao Generation Panel. For more information on methodology and contributors to the underlying Root Zone LGR, see Sections 4 and 8 in [Proposal-Lao], as well as [RZ-LGR-Overview].
The following general references are cited in this document:
For references consulted particularly in designing the repertoire for the Lao script for the second level, please see details in the Table of References below. Reference [0] refers to the Unicode Standard version in which the corresponding code points were initially encoded. References [201], [202], [203], [204], [205], [206], and [207] correspond to sources justifying the inclusion of or classification for the corresponding code points. Entries in the table may have multiple source reference values.
Number of elements in Repertoire | 73 | ||||
---|---|---|---|---|---|
Number of code points for each script |
|
||||
Number of code points | 72 | ||||
Number of sequences | 1 | ||||
Longest code point sequence | 2 |
The following table lists the repertoire by code point (or code point sequence). The data in the Script and Name column are extracted from the Unicode character database. Where a comment in the original LGR is equal to the character name, it has been suppressed.
For any code point or sequence for which a variant is defined, additional information is provided in the Variants column. See also the legend provided below the table.
Code Point | Glyph | Script | Name | Ref | Tags | Required Context | Variants | Comment |
---|---|---|---|---|---|---|---|---|
U+002D | - | Common | HYPHEN-MINUS | [0] | Hyphen | not: hyphen-minus-disallowed | ⍟ | |
U+0030 | 0 | Common | DIGIT ZERO | [0] | Common-digit | set 1 | ⍟ | |
U+0031 | 1 | Common | DIGIT ONE | [0] | Common-digit | set 2 | ⍟ | |
U+0032 | 2 | Common | DIGIT TWO | [0] | Common-digit | set 3 | ⍟ | |
U+0033 | 3 | Common | DIGIT THREE | [0] | Common-digit | set 4 | ⍟ | |
U+0034 | 4 | Common | DIGIT FOUR | [0] | Common-digit | set 5 | ⍟ | |
U+0035 | 5 | Common | DIGIT FIVE | [0] | Common-digit | set 6 | ⍟ | |
U+0036 | 6 | Common | DIGIT SIX | [0] | Common-digit | set 7 | ⍟ | |
U+0037 | 7 | Common | DIGIT SEVEN | [0] | Common-digit | set 8 | ⍟ | |
U+0038 | 8 | Common | DIGIT EIGHT | [0] | Common-digit | set 9 | ⍟ | |
U+0039 | 9 | Common | DIGIT NINE | [0] | Common-digit | set 10 | ⍟ | |
U+0E81 | ກ | Lao | LAO LETTER KO | [0], [201], [204] | Cf, consonant | Lao | ||
U+0E82 | ຂ | Lao | LAO LETTER KHO SUNG | [0], [201], [204] | consonant | Lao | ||
U+0E84 | ຄ | Lao | LAO LETTER KHO TAM | [0], [201], [204] | consonant | Lao | ||
U+0E87 | ງ | Lao | LAO LETTER NGO | [0], [201], [204] | Cf, consonant | Lao | ||
U+0E88 | ຈ | Lao | LAO LETTER CO | [0], [201], [204] | consonant | Lao | ||
U+0E8A | ຊ | Lao | LAO LETTER SO TAM | [0], [201], [204] | Cf, consonant | Lao | ||
U+0E8D | ຍ | Lao | LAO LETTER NYO | [0], [201], [204] | Cf, consonant | Lao | ||
U+0E94 | ດ | Lao | LAO LETTER DO | [0], [201], [204] | Cf, consonant | Lao | ||
U+0E95 | ຕ | Lao | LAO LETTER TO | [0], [201], [204] | consonant | Lao | ||
U+0E96 | ຖ | Lao | LAO LETTER THO SUNG | [0], [201], [204] | consonant | Lao | ||
U+0E97 | ທ | Lao | LAO LETTER THO TAM | [0], [201], [204] | Cf, consonant | Lao | ||
U+0E99 | ນ | Lao | LAO LETTER NO | [0], [201], [204] | Cf, consonant | Lao | ||
U+0E9A | ບ | Lao | LAO LETTER BO | [0], [201], [204] | Cf, consonant | Lao | ||
U+0E9B | ປ | Lao | LAO LETTER PO | [0], [201], [204] | consonant | Lao | ||
U+0E9C | ຜ | Lao | LAO LETTER PHO SUNG | [0], [201], [204] | consonant | Lao | ||
U+0E9D | ຝ | Lao | LAO LETTER FO FON | [0], [201], [204] | consonant | = lao letter fo sung; Lao | ||
U+0E9E | ພ | Lao | LAO LETTER PHO TAM | [0], [201], [204] | consonant | Lao | ||
U+0E9F | ຟ | Lao | LAO LETTER FO FAY | [0], [201], [204] | Cf, consonant | = lao letter fo tam; Lao | ||
U+0EA1 | ມ | Lao | LAO LETTER MO | [0], [201], [204] | Cf, consonant | Lao | ||
U+0EA2 | ຢ | Lao | LAO LETTER YO | [0], [201], [204] | consonant | Lao | ||
U+0EA3 | ຣ | Lao | LAO LETTER RO | [0], [204] | Cf, consonant | = lao letter lo rada; Lao | ||
U+0EA5 | ລ | Lao | LAO LETTER LO | [0], [201], [204] | Cf, consonant | = lao letter lo ling; Lao | ||
U+0EA7 | ວ | Lao | LAO LETTER WO | [0], [201], [204], [205] | Cf, consonant | Lao | ||
U+0EAA | ສ | Lao | LAO LETTER SO SUNG | [0], [201], [204] | Cf, consonant | Lao | ||
U+0EAB | ຫ | Lao | LAO LETTER HO SUNG | [0], [201], [204] | consonant | Lao | ||
U+0EAD | ອ | Lao | LAO LETTER O | [0], [201], [204], [205] | consonant | Lao | ||
U+0EAE | ຮ | Lao | LAO LETTER HO TAM | [0], [201], [204] | consonant | Lao | ||
U+0EB0 | ະ | Lao | LAO VOWEL SIGN A | [0], [201], [205], [206] | vowel-after | follows-C-tonemark-vabove | Lao | |
U+0EB1 | ັ | Lao | LAO VOWEL SIGN MAI KAN | [0], [201], [205], [206] | vowel-above | follows-main-consonant | Lao | |
U+0EB2 | າ | Lao | LAO VOWEL SIGN AA | [0], [201], [205], [206] | vowel-after | follows-C-tonemark-vabove | Lao | |
U+0EB2 U+0EB0 | າະ | {Lao} | LAO VOWEL SIGN AA + LAO VOWEL SIGN A | [205] | [vowel-after] + [vowel-after] | follows-vbefore-consonant-cluster | Lao | |
U+0EB4 | ິ | Lao | LAO VOWEL SIGN I | [0], [201], [205], [206] | vowel-above | follows-main-consonant | Lao | |
U+0EB5 | ີ | Lao | LAO VOWEL SIGN II | [0], [201], [205], [206] | vowel-above | follows-main-consonant | Lao | |
U+0EB6 | ຶ | Lao | LAO VOWEL SIGN Y | [0], [201], [205], [206] | vowel-above | follows-main-consonant | Lao | |
U+0EB7 | ື | Lao | LAO VOWEL SIGN YY | [0], [201], [205], [206] | vowel-above | follows-main-consonant | Lao | |
U+0EB8 | ຸ | Lao | LAO VOWEL SIGN U | [0], [201], [205], [206] | vowel-below | follows-main-consonant | Lao | |
U+0EB9 | ູ | Lao | LAO VOWEL SIGN UU | [0], [201], [205], [206] | vowel-below | follows-main-consonant | Lao | |
U+0EBB | ົ | Lao | LAO VOWEL SIGN MAI KON | [0], [205] | vowel-above | follows-main-consonant | Lao | |
U+0EBC | ຼ | Lao | LAO SEMIVOWEL SIGN LO | [0], [201], [205], [206] | semi-consonant | follows-consonant | = lao semiconsonant lo; Lao | |
U+0EBD | ຽ | Lao | LAO SEMIVOWEL SIGN NYO | [0], [201], [205] | vowel-after | follows-C-tonemark-vabove | = lao semivowel ia; Lao | |
U+0EC0 | ເ | Lao | LAO VOWEL SIGN E | [0], [201], [205], [206] | vowel-before | precedes-consonant | Lao | |
U+0EC1 | ແ | Lao | LAO VOWEL SIGN EI | [0], [201], [205], [206] | vowel-before | precedes-consonant | Lao | |
U+0EC2 | ໂ | Lao | LAO VOWEL SIGN O | [0], [201], [205], [206] | vowel-before | precedes-consonant | Lao | |
U+0EC3 | ໃ | Lao | LAO VOWEL SIGN AY | [0], [201], [205], [206] | vowel-before | precedes-consonant | Lao | |
U+0EC4 | ໄ | Lao | LAO VOWEL SIGN AI | [0], [201], [205], [206] | vowel-before | precedes-consonant | Lao | |
U+0EC6 | ໆ | Lao | LAO KO LA | [0], [203] | sign | repetition-mark-limit | = lao may sam; Lao | |
U+0EC8 | ່ | Lao | LAO TONE MAI EK | [0], [202] | tone-mark | follows-C-vabove-vbelow | Lao | |
U+0EC9 | ້ | Lao | LAO TONE MAI THO | [0], [202] | tone-mark | follows-C-vabove-vbelow | Lao | |
U+0ECA | ໊ | Lao | LAO TONE MAI TI | [0], [202] | tone-mark | follows-C-vabove-vbelow | Lao | |
U+0ECB | ໋ | Lao | LAO TONE MAI CATAWA | [0], [202] | tone-mark | follows-C-vabove-vbelow | = lao tone mai jattawa; Lao | |
U+0ECC | ໌ | Lao | LAO CANCELLATION MARK | [0], [207] | sign | follows-Cf | = lao mark mai ka lan; Lao | |
U+0ECD | ໍ | Lao | LAO NIGGAHITA | [0], [201], [205], [206] | vowel-above | follows-main-consonant | = lao vowel sign or; Lao | |
U+0ED0 | ໐ | Lao | LAO DIGIT ZERO | [0] | Lao-digit | set 1 | THAI DIGIT ZERO | |
U+0ED1 | ໑ | Lao | LAO DIGIT ONE | [0] | Lao-digit | set 2 | THAI DIGIT ONE | |
U+0ED2 | ໒ | Lao | LAO DIGIT TWO | [0] | Lao-digit | set 3 | THAI DIGIT TWO | |
U+0ED3 | ໓ | Lao | LAO DIGIT THREE | [0] | Lao-digit | set 4 | THAI DIGIT THREE | |
U+0ED4 | ໔ | Lao | LAO DIGIT FOUR | [0] | Lao-digit | set 5 | THAI DIGIT FOUR | |
U+0ED5 | ໕ | Lao | LAO DIGIT FIVE | [0] | Lao-digit | set 6 | THAI DIGIT FIVE | |
U+0ED6 | ໖ | Lao | LAO DIGIT SIX | [0] | Lao-digit | set 7 | THAI DIGIT SIX | |
U+0ED7 | ໗ | Lao | LAO DIGIT SEVEN | [0] | Lao-digit | set 8 | THAI DIGIT SEVEN | |
U+0ED8 | ໘ | Lao | LAO DIGIT EIGHT | [0] | Lao-digit | set 9 | THAI DIGIT EIGHT | |
U+0ED9 | ໙ | Lao | LAO DIGIT NINE | [0] | Lao-digit | set 10 | THAI DIGIT NINE |
Throughout this LGR, a code point sequence may be annotated with a string in ALL CAPS that is constructed on the same principle as a name for a Unicode Named Sequence. No claim is made that a sequence thus annotated is in fact a named sequence, nor that the annotation in such case actually corresponds to the formal name of a named sequence.
Number of variant sets | 10 | ||
---|---|---|---|
Largest variant set | 2 | ||
Variants by Type |
|
The following tables list all variant sets defined in this LGR, except for singleton sets. Each table lists all variant mapping pairs of the set; one per row. Mappings are assumed to be symmetric: each row documents both forward (→) and reverse (←) mapping directions. In each table, the mappings are sorted by Source value in ascending code point order; shading is used to group mappings from the same source code point or sequence.
Where the type of both forward and reverse mappings are the same, a single value is given in the Type column; otherwise the types for forward and reverse mappings, as well as comments and references, are listed above one another. For summary counts, both forward and reverse mappings are always counted separately.
In any LGR with variant specifications that are well behaved, all members within each variant set are defined as variants of each other; the mappings in each set are symmetric and transitive; and all variant sets are disjoint.
Source | Glyph | Target | Glyph | Type | Ref | Comment | |
---|---|---|---|---|---|---|---|
0030 | 0 | 0ED0 | ໐ | ↔ | blocked | ASCII digit variant / Lao digit variant |
Source | Glyph | Target | Glyph | Type | Ref | Comment | |
---|---|---|---|---|---|---|---|
0031 | 1 | 0ED1 | ໑ | ↔ | blocked | ASCII digit variant / Lao digit variant |
Source | Glyph | Target | Glyph | Type | Ref | Comment | |
---|---|---|---|---|---|---|---|
0032 | 2 | 0ED2 | ໒ | ↔ | blocked | ASCII digit variant / Lao digit variant |
Source | Glyph | Target | Glyph | Type | Ref | Comment | |
---|---|---|---|---|---|---|---|
0033 | 3 | 0ED3 | ໓ | ↔ | blocked | ASCII digit variant / Lao digit variant |
Source | Glyph | Target | Glyph | Type | Ref | Comment | |
---|---|---|---|---|---|---|---|
0034 | 4 | 0ED4 | ໔ | ↔ | blocked | ASCII digit variant / Lao digit variant |
Source | Glyph | Target | Glyph | Type | Ref | Comment | |
---|---|---|---|---|---|---|---|
0035 | 5 | 0ED5 | ໕ | ↔ | blocked | ASCII digit variant / Lao digit variant |
Source | Glyph | Target | Glyph | Type | Ref | Comment | |
---|---|---|---|---|---|---|---|
0036 | 6 | 0ED6 | ໖ | ↔ | blocked | ASCII digit variant / Lao digit variant |
Source | Glyph | Target | Glyph | Type | Ref | Comment | |
---|---|---|---|---|---|---|---|
0037 | 7 | 0ED7 | ໗ | ↔ | blocked | ASCII digit variant / Lao digit variant |
Source | Glyph | Target | Glyph | Type | Ref | Comment | |
---|---|---|---|---|---|---|---|
0038 | 8 | 0ED8 | ໘ | ↔ | blocked | ASCII digit variant / Lao digit variant |
Source | Glyph | Target | Glyph | Type | Ref | Comment | |
---|---|---|---|---|---|---|---|
0039 | 9 | 0ED9 | ໙ | ↔ | blocked | ASCII digit variant / Lao digit variant |
The following table lists all named and implicit classes with their definition and a list of their members intersected with the current repertoire (for larger classes, this list is elided).
Name | Definition | Count | Members or Ranges | Ref | Comment |
---|---|---|---|---|---|
Cf | Tag=Cf | 14 | {0E81 0E87 0E8A 0E8D 0E94 0E97 0E99-0E9A 0E9F 0EA1 0EA3 0EA5 0EA7 0EAA} | Any Lao final consonant | |
consonant | Tag=consonant | 27 | {0E81-0E82 0E84 0E87-0E88 0E8A 0E8D 0E94-0E97 0E99-0E9F 0EA1-0EA3 0EA5 0EA7 0EAA-0EAB 0EAD-0EAE} | Any Lao consonant | |
semi-consonant | Tag=semi-consonant | 1 | {0EBC} | Lao semi-consonant LO | |
tone-mark | Tag=tone-mark | 4 | {0EC8-0ECB} | Any Lao tone mark | |
vowel-above | Tag=vowel-above | 7 | {0EB1 0EB4-0EB7 0EBB 0ECD} | Any Lao vowel above | |
vowel-below | Tag=vowel-below | 2 | {0EB8-0EB9} | Any Lao vowel below | |
common-digits | Tag=Common-digit | 10 | {0030-0039} | Digits from the ASCII range; ⍟ | |
lao-digits | Tag=Lao-digit | 10 | {0ED0-0ED9} | Lao digits | |
hyphen | Tag=Hyphen | 1 | {002D} | The Hyphen-minus character ⍟ | |
implicit | Tag=sign | 2 | {0EC6 0ECC} | Any character tagged as sign | |
implicit | Tag=vowel-after | 3 | {0EB0 0EB2 0EBD} | Any character tagged as vowel-after | |
implicit | Tag=vowel-before | 5 | {0EC0-0EC4} | Any character tagged as vowel-before | |
implicit | Tag=sc:Laoo | 61 | {0E81-0E82 0E84 0E87-0E88 0E8A 0E8D 0E94-0E97 0E99-0E9F 0EA1-0EA3 0EA5 0EA7 0EAA-0EAB 0EAD-0EAE 0EB0-0EB2 0EB4-0EB9 0EBB-0EBD 0EC0-0EC4 0EC6 0EC8-0ECD 0ED0-...} | Any character tagged as Lao | |
implicit | Tag=sc:Zyyy | 11 | {002D 0030-0039} | Any character tagged as Common |
The following table lists all named rules defined in the LGR and indicates whether they are used as trigger in an action or as context (when or not-when) for a code point or variant.
Name | Regular Expression | Used as Trigger | Anchor | Used as Context | Ref | Comment |
---|---|---|---|---|---|---|
leading-combining-mark | (start)[[\p{gc=Mn}] ∪ [∅=\p{gc=Mc}]] | ✔ | | | [150] | RFC 5891 restrictions on placement of combining marks ⍟ |
hyphen-minus-disallowed | (((start))← ⚓)|(⚓ →((end)))|(((start)..[:hyphen:])← ⚓) | | ✔ | C | [150] | RFC 5891 restrictions on placement of U+002D (-) ⍟ |
follows-consonant | ([:consonant:])← ⚓ | | ✔ | C | | WLE Rule 1: A semi-consonant must follow a consonant |
precedes-consonant | ⚓ →([:consonant:]) | | ✔ | C | | WLE Rule 2: A vowel-before precedes a main consonant cluster |
follows-main-consonant | ([:consonant:]|[:semi-consonant:])← ⚓ | | ✔ | C | | WLE Rule 3: A vowel-above, and vowel-below follow a main consonant C |
follows-C-tonemark-vabove | ([:consonant:]|[:semi-consonant:]|[:tone-mark:]|[:vowel-above:])← ⚓ | | ✔ | C | | WLE Rule 4: A vowel-after follows a main consonant, tone-mark or vowel-above |
consonant-cluster | [:consonant:]{1,2}[:semi-consonant:]? | | | | | Defining consonant cluster for WLE Rule 5 |
follows-vbefore-consonant-cluster | (\u0EC0(:consonant-cluster:))← ⚓ | | ✔ | C | | WLE Rule 5: The sequence U+0EB2 U+0EB0 (າະ) follows a vowel before, and a consonant cluster |
follows-C-vabove-vbelow | ([:consonant:]|[:semi-consonant:]|[:vowel-above:]|[:vowel-below:])← ⚓ | | ✔ | C | | WLE Rule 6: A tone-mark follows a main consonant, vowel-above or vowel-below |
follows-Cf | ([:Cf:])← ⚓ | | ✔ | C | | WLE Rule 7: The sign U+0ECC (໌) can only occur after final consonants |
repetition-mark-limit | ⚓ →(\u0EC6{0,2}(end)) | | ✔ | C | | WLE Rule 8: The sign U+0EC6 (ໆ) can only occur 0 to 3 times at the end of the label |
ascii-only-label | (start)[\u002D\u0030-\u0039]+(end) | ✔ | | | [150] | RFC 5891 restriction requiring at least one non-ASCII code point ⍟ |
digit-mixing | ([:common-digits:].*[:lao-digits:])|([:lao-digits:].*[:common-digits:]) | ✔ | | | | restrictions on mixing digits |
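The eight WLE context rules in the table above can be approximated in ordinary code. The following is a minimal sketch, not an RFC 7940 processor: it checks only the WLE rules (not repertoire membership, the hyphen rule, or the actions), and the function and variable names are hypothetical. It always reads U+0EB2 U+0EB0 as the defined sequence, which appears safe here because U+0EB0 directly after U+0EB2 would fail WLE Rule 4 if read as a separate code point.

```python
import re

def expand(spec):
    """Expand '0E81-0E82 0E84' into a set of one-character strings."""
    out = set()
    for part in spec.split():
        lo, _, hi = part.partition("-")
        out.update(chr(c) for c in range(int(lo, 16), int(hi or lo, 16) + 1))
    return out

# Class members copied from the class table in this document.
CONSONANT = expand("0E81-0E82 0E84 0E87-0E88 0E8A 0E8D 0E94-0E97 0E99-0E9F "
                   "0EA1-0EA3 0EA5 0EA7 0EAA-0EAB 0EAD-0EAE")
CF = expand("0E81 0E87 0E8A 0E8D 0E94 0E97 0E99-0E9A 0E9F 0EA1 0EA3 0EA5 0EA7 0EAA")
SEMI = expand("0EBC")
TONE = expand("0EC8-0ECB")
V_ABOVE = expand("0EB1 0EB4-0EB7 0EBB 0ECD")
V_BELOW = expand("0EB8-0EB9")
V_AFTER = expand("0EB0 0EB2 0EBD")
V_BEFORE = expand("0EC0-0EC4")

# WLE Rule 5 context: U+0EC0, then 1-2 consonants, then an optional U+0EBC.
CLUSTER_BEFORE_SEQ = re.compile(
    "\u0EC0[%s]{1,2}\u0EBC?$" % "".join(sorted(CONSONANT)))

def required_prev(ch):
    """Return the set of characters allowed immediately before ch, or None."""
    if ch in SEMI:
        return CONSONANT                                # WLE Rule 1
    if ch in V_ABOVE or ch in V_BELOW:
        return CONSONANT | SEMI                         # WLE Rule 3
    if ch in V_AFTER:
        return CONSONANT | SEMI | TONE | V_ABOVE        # WLE Rule 4
    if ch in TONE:
        return CONSONANT | SEMI | V_ABOVE | V_BELOW     # WLE Rule 6
    if ch == "\u0ECC":
        return CF                                       # WLE Rule 7
    return None

def satisfies_wle_rules(label):
    i = 0
    while i < len(label):
        ch = label[i]
        prev = label[i - 1] if i > 0 else ""
        nxt = label[i + 1] if i + 1 < len(label) else ""
        if ch == "\u0EB2" and nxt == "\u0EB0":          # the defined sequence
            if not CLUSTER_BEFORE_SEQ.search(label[:i]):  # WLE Rule 5
                return False
            i += 2
            continue
        allowed = required_prev(ch)
        if allowed is not None and prev not in allowed:
            return False
        if ch in V_BEFORE and nxt not in CONSONANT:     # WLE Rule 2
            return False
        if ch == "\u0EC6" and not re.fullmatch("\u0EC6{0,2}", label[i + 1:]):
            return False                                # WLE Rule 8
        i += 1
    return True

print(satisfies_wle_rules("\u0EC0\u0E81\u0EB2\u0EB0"))  # vowel-before, KO, AA + A
print(satisfies_wle_rules("\u0EB2\u0E81"))              # vowel-after at label start
```

In this sketch the first call satisfies all rules, while the second fails WLE Rule 4 because the vowel-after has no preceding consonant.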
The following table lists the actions that are used to assign dispositions to labels and variant labels based on the specified conditions. The order of actions defines their precedence: the first action triggered by a label is the one defining its disposition.
# | Condition | Rule / Variant Set | Disposition | Ref | Comment | |
---|---|---|---|---|---|---|
1 | if label matches | leading-combining-mark | → | invalid | [150] | labels with leading combining marks are invalid ⍟ |
2 | if label matches | ascii-only-label | → | invalid | [150] | ascii-only labels invalid (not IDNs) ⍟ |
3 | if at least one variant is in | {out-of-repertoire-var} | → | invalid | any variant label with a code point out of repertoire is invalid ⍟ | |
4 | if label matches | digit-mixing | → | invalid | a label violating the restriction on digit mixing is invalid | |
5 | if at least one variant is in | {blocked} | → | blocked | any variant label containing blocked variants is blocked ⍟ | |
6 | if each variant is in | {allocatable} | → | allocatable | variant labels with all variants allocatable are allocatable ⍟ | |
7 | if any label (catch-all) | → | valid | catch all (default action) ⍟ |
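The precedence described above (the first matching action decides the disposition) can be sketched as a chain of early returns. This is a simplified illustration, not an RFC 7940 implementation: it assumes the label has already passed the repertoire and context checks, the function name `disposition` and the parameter `variant_types` are hypothetical, and it assumes that an original label with no variant mappings falls through to the catch-all action.

```python
import re
import unicodedata

ASCII_ONLY = re.compile(r"[-0-9]+")            # hyphen-minus and ASCII digits
COMMON_DIGIT = "[0-9]"
LAO_DIGIT = "[\u0ED0-\u0ED9]"
DIGIT_MIXING = re.compile(
    f"{COMMON_DIGIT}.*{LAO_DIGIT}|{LAO_DIGIT}.*{COMMON_DIGIT}")

def disposition(label, variant_types=()):
    """Return the disposition of a label.

    variant_types lists the types of the variant mappings used to derive
    the label; it is empty for an original label (an assumption of this sketch).
    """
    # Action 1: leading combining mark (general category Mn or Mc).
    if label and unicodedata.category(label[0]) in ("Mn", "Mc"):
        return "invalid"
    # Action 2: labels made only of ASCII digits and hyphen are not IDNs.
    if ASCII_ONLY.fullmatch(label):
        return "invalid"
    # Action 3: a variant with an out-of-repertoire code point.
    if "out-of-repertoire-var" in variant_types:
        return "invalid"
    # Action 4: ASCII and Lao digits mixed in one label.
    if DIGIT_MIXING.search(label):
        return "invalid"
    # Action 5: at least one blocked variant mapping.
    if "blocked" in variant_types:
        return "blocked"
    # Action 6: every variant mapping is allocatable.
    if variant_types and all(t == "allocatable" for t in variant_types):
        return "allocatable"
    # Action 7: catch-all.
    return "valid"

ko = "\u0E81"  # LAO LETTER KO
print(disposition(ko + "12"))                         # original label
print(disposition(ko + "1" + "\u0ED2"))               # mixed digit styles
print(disposition(ko + "\u0ED1\u0ED2", ("blocked", "blocked")))
```

Since all variant mappings in this LGR are of type blocked, action 6 is not expected to trigger for labels generated from it; it is included only because it is part of the default actions.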
The following lists the references cited for specific code points, variants, classes, rules or actions in this LGR. For General references refer to the "References" section in the Description.
[0] | The Unicode Standard 1.1. Any code point originally encoded in Unicode 1.1 |
[201] | Lao grammar book published by the Ministry of Education in 1967, see Appendix B, Figure 1 in [Proposal-Lao] |
[202] | Lao grammar book published by the Ministry of Education in 1967, see Appendix B, Figure 2 in [Proposal-Lao] |
[203] | Lao grammar book published by the Ministry of Education in 1967, see Appendix B, Figure 3 in [Proposal-Lao] |
[204] | Lao grammar book published by the Ministry of Education in 2000, see Appendix B, Figure 4 in [Proposal-Lao] |
[205] | Lao grammar book published by the Ministry of Education in 2000, see Appendix B, Figure 5 in [Proposal-Lao] |
[206] | Lao grammar book published by the Ministry of Education in 2000, see Appendix B, Figure 6 in [Proposal-Lao] |
[207] | Lao grammar 1935, see Appendix B, Figure 7 in [Proposal-Lao] |
[150] | RFC 5891, Internationalized Domain Names in Applications (IDNA): Protocol http://tools.ietf.org/html/rfc5891 |