ident = (letter | "_") {letter | "_" | digit}.
letter = "A" .. "Z" | "a" .. "z" | "À" .. "Ö" | "Ø" .. "ö" | "ø" .. "ÿ".
digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9".
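For discussion, the current rule can be sketched as a regular expression. This is only an illustration in Python; the names `CURRENT_IDENT` and `is_current_ident` are made up here and are not part of any compiler.

```python
import re

# letter = "A".."Z" | "a".."z" | "À".."Ö" | "Ø".."ö" | "ø".."ÿ"
# (U+00D7 MULTIPLICATION SIGN and U+00F7 DIVISION SIGN fall in the gaps.)
LETTER = "A-Za-z\u00c0-\u00d6\u00d8-\u00f6\u00f8-\u00ff"

# ident = (letter | "_") {letter | "_" | digit}
CURRENT_IDENT = re.compile("[" + LETTER + "_][" + LETTER + "_0-9]*")


def is_current_ident(text):
    """Return True if text is an identifier under the grammar quoted above."""
    return CURRENT_IDENT.fullmatch(text) is not None
```

With this sketch, `is_current_ident("café")` is accepted, while a Greek or Cyrillic name is rejected because its letters lie outside the listed ranges.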
I suggest (open for discussion) that a letter be any Unicode code point that is neither punctuation nor a digit. As far as I can tell, punctuation and digits are currently specified as the following ranges:
Unicode Punctuation and Digits
(extracted from: http://www.fileformat.info/info/unicode/category/Nd/list.htm)
U+2000..U+206F General Punctuation
U+2E00..U+2E7F Supplemental Punctuation
U+0030..U+0039 DIGIT
U+0660..U+0669 ARABIC-INDIC DIGIT
U+06F0..U+06F9 EXTENDED ARABIC-INDIC DIGIT
U+07C0..U+07C9 NKO DIGIT
U+0966..U+096F DEVANAGARI DIGIT
U+09E6..U+09EF BENGALI DIGIT
U+0A66..U+0A6F GURMUKHI DIGIT
U+0AE6..U+0AEF GUJARATI DIGIT
U+0B66..U+0B6F ORIYA DIGIT
U+0BE6..U+0BEF TAMIL DIGIT
U+0C66..U+0C6F TELUGU DIGIT
U+0CE6..U+0CEF KANNADA DIGIT
U+0D66..U+0D6F MALAYALAM DIGIT
U+0DE6..U+0DEF SINHALA LITH DIGIT
U+0E50..U+0E59 THAI DIGIT
U+0ED0..U+0ED9 LAO DIGIT
U+0F20..U+0F29 TIBETAN DIGIT
U+1040..U+1049 MYANMAR DIGIT
U+1090..U+1099 MYANMAR SHAN DIGIT
U+17E0..U+17E9 KHMER DIGIT
U+1810..U+1819 MONGOLIAN DIGIT
U+1946..U+194F LIMBU DIGIT
U+19D0..U+19D9 NEW TAI LUE DIGIT
U+1A80..U+1A89 TAI THAM HORA DIGIT
U+1A90..U+1A99 TAI THAM THAM DIGIT
U+1B50..U+1B59 BALINESE DIGIT
U+1BB0..U+1BB9 SUNDANESE DIGIT
U+1C40..U+1C49 LEPCHA DIGIT
U+1C50..U+1C59 OL CHIKI DIGIT
U+A620..U+A629 VAI DIGIT
U+A8D0..U+A8D9 SAURASHTRA DIGIT
U+A900..U+A909 KAYAH LI DIGIT
U+A9D0..U+A9D9 JAVANESE DIGIT
U+A9F0..U+A9F9 MYANMAR TAI LAING DIGIT
U+AA50..U+AA59 CHAM DIGIT
U+ABF0..U+ABF9 MEETEI MAYEK DIGIT
U+FF10..U+FF19 FULLWIDTH DIGIT
U+104A0..U+104A9 OSMANYA DIGIT
U+11066..U+1106F BRAHMI DIGIT
U+110F0..U+110F9 SORA SOMPENG DIGIT
U+11136..U+1113F CHAKMA DIGIT
U+111D0..U+111D9 SHARADA DIGIT
U+112F0..U+112F9 KHUDAWADI DIGIT
U+114D0..U+114D9 TIRHUTA DIGIT
U+11650..U+11659 MODI DIGIT
U+116C0..U+116C9 TAKRI DIGIT
U+118E0..U+118E9 WARANG CITI DIGIT
U+16A60..U+16A69 MRO DIGIT
U+16B50..U+16B59 PAHAWH HMONG DIGIT
U+1D7CE..U+1D7D7 MATHEMATICAL BOLD DIGIT
U+1D7D8..U+1D7E1 MATHEMATICAL DOUBLE-STRUCK DIGIT
U+1D7E2..U+1D7EB MATHEMATICAL SANS-SERIF DIGIT
U+1D7EC..U+1D7F5 MATHEMATICAL SANS-SERIF BOLD DIGIT
U+1D7F6..U+1D7FF MATHEMATICAL MONOSPACE DIGIT
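To make the suggestion concrete, here is a minimal Python sketch of the proposed rule, not a definitive implementation. It assumes that "digit" means Unicode general category Nd (the category the table above was extracted from; Python's `unicodedata` may know more Nd characters than are listed) and that "punctuation" means exactly the two blocks listed. The names `PUNCTUATION_BLOCKS`, `is_letter` and `is_ident` are hypothetical.

```python
import unicodedata

# Punctuation blocks exactly as listed above (assumption: nothing else
# is treated as punctuation in this sketch).
PUNCTUATION_BLOCKS = ((0x2000, 0x206F), (0x2E00, 0x2E7F))


def is_digit(ch):
    """True for any decimal digit (general category Nd)."""
    return unicodedata.category(ch) == "Nd"


def is_punctuation(ch):
    """True if ch lies in one of the listed punctuation blocks."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUNCTUATION_BLOCKS)


def is_letter(ch):
    """Proposed rule: a letter is anything that is neither punctuation nor a digit."""
    return not is_digit(ch) and not is_punctuation(ch)


def is_ident(text):
    """ident = (letter | "_") {letter | "_" | digit}, using the proposed letter."""
    if not text:
        return False
    first, rest = text[0], text[1:]
    if not (is_letter(first) or first == "_"):
        return False
    return all(is_letter(c) or c == "_" or is_digit(c) for c in rest)
```

Under these assumptions a Cyrillic or German name is a valid identifier, and an identifier may contain (but not start with) a digit from any script. Read literally, the rule also lets through characters such as "+" or a space, because they lie outside the two listed blocks, so the punctuation set probably needs widening before this is usable in a scanner.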