Matching Chinese with regular expressions in Emacs

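This is a very handy feature, and one that many other regular expression flavors either lack or cannot express elegantly (some places use [\u4e00-\u9fa5] to match Chinese; do you think you could remember that?).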

Yet I can't remember even this convenient feature, and I don't understand the manual very well either.

The correct form:

In Emacs, the regexp that matches a Chinese character is \cc
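A minimal sketch of using \cc from Emacs Lisp. Inside a Lisp string the backslash has to be doubled, so the regexp is written "\\cc"; when typed interactively (for example in C-M-s) it stays \cc. The helper name my-chinese-runs is made up for this example and is not part of Emacs.

```elisp
;; In a Lisp string the backslash must be doubled: "\\cc".
(string-match-p "\\cc" "hello 中文")   ; => 6, index of the first match
(string-match-p "\\cc" "hello")        ; => nil, no Chinese characters

;; Hypothetical helper: collect every run of Chinese characters.
(defun my-chinese-runs (string)
  "Return a list of the maximal runs of category-c characters in STRING."
  (let ((start 0)
        (runs '()))
    (while (string-match "\\cc+" string start)
      (push (match-string 0 string) runs)
      (setq start (match-end 0)))
    (nreverse runs)))

(my-chinese-runs "Emacs 里用 regexp 匹配中文")
;; => ("里用" "匹配中文")
```

Note that category c is defined by Emacs's own category table, so it may cover more than Han ideographs (punctuation from Chinese character sets, for instance); it is worth checking against your own text.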

Here is what the manual says:

‘\cC’
matches any character that belongs to the category C. For example,
‘\cc’ matches Chinese characters, ‘\cg’ matches Greek characters,
etc. For the description of the known categories, type ‘M-x
describe-categories <RET>’.

‘\CC’
matches any character that does not belong to category C.

15.7 Backslash in Regular Expressions
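In the manual's ‘\CC’ the second C is a placeholder for a category letter, so the complement of \cc is \Cc. A small sketch, assuming the goal is to keep only the Chinese characters of a string:

```elisp
;; "\\Cc+" matches one or more characters that are NOT in category c.
;; Replacing those runs with the empty string leaves only the Chinese part.
(replace-regexp-in-string "\\Cc+" "" "Emacs 里用 regexp 匹配中文")
;; => "里用匹配中文"
```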

I might as well paste that output here too, since I can't remember this command either.

Legend of category mnemonics (see the tail for the longer description)
 :space for indent  9:semivowel lower   R:Right-to-left …   k:Katakana
.:Base              <:Not at eol        Y:2-byte Cyrillic   l:Latin
0:consonant         >:Not at bol        ^:Combining         o:Lao
1:base vowel        A:2-byte alnum      a:ASCII             q:Tibetan
2:upper diacritic   C:2-byte han        b:Arabic            r:Roman
3:lower diacritic   G:2-byte Greek      c:Chinese           t:Thai
4:combining tone    H:2-byte Hiragana   e:Ethiopic          v:Viet
5:symbol            I:Indian Glyphs     g:Greek             w:Hebrew
6:digit             K:2-byte Katakana   h:Korean            y:Cyrillic
7:vowel diacritic   L:Left-to-right …   i:Indian            |:line breakable
8:vowel-signs       N:2-byte Korean     j:Japanese

M-x describe-categories

This is only the beginning of the output; the rest is too messy, and nobody would read it even if I included it.
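Instead of reading the whole describe-categories listing, you can ask Emacs which categories a single character belongs to. A rough sketch using char-category-set and category-set-mnemonics; the helper name my-char-in-category-p is invented for this example, and the lookup uses the category table of the current buffer.

```elisp
;; All category mnemonics of one character, returned as a string.
;; The exact letters depend on the Emacs version, so no output is shown.
(category-set-mnemonics (char-category-set ?中))

;; Hypothetical helper: test one character against one category letter.
(defun my-char-in-category-p (char category)
  "Return non-nil if CHAR belongs to CATEGORY (a mnemonic character)."
  ;; A category set is a bool-vector indexed by the mnemonic character.
  (aref (char-category-set char) category))

(my-char-in-category-p ?中 ?c)   ; => t
(my-char-in-category-p ?a ?c)    ; => nil
```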