RegEx for Unicode
Fabian Cenedese
Cenedese at indel.ch
Thu Feb 1 00:47:46 PST 2007
Hi
This is a regex question and may be considered off-topic, but
I'm using the built-in regex implementation with Unicode and was
wondering about an optimization.
I want to evaluate the Chinese dictionary file from http://www.mandarintools.com/cedict.html
The lines are of the form:
Traditional Simplified [pinyin] /English equivalent 1/equivalent 2.../
The first two fields are Chinese HanZi, so [a-zA-Z] won't work. I came
up with this, which works:
wxT("(.*) (.*) \\[(.*)\\] /(.*)/$")
or also
wxT("([^ ]*) ([^ ]*) \\[(.*)\\] /(.*)/$")
But are there better ways to match non-ASCII characters in Unicode
than .* and [^ ]*?
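A minimal sketch of the second pattern applied to a single line. It
uses std::wregex from the C++ standard library as a stand-in for
wxRegEx so that it builds without wxWidgets; the two engines may differ
in details. CedictEntry and ParseCedictLine are hypothetical names
introduced only for this illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical record for one dictionary line (not part of wxWidgets
// or of the dictionary's own tooling).
struct CedictEntry {
    std::wstring traditional;
    std::wstring simplified;
    std::wstring pinyin;
    std::wstring english;  // "equivalent 1/equivalent 2", inner slashes kept
};

// Hypothetical helper: splits one line into its four fields with the
// second pattern from the post. Returns false if the line does not
// have the expected shape.
bool ParseCedictLine(const std::wstring& line, CedictEntry& out) {
    // [^ ]* matches any run of non-space characters, HanZi included,
    // so no script-specific character class is needed.
    static const std::wregex pattern(
        L"([^ ]*) ([^ ]*) \\[(.*)\\] /(.*)/$");

    std::wsmatch m;
    if (!std::regex_match(line, m, pattern)) {
        return false;
    }
    out.traditional = m[1].str();
    out.simplified  = m[2].str();
    out.pinyin      = m[3].str();
    out.english     = m[4].str();
    return true;
}
```

Given an illustrative line such as
中國 中国 [Zhong1 guo2] /China/Middle Kingdom/
the four captures should come out as the traditional form, the
simplified form, the pinyin and the slash-separated English part. The
pinyin and English groups still rely on .* and backtracking, so a
definition that itself contains "] /" could shift the split.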
Thanks
bye Fabi