Fredrik Lundh wrote:
mal wrote:
Py_UNICODE_ISLOWER || Py_UNICODE_ISUPPER || Py_UNICODE_ISTITLE || Py_UNICODE_ISDIGIT
This will give you all cased chars along with all digits; it omits the non-cased ones.
but of course...
It's a good start, but probably won't cover the full range of letters + numbers.
Perhaps we need another table for isalpha in unicodectype.c ? (Or at least one which defines all non-cased letters.)
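To illustrate the gap being discussed, here is a rough sketch in Python rather than C. It assumes that the str methods islower/isupper/istitle/isdigit behave like the corresponding Py_UNICODE_IS* macros, and it uses unicodedata categories as a stand-in for the proposed table; the names approx_isalnum and table_isalnum are made up for this example and are not part of any API mentioned in this thread.

```python
import unicodedata


def approx_isalnum(ch: str) -> bool:
    """Approximation from the thread: cased characters plus digits."""
    return ch.islower() or ch.isupper() or ch.istitle() or ch.isdigit()


def table_isalnum(ch: str) -> bool:
    """Hypothetical table-based check: any letter (L*) or number (N*) category."""
    return unicodedata.category(ch)[0] in ("L", "N")


# Lowercase, uppercase, titlecase digraph, digit, Hebrew alef, CJK ideograph, underscore.
samples = ["a", "Z", "\u01c5", "7", "\u05d0", "\u4e2d", "_"]

# Characters that are letters or numbers but slip through the approximation.
missed = [ch for ch in samples if table_isalnum(ch) and not approx_isalnum(ch)]

for ch in missed:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} is missed by the approximation")
```

Under these assumptions the Hebrew and CJK samples are the ones that get missed, since both are letters without case.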
+1 from me (SRE needs this, and it doesn't really make much sense to add unicode tables to SRE just because the built-in ones are slightly incomplete...)
how about this plan:
-- you add a Py_UNICODE_ALPHA to unicodeobject.h asap, which does exactly that (or I can do that, if you prefer). (and maybe even a Py_UNICODE_ALNUM)
Ok, I'll add Py_UNICODE_ISALPHA and Py_UNICODE_ISALNUM (first with approximations of the sort you give above and later with true implementations using tables in unicodectype.c) on Monday... gotta run now.
-- I change SRE to use that asap.
-- you, I, or someone else add a better implementation, some other day.
</F>
Nice weekend :)
--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/