[Python-Dev] unicode alphanumerics
Mon, 03 Jul 2000 17:06:05 GMT
>"M.-A. Lemburg" wrote:
>> Fredrik Lundh wrote:
>> > how about this plan:
>> > -- you add a Py_UNICODE_ALPHA to unicodeobject.h asap,
>> > which does exactly that (or I can do that, if you prefer).
>> > (and maybe even a Py_UNICODE_ALNUM)
>> Ok, I'll add Py_UNICODE_ISALPHA and Py_UNICODE_ISALNUM
>> (first with approximations of the sort you give above and
>> later with true implementations using tables in unicodectype.c)
>> on Monday... gotta run now.
>> > -- I change SRE to use that asap.
>> > -- you, I, or someone else add a better implementation,
>> > some other day.
>I've just looked into this... the problem here is what to
>consider as being "alpha" and what "numeric".
>I could add two new tables for the characters with category 'Lo'
>(other letters, not cased) and 'Lm' (letter modifiers)
>to match all letters in the Unicode database, but those
>tables have some 5200 entries (note that there are only 804 lower
>case letters and 686 upper case ones).
In JDK1.3, Character.isLetter(..) and Character.isDigit(..) are
I guess that Java uses the extra huge tables.
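
For anyone who wants to check the table sizes discussed above, here is a rough sketch using Python's standard unicodedata module. It is not the unicodectype.c implementation. The helper names (count_letter_categories, is_alpha, is_alnum) are made up for illustration, and treating only category 'Nd' as "numeric" is an assumption, since that is exactly the open question in this thread. The counts depend on the Unicode version bundled with the interpreter, so they will not match the figures quoted above.

```python
import sys
import unicodedata
from collections import Counter

# General categories that the Unicode database classifies as letters.
LETTER_CATEGORIES = ("Lu", "Ll", "Lt", "Lm", "Lo")


def count_letter_categories():
    """Count code points per letter category (hypothetical helper)."""
    counts = Counter()
    for cp in range(sys.maxunicode + 1):
        cat = unicodedata.category(chr(cp))
        if cat in LETTER_CATEGORIES:
            counts[cat] += 1
    return counts


def is_alpha(ch):
    """Table-driven 'alpha' test: any letter category (hypothetical helper)."""
    return unicodedata.category(ch) in LETTER_CATEGORIES


def is_alnum(ch):
    """'alnum' test; counting only 'Nd' as numeric is an assumption."""
    return is_alpha(ch) or unicodedata.category(ch) == "Nd"


if __name__ == "__main__":
    # Print how many code points fall into each letter category.
    for cat, n in sorted(count_letter_categories().items()):
        print(cat, n)
```

The point of the count is only to show how much larger the 'Lo' set is than the cased sets, which is what makes the extra tables expensive.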