sre is broken in SuSE 9.2

Serge Orlov Serge.Orlov at gmail.com
Sat Feb 12 03:17:08 EST 2005


Fredrik Lundh wrote:
> Serge Orlov wrote:
>
>>>>>> re.compile(ur'\w+', re.U).findall(u'\xb5\xba\xe4\u0430')
>>>>>> [u'\xb5\xba\xe4\u0430']
>>
>> I can't find the strict definition of isalpha, but I believe average
>> C program shouldn't care about the current locale alphabet, so
>> isalpha is a union of all supported characters in all alphabets
>
> nope.  isalpha() depends on the locale, as does all other ctype
> functions (this also applies to wctype, on some platforms).

I mean "all supported characters in all alphabets [in the current
locale]". For example, in ru_RU.koi8-r isalpha should return
true for characters in the English and Russian alphabets; in
ru_RU.koi8-u, for characters in the English, Russian, and Ukrainian
alphabets; in ru_RU.utf-8, for all alphabetic characters in Unicode
that the implementation supports. IMHO iswalpha in the POSIX
locale could return true for all alphabetic characters in Unicode
instead of being limited to the English alphabet.
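
To illustrate the charset dependence, here is a rough sketch in Python 3.
It does not call the C library's isalpha; it only models the idea by
decoding each byte with Python's own koi8_r and koi8_u codecs and asking
whether the resulting character is alphabetic in Unicode. The helper name
alphabetic_bytes is made up for this example.

```python
def alphabetic_bytes(encoding):
    """Return the set of byte values that decode to an alphabetic character.

    Hypothetical helper: it approximates "isalpha in a locale that uses
    this charset" with Python's codec tables and str.isalpha().
    """
    result = set()
    for value in range(256):
        try:
            char = bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            # Byte is not assigned in this charset; skip it.
            continue
        if char.isalpha():
            result.add(value)
    return result


koi8_r = alphabetic_bytes("koi8_r")
koi8_u = alphabetic_bytes("koi8_u")

# Bytes that are letters only under KOI8-U (the extra Ukrainian letters).
ukrainian_only = sorted(koi8_u - koi8_r)
print([hex(b) for b in ukrainian_only])
```

Under these assumptions, byte 0xC1 (Cyrillic "а") is alphabetic in both
charsets, while byte 0xA4 is a letter only in KOI8-U, where it is the
Ukrainian "є".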

  Serge.

More information about the Python-list mailing list