[issue10567] Unicode space character \u200b unrecognised as a space

Martin v. Löwis report at bugs.python.org
Sun Nov 28 20:31:57 CET 2010

Martin v. Löwis <martin at v.loewis.de> added the comment:

>> In 2.6, there was a manually maintained list, probably dating back to before Unicode 4.0. 
> That's not quite correct: Python 1.6.x - 2.5.x used tables for the
> PyUnicode_ISSPACE() function that were created from the Unicode database.

That used to be the case until r39757, when you made this change:

r39757 | lemburg | 2005-10-20 21:06:35 +0200 (Thu, 20 Oct 2005) | 7 lines
Changed paths:
   M /python/trunk/Objects/unicodectype.c

Enhance the performance of two important Unicode character
type lookups: whitespace and linebreak.

These lookup tables are from the Python 1.6 version with the addition
of the 205F code point which was added as whitespace code point to
Unicode since then.


In 2.5 and 2.6, there was no longer a table lookup, but a switch
statement. I'm not sure how you arrived at that code; the commit message
doesn't say (but its wording suggests the list was computed manually).
The switch statement was not updated in 2.6.
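
For comparison, here is a minimal sketch of a whitespace test that is
derived from the Unicode database rather than maintained by hand. It uses
the rule the Python 3 documentation gives for str.isspace() (general
category Zs, or bidirectional class WS, B, or S); that rule is an
assumption here and is not necessarily what the 2.5/2.6 switch statement
encoded. The name is_space_from_db is made up for illustration.

```python
import unicodedata


def is_space_from_db(ch: str) -> bool:
    """Whitespace test computed from the bundled Unicode database.

    Hypothetical helper: applies the rule documented for str.isspace()
    in Python 3, not the hand-written switch discussed above.
    """
    return (
        unicodedata.category(ch) == "Zs"
        or unicodedata.bidirectional(ch) in ("WS", "B", "S")
    )


# Inspect the code points mentioned in this issue.
for cp in (0x0020, 0x200B, 0x205F):
    ch = chr(cp)
    print(
        f"U+{cp:04X}",
        unicodedata.name(ch, "?"),
        unicodedata.category(ch),
        unicodedata.bidirectional(ch),
        is_space_from_db(ch),
    )
```

Because the answer is computed from the database, it follows whatever
Unicode version the interpreter ships with, which is the property a
hand-maintained list lacks.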


Python tracker <report at bugs.python.org>
