[issue14258] Better explain re.LOCALE and re.UNICODE for \S and \W
Senthil Kumaran
report at bugs.python.org
Fri Apr 6 07:38:26 CEST 2012
Senthil Kumaran <senthil at uthcode.com> added the comment:
Well, I would like to correct this further and add a clarification based on the current implementation (_sre.c).
The definition of the LOCALE space category is this:
#define SRE_LOC_IS_SPACE(ch) (!((ch) & ~255) ? isspace((ch)) : 0)
And the definition of the NON_SPACE category is the negation of space. That's it.
Now, given that definition, we see that for character values higher than 255 the check is not made at all and the macro simply returns 0. Only the simple ASCII isspace is considered when the LOCALE flag is set. In effect, the re.LOCALE flag has no extra effect on the matching of whitespace or non-whitespace characters.
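To make the behaviour of the macro concrete, here is a rough Python model of it. This is only a sketch, not the actual _sre.c code: the helper names (c_isspace, sre_loc_is_space, sre_loc_is_not_space) are made up for illustration, and c_isspace assumes the default "C" locale, where isspace() accepts only the six ASCII whitespace characters. The real isspace() depends on the C library and the active locale.

```python
# Hypothetical stand-in for C isspace() in the default "C" locale.
# Contains the code points of space, \t, \n, \v, \f and \r.
C_LOCALE_SPACES = frozenset(b" \t\n\v\f\r")


def c_isspace(ch: int) -> bool:
    """Assumed model of isspace(); not the real libc function."""
    return ch in C_LOCALE_SPACES


def sre_loc_is_space(ch: int) -> bool:
    """Model of: (!((ch) & ~255) ? isspace((ch)) : 0)"""
    if ch & ~255:
        # Any bit above the low 8 is set, so ch > 255:
        # isspace() is never consulted and the result is 0.
        return False
    return c_isspace(ch)


def sre_loc_is_not_space(ch: int) -> bool:
    """NON_SPACE is just the negation of the space category."""
    return not sre_loc_is_space(ch)


if __name__ == "__main__":
    for ch in (ord(" "), ord("a"), 0x3000):  # 0x3000: IDEOGRAPHIC SPACE
        print(hex(ch), sre_loc_is_space(ch), sre_loc_is_not_space(ch))
```

In this model U+3000 is never treated as a space, whatever the locale, because the value is above 255 and the isspace() branch is skipped.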
After realizing this, I propose the following changes attached in the patch as a documentation fix.
----------
keywords: +patch
resolution: fixed ->
status: closed -> open
Added file: http://bugs.python.org/file25138/issue14258.diff
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14258>
_______________________________________