[docs] [issue18779] Misleading documentations and comments in regular expression HOWTO
report at bugs.python.org
Mon Aug 19 12:19:36 CEST 2013
Vajrasky Kok added the comment:
In Lib/re.py, starting from line 77 (Python 3.4):
\w Matches any alphanumeric character; equivalent to [a-zA-Z0-9_]
in bytes patterns or string patterns with the ASCII flag.
In string patterns without the ASCII flag, it will match the
range of Unicode alphanumeric characters (letters plus digits
plus underscore).
With LOCALE, it will match the set [0-9_] plus characters defined
as letters for the current locale.
The prelude is "Matches any alphanumeric character;".
Yet in every case (bytes patterns, string patterns with the ASCII flag, string patterns without the ASCII flag, and patterns with LOCALE), the underscore is included.
Then why don't we change the prelude to "Matches any alphanumeric character and underscore character;"? The description already explains that "alphanumeric" means [A-Za-z0-9] or a wider set, depending on whether the pattern is Unicode.
The description is already okay, but the prelude misleads readers.
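
The behaviour described above can be checked with a short script. This is only a sketch, assuming Python 3.6 or later, where LOCALE is accepted only with bytes patterns; the variable names are illustrative and not taken from re.py.

```python
import re

# Does \w match a lone underscore in each mode the docstring describes?
underscore_matches = {
    "bytes pattern": bool(re.fullmatch(rb"\w", b"_")),
    "str pattern with ASCII flag": bool(re.fullmatch(r"\w", "_", re.ASCII)),
    "str pattern without ASCII flag": bool(re.fullmatch(r"\w", "_")),
    # Assumption: Python 3.6+, so LOCALE is combined with a bytes pattern.
    "bytes pattern with LOCALE flag": bool(re.fullmatch(rb"\w", b"_", re.LOCALE)),
}

# The "alphanumeric" part does vary: a non-ASCII letter matches only
# when the ASCII flag is absent.
accented_without_ascii = bool(re.fullmatch(r"\w", "\u00e9"))
accented_with_ascii = bool(re.fullmatch(r"\w", "\u00e9", re.ASCII))

for mode, matched in underscore_matches.items():
    print(mode, "->", matched)
print("accented letter, no ASCII flag ->", accented_without_ascii)
print("accented letter, ASCII flag ->", accented_with_ascii)
```

The underscore matches in all four modes, while the accented letter matches only without the ASCII flag, which is why the prelude should mention the underscore explicitly.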
Python tracker <report at bugs.python.org>