[docs] copy&waste problem

Senthil Kumaran senthil at uthcode.com
Fri Mar 9 09:18:00 CET 2012


Hello Hauke,

Yeah, it was pretty confusing. Thanks for catching this. How does this
change sound?

-   When the :const:`LOCALE` and :const:`UNICODE` flags are not specified, matches
-   any non-whitespace character; this is equivalent to the set ``[^ \t\n\r\f\v]``
-   With :const:`LOCALE`, it will match any character not in this set, and not
-   defined as space in the current locale. If :const:`UNICODE` is set, this will
-   match anything other than ``[ \t\n\r\f\v]`` and characters marked as space in
-   the Unicode character properties database.
+   When the :const:`LOCALE` and :const:`UNICODE` flags are not specified,
+   matches any non-whitespace character; this is equivalent to the set ``[^
+   \t\n\r\f\v]``. With :const:`LOCALE`, it will match the above set and any
+   non-space character in the current locale. If :const:`UNICODE` is set, it
+   will match the above set ``[^ \t\n\r\f\v]`` and characters not marked as
+   space in the Unicode character properties database.

 ``\w``
    When the :const:`LOCALE` and :const:`UNICODE` flags are not specified, matches
@@ -381,8 +381,8 @@
    any non-alphanumeric character; this is equivalent to the set ``[^a-zA-Z0-9_]``.
    With :const:`LOCALE`, it will match any character not in the set ``[0-9_]``, and
    not defined as alphanumeric for the current locale. If :const:`UNICODE` is set,
-   this will match anything other than ``[0-9_]`` and characters marked as
-   alphanumeric in the Unicode character properties database.
+   this will match anything other than ``[0-9_]`` plus characters classified as
+   not alphanumeric in the Unicode character properties database.


Hope the rewrite is less confusing.
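
To make the intended meaning of \S and \W concrete, here is a minimal
sketch. It assumes Python 3, where str patterns use Unicode matching by
default and re.ASCII restricts the classes to the ASCII sets quoted above
(on 2.x you would pass re.UNICODE explicitly instead). The sample strings
and variable names are only illustrative.

```python
import re

# U+00A0 (NO-BREAK SPACE) is whitespace per Unicode but is not in [ \t\n\r\f\v].
spaces_sample = "a\u00a0b c"

# With re.ASCII, \S is exactly [^ \t\n\r\f\v], so U+00A0 counts as non-space.
ascii_non_space = re.findall(r"\S", spaces_sample, re.ASCII)

# With Unicode matching, \S additionally excludes Unicode whitespace.
unicode_non_space = re.findall(r"\S", spaces_sample)

# U+00E9 is alphanumeric per Unicode but is not in [a-zA-Z0-9_].
words_sample = "\u00e9_1!"

# With re.ASCII, \W is exactly [^a-zA-Z0-9_], so U+00E9 counts as non-word.
ascii_non_word = re.findall(r"\W", words_sample, re.ASCII)

# With Unicode matching, \W additionally excludes Unicode alphanumerics.
unicode_non_word = re.findall(r"\W", words_sample)

print(ascii_non_space)    # ['a', '\xa0', 'b', 'c']
print(unicode_non_space)  # ['a', 'b', 'c']
print(ascii_non_word)     # ['\xe9', '!']
print(unicode_non_word)   # ['!']
```

In both cases the Unicode variant matches fewer characters than the ASCII
one, since more characters count as space or as alphanumeric.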

We can also include this sentence somewhere.

If both re.LOCALE and re.UNICODE are specified together, re.LOCALE would be
applied first and then re.UNICODE.


-- 
Senthil


On Wed, Mar 7, 2012 at 11:52 AM, Hauke Rehr <homo_laber at yahoo.de> wrote:
> Hello,
>
> I found a bug on library/re.html
>  -> just before subsection 7.2.2
>
> talking about character classes, the descriptions of the complements of \s
> and \w - that is: \S and \W - should read
>
> […] With LOCALE, it will match any character in this set not defined as
> space in the current locale. If UNICODE is set, this will match anything
> other than [ \t\n\r\f\v] not marked as space in the Unicode character
> properties database.
>
> instead of
>
> […] With LOCALE, it will match any character not in this set, and not
> defined as space in the current locale. If UNICODE is set, this will match
> anything other than [ \t\n\r\f\v] and characters marked as space in the
> Unicode character properties database.
>
>
> the same holds for \W, but there’s less to fix; I won’t repeat the current
> wording this time, but only provide a correction:
>
> […] If UNICODE is set, this will match anything other than [0-9_] not marked
> as alphanumeric in the Unicode character properties database.
>
>
> … and while we’re at it: add a sentence about what happens if both flags are
> given (I guess they don’t interfere badly but both apply as expected)
>
>
> Hauke Rehr
> (Jena, Germany)
> _______________________________________________
> docs mailing list
> docs at python.org
> http://mail.python.org/mailman/listinfo/docs

