Why is it different about '\s' Matches whitespace and Equivalent to [\t\n\r\f]?
Ned Batchelder
ned at nedbatchelder.com
Thu Jul 10 10:04:40 EDT 2014
On 7/10/14 9:32 AM, fl wrote:
> On Thursday, July 10, 2014 7:18:01 AM UTC-4, MRAB wrote:
>> On 2014-07-10 11:05, rx at gmail.com wrote:
>> It's equivalent to [ \t\n\r\f], i.e. it also includes a space, so
>> either the tutorial is wrong, or you didn't look closely enough. :-)
>>
>> The string starts with ' ', not '\t'.
>>
>> The string starts with ' ', which isn't in the character set.
>>
> The '\s' description is on link:
>
> http://www.tutorialspoint.com/python/python_reg_expressions.htm
>
For some reason, that page shows much of its information twice. The
first occurrence of \s there is:

    \s Matches whitespace. Equivalent to [\t\n\r\f].

The second is:

    \s Match a whitespace character: [ \t\r\n\f]

The second one is correct. The first is wrong. You might want to send
the author a bug report.

Actually, neither is strictly correct, since as the official docs
(https://docs.python.org/2/library/re.html) say,

    \s When the UNICODE flag is not specified, it matches any
    whitespace character, this is equivalent to the set [ \t\n\r\f\v].
    The LOCALE flag has no extra effect on matching of the space. If
    UNICODE is set, this will match the characters [ \t\n\r\f\v] plus
    whatever is classified as space in the Unicode character properties
    database.
>
> Could you give me an example to use the equivalent pattern?
>
> Thanks
>
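As for an example of the equivalent pattern, here is a minimal sketch.
It assumes Python 3, where str patterns are Unicode-aware by default, so
re.ASCII is passed to get the non-Unicode behavior that the Python 2
docs quoted above describe. The names tutorial_set, docs_set,
backslash_s, and matched_chars are made up for illustration.

```python
import re

# The tutorial's first character set: no space and no \v.
tutorial_set = re.compile(r"[\t\n\r\f]")
# The set the official docs give for the non-Unicode case.
docs_set = re.compile(r"[ \t\n\r\f\v]")
# \s restricted to ASCII whitespace (assumption: Python 3 with re.ASCII).
backslash_s = re.compile(r"\s", re.ASCII)


def matched_chars(pattern, candidates):
    """Return the candidate characters that the pattern matches."""
    return [c for c in candidates if pattern.match(c)]


candidates = [" ", "\t", "\n", "\r", "\f", "\v", "a"]

# \s and the docs' explicit set accept exactly the same characters.
print(matched_chars(backslash_s, candidates) == matched_chars(docs_set, candidates))  # True

# A string that starts with a space, as in the original question.
s = " hello"
print(bool(tutorial_set.match(s)))  # False: ' ' is not in [\t\n\r\f]
print(bool(docs_set.match(s)))      # True
print(bool(backslash_s.match(s)))   # True
```

Without re.ASCII, Python 3 also lets \s match other Unicode whitespace,
so the explicit set is only equivalent in the ASCII-only case.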
--
Ned Batchelder, http://nedbatchelder.com