[issue5240] time.strptime fails to match data and format with Unicode whitespaces (Py3)

Ezio Melotti report at bugs.python.org
Fri Feb 13 14:16:18 CET 2009


Ezio Melotti <ezio.melotti at gmail.com> added the comment:

I think you have found the problem: strptime probably uses \s with the
re.ASCII flag, so it fails to match the Unicode whitespace characters:
>>> l
['\x1c', '\x1d', '\x1e', '\x1f', '%', '\x85', '\xa0', '\u1680',
'\u2000', '\u2001', '\u2002', '\u2003', '\u2004', '\u2005', '\u2006',
'\u2007', '\u2008', '\u2009', '\u200a', '\u200b', '\u2028', '\u2029',
'\u202f', '\u205f', '\u3000']
>>> [bool(re.match(r'^\s$', char, re.ASCII)) for char in l]
[False, False, False, False, False, False, False, False, False, False,
False, False, False, False, False, False, False, False, False, False,
False, False, False, False, False]
>>> [bool(re.match(r'^\s$', char)) for char in l]
[True, True, True, True, False, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True, True, True, True,
True, True]
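
The same comparison can be reproduced as a standalone script on a
smaller sample. This is only a sketch: the names `samples`,
`ascii_only` and `unicode_aware` are made up for illustration, and it
shows the behaviour of `\s` in the re module, not what strptime does
internally.

```python
import re

# Hypothetical sample: an ASCII space, two Unicode whitespace characters
# (NO-BREAK SPACE and EM SPACE), and '%' as a non-whitespace control.
samples = [' ', '\xa0', '\u2003', '%']

# With re.ASCII, \s only matches the ASCII whitespace set [ \t\n\r\f\v].
ascii_only = [bool(re.match(r'^\s$', ch, re.ASCII)) for ch in samples]

# Without the flag, \s on a str pattern also matches Unicode whitespace.
unicode_aware = [bool(re.match(r'^\s$', ch)) for ch in samples]

print(ascii_only)     # [True, False, False, False]
print(unicode_aware)  # [True, True, True, False]
```

Only the plain space matches under re.ASCII, while the default str
matching also accepts the two Unicode whitespace characters.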

This bug is therefore related to #5239, and the proposed fix should work
for both. We can close this as a duplicate and include this problem in
#5239.

Good work!

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue5240>
_______________________________________

