[New-bugs-announce] [issue36502] The behavior of str.isspace() for U+00A0 and U+202F is different from what is documented

Jun report at bugs.python.org
Tue Apr 2 02:36:07 EDT 2019


New submission from Jun <jkuroda.isa at gmail.com>:

I was looking for a list of the Unicode code points for which str.isspace() returns True.

The documentation at https://docs.python.org/3/library/stdtypes.html#str.isspace says:
"Whitespace characters are those characters defined in the Unicode character database as “Other” or “Separator” and those with bidirectional property being one of “WS”, “B”, or “S”."

However, for U+202F (https://www.fileformat.info/info/unicode/char/202f/index.htm), which is a "Separator" whose bidirectional property is "CS", str.isspace() returns True, although it should not under the definition above.

>>> "\u202f".isspace()
True
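
For reference, the properties involved can be read with the standard unicodedata module. This is a minimal sketch; the helper name describe() is only illustrative, and the values printed depend on the Unicode database bundled with the interpreter in use.

```python
import unicodedata


def describe(ch):
    """Return (general category, bidirectional class, isspace result) for one character."""
    return unicodedata.category(ch), unicodedata.bidirectional(ch), ch.isspace()


# The two code points named in the issue title.
for ch in ("\u00a0", "\u202f"):
    category, bidi, is_space = describe(ch)
    print(f"U+{ord(ch):04X} category={category} bidi={bidi} isspace={is_space}")
```

Both characters report a bidirectional class outside "WS", "B", and "S", yet isspace() is True for both.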

I'm not sure whether the documentation or the behavior should be updated, but the two should at least be consistent.
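
To see how far the two diverge, one possible check is to scan every code point and collect those for which str.isspace() is True while the bidirectional class is not one of the three documented values. This is only a sketch: it assumes the bidirectional clause of the documented sentence is the one that matters, and the names DOCUMENTED_BIDI and whitespace_outside_documented_bidi() are made up for illustration. The result may vary with the Unicode version of the interpreter.

```python
import sys
import unicodedata

# Bidirectional classes named in the str.isspace() documentation.
DOCUMENTED_BIDI = {"WS", "B", "S"}


def whitespace_outside_documented_bidi():
    """Collect code points where isspace() is True but the bidi class is not documented."""
    found = []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        if ch.isspace() and unicodedata.bidirectional(ch) not in DOCUMENTED_BIDI:
            found.append(cp)
    return found


mismatches = whitespace_outside_documented_bidi()
for cp in mismatches:
    ch = chr(cp)
    # Fall back to a placeholder when the code point has no name in the database.
    name = unicodedata.name(ch, "<unnamed>")
    print(f"U+{cp:04X} {unicodedata.category(ch)} {unicodedata.bidirectional(ch)} {name}")
```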

----------
assignee: docs at python
components: Documentation, Unicode
messages: 339317
nosy: Jun, docs at python, ezio.melotti, vstinner
priority: normal
severity: normal
status: open
title: The behavior of str.isspace() for U+00A0 and U+202F is different from what is documented
type: behavior
versions: Python 2.7, Python 3.5, Python 3.6

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue36502>
_______________________________________

