
On Wed, Oct 23, 2019 at 6:04 PM David Mertz <mertz@gnosis.cx> wrote:
Are these the same code points identified by `str.isspace`?
I haven't checked -- so I will: and the answer is no:

$ python weird_spaces.py
x x x xx x x x x x x x x x x xx x x xx
['x', 'x', 'x', 'x\u180ex', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x\u200bx', 'x', 'x', 'x\ufeffx']
41
18
[False, True, False, True, False, True, False, False, False, True, False, True, False, True, False, True, False, True, False, True, False, True, False, True, False, True, False, True, False, True, False, False, False, True, False, True, False, True, False, False, False]

There are only three that didn't split, but many more than three that failed .isspace.

Thanks for doing that. I would have soon otherwise. Still, "most of them" isn't actually a precise answer for an uncertain string. :-)

nope. But it could be defined somewhere, and presumably is, though maybe not consistently.

-CHB

On Wed, Oct 23, 2019, 8:57 PM Christopher Barker <pythonchb@gmail.com> wrote:
On Wed, Oct 23, 2019 at 5:53 PM Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
To be fair, I don't know which of those str.split() splits on when called with no arguments, either.
I couldn't resist -- the answer is most of them:
#!/usr/bin/env python

weird_spaces = ("x\u0020x\u00A0x\u1680x\u180Ex\u2000x\u2001x\u2002"
                "x\u2003x\u2004x\u2005x\u2006x\u2007x\u2008x\u2009"
                "x\u200Ax\u200Bx\u202Fx\u205Fx\u3000x\uFEFFx")

print(weird_spaces)
splitted = weird_spaces.split()
print(splitted)
print(len(weird_spaces))
print(len(splitted))

$ python weird_spaces.py
x x x xx x x x x x x x x x x xx x x xx
['x', 'x', 'x', 'x\u180ex', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x\u200bx', 'x', 'x', 'x\ufeffx']
41
18
-CHB
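
A possible way to make the split/isspace comparison more direct is to test each candidate character on its own, so the 'x' filler (which is never whitespace) does not show up in the results. The sketch below is not part of the original script: the names CANDIDATES and classify are made up for illustration, and the candidate list is just the 20 separators from weird_spaces.py. It also prints the Unicode general category via unicodedata.category, which may help explain why a given character is or is not treated as whitespace. Results depend on the Unicode database bundled with the interpreter, so treat any particular outcome as version-specific.

```python
import unicodedata

# The 20 separator characters from weird_spaces.py, without the 'x' filler.
# (CANDIDATES and classify are illustrative names, not from the thread.)
CANDIDATES = (
    "\u0020\u00A0\u1680\u180E\u2000\u2001\u2002"
    "\u2003\u2004\u2005\u2006\u2007\u2008\u2009"
    "\u200A\u200B\u202F\u205F\u3000\uFEFF"
)


def classify(chars):
    """Return (code point, splits?, isspace?, category) for each character."""
    rows = []
    for ch in chars:
        # A character "splits" if str.split() with no arguments breaks on it.
        splits = len(f"x{ch}x".split()) == 2
        rows.append((f"U+{ord(ch):04X}", splits, ch.isspace(),
                     unicodedata.category(ch)))
    return rows


rows = classify(CANDIDATES)
no_split = [cp for cp, splits, _, _ in rows if not splits]
not_space = [cp for cp, _, space, _ in rows if not space]

for row in rows:
    print(*row)
print("did not split:", no_split)
print("not isspace():", not_space)
```

If the two lists at the end come out identical, that would suggest the extra False values in the 41-element list above are simply the 'x' characters, and that str.split() and str.isspace() agree on these code points after all.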
-- Christopher Barker, PhD
Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython