> Are these the same code points identified by `str.isspace`?
I hadn't checked, so I did. Here is the output again, with each character's `isspace()` result added as the last line:
$ python weird_spaces.py
x x x xx x x x x x x x x x x xx x x xx
['x', 'x', 'x', 'x\u180ex', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x\u200bx', 'x', 'x', 'x\ufeffx']
41
18
[False, True, False, True, False, True, False, False, False, True, False, True, False, True, False, True, False, True, False, True, False, True, False, True, False, True, False, True, False, True, False, False, False, True, False, True, False, True, False, False, False]
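Only three didn't split: U+180E, U+200B and U+FEFF. It looks like many more than three failed .isspace, but on a closer look every other False in that list is just one of the 'x' characters. Among the 20 candidates themselves, the same three are the only ones that fail, so the two sets do match here.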
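To compare the two per code point rather than by eye, something like the following should do. It is only a sketch: it drops the 'x' padding, tests each candidate on its own, and lists the ones that fail each test. The names `candidates` and `splits` are mine, not from the script in this thread.

```python
# Same 20 candidate code points as in weird_spaces.py, without the "x" padding.
candidates = (
    "\u0020\u00A0\u1680\u180E\u2000\u2001\u2002"
    "\u2003\u2004\u2005\u2006\u2007\u2008\u2009"
    "\u200A\u200B\u202F\u205F\u3000\uFEFF"
)


def splits(ch):
    """Return True if str.split() with no arguments treats ch as a separator."""
    # Hypothetical helper: "x<ch>x" splits into two pieces only if ch separates.
    return len(f"x{ch}x".split()) == 2


# Code points that each test rejects, formatted as U+XXXX for readability.
not_space = [f"U+{ord(ch):04X}" for ch in candidates if not ch.isspace()]
not_split = [f"U+{ord(ch):04X}" for ch in candidates if not splits(ch)]

print("fail isspace():", not_space)
print("fail split():  ", not_split)
print("same set:", not_space == not_split)
```

Testing one character at a time keeps the 'x' entries out of the result, so the two lists can be compared directly.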
> Thanks for doing that. I would have done it soon otherwise. Still, "most of them" isn't actually a precise answer for an uncertain string. :-)
nope.
But it could be defined somewhere, and presumably is, though maybe not consistently.
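For what it's worth, the documentation for `str.isspace` does spell out a definition: a character counts as whitespace if its Unicode general category is Zs, or its bidirectional class is one of WS, B, or S. As far as I know, CPython applies the same test when `str.split()` is called with no arguments, but I have not checked other implementations. Here is a rough sketch that prints those properties for each candidate using the standard `unicodedata` module. The helper `documented_whitespace` is my own name for the documented rule, not a library function.

```python
import unicodedata

# Same 20 candidate code points as in weird_spaces.py, without the "x" padding.
candidates = (
    "\u0020\u00A0\u1680\u180E\u2000\u2001\u2002"
    "\u2003\u2004\u2005\u2006\u2007\u2008\u2009"
    "\u200A\u200B\u202F\u205F\u3000\uFEFF"
)


def documented_whitespace(ch):
    """Hypothetical helper: the whitespace rule as worded in the str.isspace docs."""
    return (unicodedata.category(ch) == "Zs"
            or unicodedata.bidirectional(ch) in ("WS", "B", "S"))


for ch in candidates:
    print(
        f"U+{ord(ch):04X}",
        unicodedata.category(ch),              # general category, e.g. Zs or Cf
        unicodedata.bidirectional(ch) or "-",  # bidirectional class, "-" if none
        ch.isspace(),
        unicodedata.name(ch, "<unnamed>"),
    )
```

The results depend on the Unicode database bundled with the interpreter, so a code point that was reclassified between Unicode versions can give different answers on different Python versions.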
-CHB
> To be fair, I also don't know which of those split on str.split() with no arguments to the method either.
I couldn't resist -- the answer is most of them:
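#!/usr/bin/env python
# Twenty space-like code points, each surrounded by an "x" so splits are visible.
weird_spaces = ("x\u0020x\u00A0x\u1680x\u180Ex\u2000x\u2001x\u2002"
"x\u2003x\u2004x\u2005x\u2006x\u2007x\u2008x\u2009"
"x\u200Ax\u200Bx\u202Fx\u205Fx\u3000x\uFEFFx")
print(weird_spaces)
# split() with no arguments splits on runs of whitespace.
splitted = weird_spaces.split()
print(splitted)
# 41 characters in total: 21 "x" plus the 20 candidates.
print(len(weird_spaces))
# 21 pieces would mean every candidate acted as a separator.
print(len(splitted))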
$ python weird_spaces.py
x x x xx x x x x x x x x x x xx x x xx
['x', 'x', 'x', 'x\u180ex', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x\u200bx', 'x', 'x', 'x\ufeffx']
41
18
-CHB
--
Christopher Barker, PhD