
On Oct 23, 2019, at 16:26, David Mertz <mertz@gnosis.cx> wrote:
To be fair, I also don't know which of those str.split() splits on when called with no arguments.
I would assume the rule is the same one used by str.isspace, and that this rule is either the simple one (category is Zs) or the full one (category is Zs or bidi class is one of the handful of bidi space classes), taken from the same version of Unicode that the unicodedata module handles. In fact, it’s more than an assumption: if it isn’t true, I’d expect to find a good rationale in the docs; otherwise it’s probably a bug in the str class. You can’t document something as a method of Unicode strings that splits on “whitespace” and then use anything other than a Unicode definition of whitespace without a good reason.
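
For anyone who wants to check this rather than take my word for it, here is a rough sketch. It assumes the "full" rule is "category is Zs, or bidi class is one of WS, B, S" (that is how I read the str.isspace docs), and it compares that rule against str.isspace for every code point using the unicodedata module. The helper names (matches_documented_rule, find_mismatches, splits_on) are made up for this example, not anything in the stdlib.

```python
import sys
import unicodedata

# Assumption: these are the bidi classes that count as whitespace
# under the "full" rule, per my reading of the str.isspace docs.
BIDI_SPACE_CLASSES = {"WS", "B", "S"}


def matches_documented_rule(ch):
    """True if ch is whitespace under the assumed 'full' rule."""
    return (unicodedata.category(ch) == "Zs"
            or unicodedata.bidirectional(ch) in BIDI_SPACE_CLASSES)


def find_mismatches():
    """Code points where str.isspace disagrees with the assumed rule."""
    return [cp for cp in range(sys.maxunicode + 1)
            if chr(cp).isspace() != matches_documented_rule(chr(cp))]


def splits_on(ch):
    """True if str.split() with no arguments treats ch as a separator."""
    return ("a" + ch + "b").split() == ["a", "b"]


if __name__ == "__main__":
    print("Unicode version:", unicodedata.unidata_version)
    print("mismatches:", [hex(cp) for cp in find_mismatches()])
    # A few characters worth looking at individually.
    for ch in (" ", "\u00a0", "\u2003", "\x1f", "\u200b"):
        print(hex(ord(ch)), ch.isspace(), splits_on(ch))
```

If the list of mismatches comes back non-empty on your build, that would be exactly the kind of thing I'd expect to see explained in the docs or filed as a bug.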