denis.spir@...> writes:
> I would just add: The SPACE can be either U+0020 (standard space) or U+00A0 (non-breakable
> space).
Then the proposal should allow for any kind of space character (that is, any character for which isspace() returns True). The Unicode character set has several non-breaking space characters of varying widths, which matters for typography rules. See http://en.wikipedia.org/wiki/Non-breaking_space for some examples.
Regards
Antoine (playing devil's advocate a bit - but only a bit).
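
To make the isspace() criterion concrete, here is a minimal sketch that checks a few candidate characters. The helper name space_report and the particular code points chosen are assumptions made for illustration; they are not part of the proposal.

```python
import unicodedata

# Candidate separators: the two from the proposal, plus two other
# non-breaking spaces of different widths (illustrative choices only).
CANDIDATES = ["\u0020", "\u00a0", "\u2007", "\u202f"]


def space_report(chars):
    """Return (code point, Unicode name, isspace() result) for each character."""
    return [
        ("U+%04X" % ord(ch), unicodedata.name(ch), ch.isspace())
        for ch in chars
    ]


for code_point, name, is_space in space_report(CANDIDATES):
    print(code_point, name, is_space)
```

All four characters are in Unicode category Zs, so str.isspace() accepts each of them in Python 3.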
Keeping in mind the needs of people writing parsers, I don't think it's a good idea to expand this set. Already, we're not supporting all possible separators whether they be spaces or not. Given just U+0020 and U+00A0, a person can easily do a str.replace() to get to anything else.

Raymond
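
The str.replace() approach might look roughly like the sketch below. It assumes the "," grouping option of the format mini-language as it exists in current Python, not the space separator under discussion, and the helper names group_digits and swap_separator are hypothetical.

```python
NBSP = "\u00a0"         # NO-BREAK SPACE
NARROW_NBSP = "\u202f"  # NARROW NO-BREAK SPACE


def group_digits(n, sep=NBSP):
    """Group an integer by thousands, then swap in the wanted separator.

    Hypothetical helper: relies on the "," format option and str.replace().
    """
    return format(n, ",d").replace(",", sep)


def swap_separator(text, old=NBSP, new=NARROW_NBSP):
    """Replace one separator with another in an already formatted number."""
    return text.replace(old, new)


formatted = group_digits(1234567)
print(repr(formatted))
print(repr(swap_separator(formatted)))
```

The formatter only needs to emit one or two fixed separators, so a parser has a small set of characters to recognise, and callers with other typographic needs post-process the string.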