[docs] [issue22232] str.splitlines splitting on non-\r\n characters

Terry J. Reedy report at bugs.python.org
Sun Aug 24 01:12:15 CEST 2014


Terry J. Reedy added the comment:

Glossary fixed. I changed the components to Documentation, as you will handle email elsewhere.

For library references: The key sentence currently used in all entries is "This method uses the universal newlines approach to splitting lines.", where *universal newlines* is linked to the glossary.

2.x has one entry for str and unicode. I propose to add "unicode.splitlines also splits on '\x0b' ('\v'), '\x0c' ('\f'), '\x1c', '\x1d', '\x1e', '\x85', '\u2028', and '\u2029'."

3.x bytes entry is good as is.
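
A quick way to check the bytes behavior, assuming a Python 3 interpreter (the sample data and variable names are only illustrative):

```python
# bytes.splitlines should break only at b'\n', b'\r', and b'\r\n'.
# The other control bytes below are expected to stay inside the last line.
data = b'a\nb\rc\r\nd\x0be\x0cf\x1cg\x85h'

lines = data.splitlines()
print(lines)
```

If the bytes entry is accurate, the last element still contains the '\x0b', '\x0c', '\x1c', and '\x85' bytes unsplit.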

3.x str entry is wrong. Replace with "This method splits on universal newlines and also on '\x0b' ('\v'), '\x0c' ('\f'), '\x1c', '\x1d', '\x1e', '\x85', '\u2028', and '\u2029'."
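
The proposed str wording can be checked with a short sketch like the following, assuming a current Python 3 interpreter (the list names are mine, not from the docs):

```python
# Universal newlines, plus the extra boundaries listed in the proposed text.
universal = ['\n', '\r', '\r\n']
extra = ['\x0b', '\x0c', '\x1c', '\x1d', '\x1e', '\x85', '\u2028', '\u2029']

# Put each candidate separator between 'a' and 'b' and record the split.
results = {sep: ('a' + sep + 'b').splitlines() for sep in universal + extra}

for sep, parts in results.items():
    print(repr(sep), parts)
```

Each separator that str.splitlines treats as a line boundary yields ['a', 'b']; one that it does not would yield a single-element list.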

The docstrings now contain about the same as the docs, minus the key line above.
"   Return a list of the lines in S, breaking at line boundaries.
    Line breaks are not included in the resulting list unless keepends
    is given and true."

Between the sentences, I propose to add:
"Boundaries are indicated by 'universal newlines' ('\x0a' ('\n'), '\x0d' ('\r'), and '\x0d\x0a' ('\r\n'))." for bytes,
 with the addition of "and '\x0b' ('\v'), '\x0c' ('\f'), '\x1c', '\x1d', '\x1e', '\x85', '\u2028', and '\u2029'" for unicode.
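
For the keepends sentence in the docstring, a small illustration may help, again assuming Python 3 and a made-up sample string:

```python
# One string mixing a universal newline, a CRLF pair, and U+2028.
text = 'one\ntwo\r\nthree\u2028four'

# Default: line breaks are dropped from the result.
without_ends = text.splitlines()

# keepends true: each line keeps the boundary that terminated it.
with_ends = text.splitlines(keepends=True)

print(without_ends)
print(with_ends)
```

Note that '\r\n' is kept as one two-character boundary and is not split into two empty-line breaks.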

----------
assignee:  -> docs at python
components: +Documentation -Library (Lib), Unicode, email
nosy: +docs at python
stage:  -> needs patch
versions: +Python 2.7

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue22232>
_______________________________________

