[New-bugs-announce] [issue6664] readlines should understand Line Separator and Paragraph Separator characters
report at bugs.python.org
Fri Aug 7 11:14:15 CEST 2009
New submission from Neil Hodgson <nyamatongwe at users.sourceforge.net>:
Unicode defines two line-ending characters, Line Separator (U+2028) and
Paragraph Separator (U+2029). The readlines method of the file object
returned by the built-in open does not treat these characters as line ends,
although the object returned by codecs.open(..., encoding='utf-8') does.
The attached program creates a UTF-8 file containing three lines, with
the second line ended by a Paragraph Separator. The program then reads
this file back in as a text file, and only two lines are seen.
The desired behaviour is for the file to be read in as three lines.
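
The attached lineends.py is not reproduced in this message. The following is
a minimal sketch of the kind of check described above, not the attached
program itself. The sample text, the helper name read_both_ways, and the use
of a temporary file are assumptions made for illustration. It uses
codecs.getreader in place of codecs.open to obtain the codecs stream reader,
which is also an assumption about how one might make the comparison.

```python
import codecs
import os
import tempfile

# Hypothetical sample: three lines, the second ended by U+2029
# (Paragraph Separator) instead of a newline.
TEXT = "first\nsecond\u2029third\n"


def read_both_ways(text=TEXT):
    """Write text as UTF-8, then read it back with two different readers."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        # newline="" disables newline translation on write, so the bytes
        # on disk are the same on every platform.
        with open(path, "w", encoding="utf-8", newline="") as f:
            f.write(text)

        # Built-in open in text mode.
        with open(path, encoding="utf-8") as f:
            builtin_lines = f.readlines()

        # codecs stream reader wrapped around a binary file.
        with open(path, "rb") as raw:
            codec_lines = codecs.getreader("utf-8")(raw).readlines()
    finally:
        os.remove(path)
    return builtin_lines, codec_lines


builtin_lines, codec_lines = read_both_ways()
print(len(builtin_lines), len(codec_lines))
```

With the behaviour reported here, the built-in reader returns two lines
(the second being "second\u2029third\n") while the codecs reader returns
three.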
title: readlines should understand Line Separator and Paragraph Separator characters
versions: Python 3.1
Added file: http://bugs.python.org/file14671/lineends.py
Python tracker <report at bugs.python.org>