[ python-Bugs-1175396 ] codecs.readline sometimes removes newline chars

SourceForge.net noreply at sourceforge.net
Fri Apr 15 00:04:20 CEST 2005


Bugs item #1175396, was opened at 2005-04-02 17:14
Message generated for change (Comment added) made by mmm
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1175396&group_id=5470

Category: Python Library
Group: Python 2.4
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Irmen de Jong (irmen)
Assigned to: Walter Dörwald (doerwalter)
Summary: codecs.readline sometimes removes newline chars

Initial Comment:
In Python 2.4.1 I observed a new bug in codecs.readline:
it seems that with certain inputs it removes newline
characters from the end of the line.

Probably related to bug #1076985 (Incorrect
behaviour of StreamReader.readline leads to crash)
and bug #1098990 codec readline() splits lines apart
(both with status closed) so I'm assigning this to Walter.

See the attached files that demonstrate the problem.
Reproduced with Python 2.4.1 on windows XP and on
Linux. The problem does not occur with Python 2.4.

(By the way, it seems bug #1076985 was fixed in Python 2.4.1,
but the other one (#1098990) was not?)
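
The attached files are not reproduced in this message. As a rough
illustration of the kind of check involved, the sketch below compares
what StreamReader.readline() returns with str.splitlines(True) on the
same decoded text. It is not the attached test case: the helper names
and the sample data are made up for illustration, and it is written
for a current Python 3 rather than 2.4.1.

```python
import codecs
import io


def readline_all(data, encoding="utf-8"):
    """Collect every line that StreamReader.readline() returns for data."""
    reader = codecs.getreader(encoding)(io.BytesIO(data))
    lines = []
    while True:
        line = reader.readline()  # keepends defaults to True
        if not line:
            break
        lines.append(line)
    return lines


def line_endings_preserved(data, encoding="utf-8"):
    """True if readline() yields the same lines as str.splitlines(True)."""
    expected = data.decode(encoding).splitlines(True)
    return readline_all(data, encoding) == expected


# Hypothetical sample: CRLF lines of assorted lengths, so that some line
# ends fall on or near the boundaries of the reader's internal reads.
sample_crlf = b"".join(b"x" * n + b"\r\n" for n in (71, 10, 200, 0, 72))
print(line_endings_preserved(sample_crlf))
```

A False result on an affected version would point at a line whose
terminator was dropped or split.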

----------------------------------------------------------------------

Comment By: Michal Rydlo (mmm)
Date: 2005-04-15 00:04

Message:
Logged In: YES 
user_id=65460

foo2.py from #1163244 fails to import. Not being an expert in
Python internals, I hope it is due to this bug.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2005-04-11 23:42

Message:
Logged In: YES 
user_id=89016

OK, I'm reopening the bug report. I didn't manage to install
pythondoc. cElementTree complains about: No such file or
directory: './pyconfig.h'. Can you provide a simple Python
file that fails when imported?

----------------------------------------------------------------------

Comment By: Greg Chapman (glchapman)
Date: 2005-04-10 00:47

Message:
Logged In: YES 
user_id=86307

Sorry to comment on a closed report, but perhaps this fix
should not be limited only to cases where size is None. 
Today, I ran into a spurious syntax error when trying to
import pythondoc (from
http://effbot.org/downloads/pythondoc-2.1b3-20050325.zip). 
It turned out a \r was ending up in what looked to the
parser like the middle of a line, presumably because a \n
was dropped.  Anyway, I applied the referenced patch to
2.4.1, except I left out the "size is None" condition, since
I knew the tokenizer passes in a size param.  With that
change pythondoc imports successfully.  (Also, I just ran the
test suite and nothing broke.)

Since the size parameter is already documented as being
passed to StreamReader.read (in codecs.py -- the HTML
documentation needs to be updated), and since
StreamReader.read says size is an approximate maximum,
perhaps it's OK to read one extra byte.
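
To make the size case concrete, here is a minimal sketch (not the
referenced patch, and not the tokenizer's actual call path) that calls
readline() with an explicit size on CRLF input and checks two things:
that no characters are lost, and that no "\r\n" pair is split across
two consecutive return values. The helper names and the sample bytes
are hypothetical, and the code targets a current Python 3.

```python
import codecs
import io


def read_pieces(data, size, encoding="utf-8"):
    """Call StreamReader.readline(size) until the stream is exhausted."""
    reader = codecs.getreader(encoding)(io.BytesIO(data))
    pieces = []
    while True:
        piece = reader.readline(size)
        if not piece:
            break
        pieces.append(piece)
    return pieces


def splits_crlf(pieces):
    # True if some piece ends in a bare CR while the next one starts with LF,
    # i.e. a CRLF terminator was torn apart between two readline() calls.
    return any(
        first.endswith("\r") and second.startswith("\n")
        for first, second in zip(pieces, pieces[1:])
    )


# Hypothetical sample input with CRLF line ends.
sample_sized = b"first line\r\nsecond\r\n\r\nlast\r\n"

for size in range(1, 16):
    pieces = read_pieces(sample_sized, size)
    complete = "".join(pieces) == sample_sized.decode("utf-8")
    print(size, complete, splits_crlf(pieces))
```

Because size is only an approximate maximum, a piece may be a partial
line without a terminator; the check therefore looks at the joined
result rather than at each piece ending in a newline.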


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2005-04-05 00:01

Message:
Logged In: YES 
user_id=89016

Checked in a fix as:
Lib/codecs.py 1.42/1.43
Lib/test/test_codecs.py 1.22
Lib/codecs.py 1.35.2.6
Lib/test/test_codecs.py 1.15.2.4

Are you really sure that the fix for #1098990 is not in
2.4.1? According to the tracker for #1098990 the fix was in
lib/codecs.py revision 1.35.2.2 and revision 1.35.2.3 is the
one that got the r241c1 tag, so the fix should be in 2.4.1.

----------------------------------------------------------------------


