[ python-Bugs-1465014 ] CSV regression in 2.5a1: multi-line cells
SourceForge.net
noreply at sourceforge.net
Thu Jun 22 20:17:35 CEST 2006
Bugs item #1465014, was opened at 2006-04-05 11:14
Message generated for change (Comment added) made by goodger
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1465014&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.5
Status: Open
Resolution: None
Priority: 9
Submitted By: David Goodger (goodger)
Assigned to: Andrew McNamara (andrewmcnamara)
Summary: CSV regression in 2.5a1: multi-line cells
Initial Comment:
Running the attached csv_test.py under Python 2.4.2
(Windows XP SP1) produces:
>c:\apps\python24\python.exe ./csv_test.py
['one', '2', 'three (line 1)\n(line 2)']
Note that the third item in the row contains a newline
between "(line 1)" and "(line 2)".
With Python 2.5a1, I get:
>c:\apps\python25\python.exe ./csv_test.py
['one', '2', 'three (line 1)(line 2)']
Notice the missing newline, which is significant. The
CSV module under 2.5a1 seems to lose data.
----------------------------------------------------------------------
>Comment By: David Goodger (goodger)
Date: 2006-06-22 14:17
Message:
Logged In: YES
user_id=7733
I see what you're saying, but I disagree. In Python 2.4,
csv.reader did not require newlines, but in Python 2.5 it
does. That's a significant behavioral change. In the
stdlib csv "Module Contents" docs for csv.reader, it says:
"csvfile can be any object which supports the iterator
protocol and returns a string each time its next method is
called." It doesn't mention newline-terminated strings.
In any case, the behavior is inconsistent: a newline is not
required on a string that ends a row, only on a string that
ends inside a cell split across lines. Why the discrepancy?
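
The discrepancy described in this comment can be seen in a short sketch. The input strings below are illustrative assumptions, not the contents of the attached csv_test.py, and the results in the comments describe csv.reader as it behaves from Python 2.5 onward, as far as I can tell.

```python
import csv

# Strings that end a row: no trailing newline is needed,
# and each string becomes one row.
plain = list(csv.reader(['a,b', 'c,d']))
# plain is [['a', 'b'], ['c', 'd']]

# A quoted cell split across two strings, neither newline-terminated.
# The reader joins the pieces but has no newline to put between them.
split_bare = list(csv.reader(['one,2,"three (line 1)', '(line 2)"']))
# split_bare is [['one', '2', 'three (line 1)(line 2)']]

# The same cell, but the first string keeps its newline.
split_terminated = list(csv.reader(['one,2,"three (line 1)\n', '(line 2)"']))
# split_terminated is [['one', '2', 'three (line 1)\n(line 2)']]
```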
----------------------------------------------------------------------
Comment By: Andrew McNamara (andrewmcnamara)
Date: 2006-06-20 19:17
Message:
Logged In: YES
user_id=698599
I think your problem is with str.splitlines(), rather than
the csv.reader: splitlines ate the newline. If you pass it
True as an argument, it will retain the end-of-line
character in the resulting strings.
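
A minimal sketch of the suggestion above, assuming the test data is a single string split with str.splitlines() (the variable names are hypothetical; the actual csv_test.py is not reproduced in this message):

```python
import csv

data = 'one,2,"three (line 1)\n(line 2)"\n'

stripped = data.splitlines()      # end-of-line characters removed
kept = data.splitlines(True)      # keepends=True retains them

rows_stripped = list(csv.reader(stripped))
rows_kept = list(csv.reader(kept))

# With 2.5-and-later reader behavior, only the second form
# preserves the newline inside the quoted cell.
print(rows_stripped)
print(rows_kept)
```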
----------------------------------------------------------------------
Comment By: David Goodger (goodger)
Date: 2006-05-02 17:04
Message:
Logged In: YES
user_id=7733
Assigned to Andrew McNamara, since his change appears to
have caused this regression (revision 38290 on
Modules/_csv.c).
----------------------------------------------------------------------
Comment By: David Goodger (goodger)
Date: 2006-05-02 16:58
Message:
Logged In: YES
user_id=7733
Further investigation has revealed that the regression only
affects iterator I/O, not file I/O. The attached
csv_test.py demonstrates. Run with Python 2.5 to get:
results from file I/O:
[['one', '2', 'three (line 1)\n(line 2)']]
results from iterator I/O:
[['one', '2', 'three (line 1)(line 2)']]
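
One likely reason file I/O is unaffected is that iterating over a file yields newline-terminated strings. The sketch below is a reconstruction under that assumption, not the attached csv_test.py. It uses Python 3 syntax (the newline='' argument to open), and the temporary file and variable names are hypothetical.

```python
import csv
import os
import tempfile

data = 'one,2,"three (line 1)\n(line 2)"\n'

# Write the sample to a temporary file; newline='' avoids newline translation.
fd, path = tempfile.mkstemp(suffix='.csv')
try:
    with os.fdopen(fd, 'w', newline='') as f:
        f.write(data)

    # Iterating a file yields strings that still end in a newline.
    with open(path, newline='') as f:
        raw_lines = list(f)

    # File I/O: the reader sees the newline inside the quoted cell.
    with open(path, newline='') as f:
        file_rows = list(csv.reader(f))
finally:
    os.remove(path)

# Iterator I/O: the same lines with their newlines stripped.
iter_rows = list(csv.reader(line.rstrip('\n') for line in raw_lines))

print(file_rows)
print(iter_rows)
```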
----------------------------------------------------------------------
Comment By: David Goodger (goodger)
Date: 2006-04-05 11:44
Message:
Logged In: YES
user_id=7733
This bug seems to be a side effect of revision 38290 on
Modules/_csv.c, which was prompted by bug 967934
(http://www.python.org/sf/967934).
----------------------------------------------------------------------