[ python-Bugs-967934 ] csv module cannot handle embedded \r
SourceForge.net
noreply at sourceforge.net
Wed Apr 5 17:35:26 CEST 2006
Bugs item #967934, was opened at 2004-06-07 00:46
Message generated for change (Comment added) made by goodger
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=967934&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Gregory Bond (gnbond)
Assigned to: Andrew McNamara (andrewmcnamara)
Summary: csv module cannot handle embedded \r
Initial Comment:
CSV module cannot handle the case of embedded \r (i.e.
carriage return) in a field.
As far as I can see, this is hard-coded into the _csv.c
file and cannot be fixed with Dialect changes.
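
A minimal sketch for checking the reported case, assuming a recent Python 3 interpreter rather than the 2004-era module discussed in this report. The sample data and variable names are illustrative and do not come from the original submission. It feeds the reader one record whose quoted field contains a bare carriage return:

```python
import csv
import io

# Illustrative input: the second record has a bare \r inside a quoted field.
data = 'id,comment\r\n1,"first line\rsecond line"\r\n'

# newline='' keeps line endings untranslated, as the csv docs recommend.
source = io.StringIO(data, newline='')
rows = list(csv.reader(source))

for row in rows:
    print(repr(row))
```

On an interpreter where embedded CR is handled, the second row should contain the field with the carriage return intact. A parse error or a split record would reproduce the behaviour described above.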
----------------------------------------------------------------------
>Comment By: David Goodger (goodger)
Date: 2006-04-05 11:35
Message:
Logged In: YES
user_id=7733
I just filed a bug (http://www.python.org/sf/1465014) that
seems to be related to this. Revision 38290 on
Modules/_csv.c includes the addition of this code:
    else if (c == '\n' || c == '\r') {
        self->state = EAT_CRNL;
        break;
    }
(and similar). This seems to be eating (deleting) control
chars, but newlines used to be significant.
Embedded line breaks are allowed, according to RFC 4180
(http://www.ietf.org/rfc/rfc4180.txt). And according to the
Wikipedia entry
(http://en.wikipedia.org/wiki/Comma-separated_values), "a
line break within an element must be preserved."
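
The preservation requirement can be expressed as a round-trip check. This is a hedged sketch that assumes a recent Python 3 csv module and the default "excel" dialect. The field values are made up for illustration and are not taken from either bug report:

```python
import csv
import io

# Hypothetical fields covering the three kinds of embedded line break.
fields = ["plain", "two\nlines", "cr\rinside", "crlf\r\ninside"]

# newline='' stops the text layer from translating line endings.
buf = io.StringIO(newline="")
csv.writer(buf).writerow(fields)

# Read the same text back and compare with what was written.
buf.seek(0)
parsed = next(csv.reader(buf))

print(parsed == fields)
```

If the reader dropped or rewrote a line break inside a quoted field, the comparison would fail, which makes this kind of check usable as a regression test.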
----------------------------------------------------------------------
Comment By: Andrew McNamara (andrewmcnamara)
Date: 2005-01-13 06:34
Message:
Logged In: YES
user_id=698599
If you're interested, I've just checked in a change to the CVS head for
Python 2.5 that may, at least partially, fix this problem (if you try it, let me
know how it goes).
----------------------------------------------------------------------
Comment By: Skip Montanaro (montanaro)
Date: 2004-06-07 07:25
Message:
Logged In: YES
user_id=44345
It certainly intersects with it somehow. ;-) If nothing else, it
will serve as a useful test case.
----------------------------------------------------------------------
Comment By: Andrew McNamara (andrewmcnamara)
Date: 2004-06-07 01:32
Message:
Logged In: YES
user_id=698599
I suspect this restriction (CR appearing within a quoted
field) is a historical accident and can be safely removed.
----------------------------------------------------------------------
Comment By: Raymond Hettinger (rhettinger)
Date: 2004-06-07 01:02
Message:
Logged In: YES
user_id=80475
Skip, does this coincide with your planned switchover to
universal newlines?
----------------------------------------------------------------------