[Csv] Re: csv bugs
Skip Montanaro
skip at pobox.com
Tue Mar 2 16:23:38 CET 2004
(A better place for this discussion would probably be csv at mail.mojam.com.
I'm adding it to the cc list.)
Magnus> It seems that when a line termination is escaped (using the
Magnus> current escape character), csv.reader treats it as a line
Magnus> continuation, which is well and good -- but it doesn't discard
Magnus> the escape character; instead, it escapes it implicitly. This
Magnus> seems like a bug to me. E.g.
Magnus> foo:bar:baz\
Magnus> frozz:bozz
Magnus> with separator ':' and escape character '\\' is parsed into
Magnus> ['foo', 'bar', 'baz\\\nfrozz', 'bozz']
Magnus> In my opinion, it *ought* to be parsed into
Magnus> ['foo', 'bar', 'baz\nfrozz', 'bozz']
Magnus> As far as I know, this is the UNIX convention, as used in (e.g.)
Magnus> /etc/passwd.
That may be; however, development of the csv module's parser was driven by
how Microsoft Excel behaves. The assumption was (rightly, I think) that
Excel reads or writes more CSV files than anything else. I don't believe it
does anything with backslashes.
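For anyone who wants to check what their own Python does with Magnus's
example, here is a minimal sketch. It assumes a current Python 3 and feeds
the reader a list of strings instead of a file; the variable names are
only illustrative. The third field of the printed row shows whether the
escape character was kept or discarded.

```python
import csv

# Two physical lines; the first one ends with an escaped newline.
lines = ["foo:bar:baz\\\n", "frozz:bozz\n"]

reader = csv.reader(
    lines,
    delimiter=":",
    escapechar="\\",
    quoting=csv.QUOTE_NONE,  # no quote handling, as in the passwd case
)
rows = list(reader)

# repr() makes a surviving backslash visible in the third field.
for row in rows:
    print([repr(field) for field in row])
```

If the backslash is discarded, the third field is 'baz\nfrozz'; if it is
kept, the field is 'baz\\\nfrozz', as described above.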
Magnus> Am I off target here? If the current behaviour is desirable
Magnus> (although I can't see why it should be) then at least I think
Magnus> there should be a way of implementing "normal" line
Magnus> continuations (as in my example), which is the standard UNIX
Magnus> behavior, and the behavior of Python source, for that
Magnus> matter. Otherwise, csv can't be used to parse (e.g.)
Magnus> /etc/passwd...
You're welcome to submit a patch. I don't have time for it.
Magnus> And another thing: Perhaps a 'passwd' dialect could be added
Magnus> alongside 'excel'? Something like:
Magnus> class passwd(Dialect):
Magnus>     delimiter = ':'
Magnus>     doublequote = False
Magnus>     escapechar = '\\'
Magnus>     lineterminator = '\n'
Magnus>     quotechar = '?'
Magnus>     quoting = QUOTE_NONE
Magnus>     skipinitialspace = False
Magnus> register_dialect("passwd", passwd)
I'll take a look at that.
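As a rough, self-contained sketch of how such a dialect could be defined
and used with a current Python 3 (this is not something the csv module
ships): the sample line is made up, and quotechar = None relies on the
assumption that the installed version accepts a missing quote character
when quoting is QUOTE_NONE.

```python
import csv


class passwd(csv.Dialect):
    """Colon-separated fields, backslash escapes, no quoting."""
    delimiter = ":"
    doublequote = False
    escapechar = "\\"
    lineterminator = "\n"
    quotechar = None  # assumption: allowed together with QUOTE_NONE
    quoting = csv.QUOTE_NONE
    skipinitialspace = False


csv.register_dialect("passwd", passwd)

# Hypothetical passwd-style entry, for illustration only.
sample = ["root:x:0:0:root:/root:/bin/sh\n"]
fields = next(csv.reader(sample, dialect="passwd"))
print(fields)
```

With the real file, the list would be replaced by an open file object,
ideally opened with newline="" as the csv documentation recommends.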
Magnus> For some reason you *have* to supply a quotechar, even if you
Magnus> set QUOTE_NONE... I guess that's a bug too, in my book.
Maybe. Maybe just a feature.
Magnus> If there are no objections, I might submit some of this as a bug
Magnus> report or two (or even a patch).
Please do.
Skip