[Python-Dev] [Csv] These csv test cases seem incorrect to me...
John Machin
sjmachin at lexicon.net
Mon Mar 12 05:13:25 CET 2007
On 12/03/2007 1:41 PM, Andrew McNamara wrote:
>
> The point was to produce the same results as Excel. Sure, Excel probably
> doesn't generate crap like this itself, but 3rd parties do, and people
> complain if we don't parse it just like Excel (sigh).
Let's put a little flesh on those a's and b's:
A typical example of the first case is where a database address line
contains a quoted house name, e.g.
    "Dunromin", 123 Main Street
and the producer of the CSV file has not done any quoting at all.
An example of the second case is a database address line like this:
    C/o Mrs Jones, "Dunromin", 123 Main Street
where the producer of the CSV file has merely wrapped quotes around it
without doubling the existing quotes, to emit this:
    "C/o Mrs Jones, "Dunromin", 123 Main Street"
which Excel and adherents would distort to two fields containing:
'C/o Mrs Jones, Dunromin"' and ' 123 Main Street"' -- aarrgghh!!
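For anyone who wants to see this for themselves, here is a minimal sketch that feeds both example lines to the csv module with its default (Excel-flavoured, non-strict) dialect. The helper name parse_line is mine, purely for illustration; the inputs are the two lines quoted above.

```python
import csv

def parse_line(line):
    """Parse one physical line with csv's default dialect (hypothetical helper)."""
    return next(csv.reader([line]))

# First case: the producer did no quoting at all.
unquoted = '"Dunromin", 123 Main Street'
# Second case: the producer wrapped the field in quotes without doubling.
wrapped = '"C/o Mrs Jones, "Dunromin", 123 Main Street"'

print(parse_line(unquoted))  # ['Dunromin', ' 123 Main Street']
print(parse_line(wrapped))   # ['C/o Mrs Jones, Dunromin"', ' 123 Main Street"']
```

In the first case the quotes around the house name vanish; in the second, one intended field comes back as two, with stray quotes attached.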
People who complain as described are IMHO misguided; they are accepting
crap and losing data (yes, the quotes in the above examples are *DATA*).
Why should we heed their complaints?
Perhaps we could consider a non-default "dopey_like_Excel" option for
csv :-)
BTW, it is possible to do a reasonable recovery job when the producer's
protocol was to wrap quotes around the data without doubling existing
quotes, provided the data contained an even number of quotes to start
with. It just requires a quite different finite state machine.
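To make that concrete, here is one possible shape for such a state machine. It is only a sketch under my own assumptions, not a tested replacement for the csv module: a single-character delimiter, no embedded newlines, and the rule that a quote followed by a delimiter (or end of line) closes the field only when an even number of embedded quotes has been seen so far. The function name split_wrapped_line is hypothetical.

```python
def split_wrapped_line(line, delimiter=",", quote='"'):
    """Split a line whose fields were wrapped in quotes WITHOUT doubling
    embedded quotes. Heuristic: assumes embedded quotes come in pairs."""
    fields = []
    i, n = 0, len(line)
    while True:
        if i < n and line[i] == quote:
            # Wrapped field: skip the opening quote, then scan for the close.
            i += 1
            buf = []
            inner = 0  # embedded quotes seen so far in this field
            while i < n:
                ch = line[i]
                if ch == quote:
                    at_boundary = (i + 1 == n) or line[i + 1] == delimiter
                    if at_boundary and inner % 2 == 0:
                        i += 1  # consume the closing quote
                        break
                    inner += 1  # otherwise the quote is data
                buf.append(ch)
                i += 1
            fields.append("".join(buf))
        else:
            # Unwrapped field: runs to the next delimiter or end of line.
            j = line.find(delimiter, i)
            if j == -1:
                j = n
            fields.append(line[i:j])
            i = j
        if i >= n:
            break
        i += 1  # skip the delimiter
    return fields

print(split_wrapped_line('"C/o Mrs Jones, "Dunromin", 123 Main Street"'))
# ['C/o Mrs Jones, "Dunromin", 123 Main Street']
```

The parity check is what stops the quote after Dunromin from being taken as the end of the field. With an odd number of embedded quotes the heuristic has nothing to go on and will guess wrong.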
Cheers,
John