First Cut at CSV PEP

Skip Montanaro skip at pobox.com
Tue Jan 28 22:55:12 CET 2003


    Kevin> Probably need to specify that input and output deal with string
    Kevin> representations, but there are some differences:

    Kevin> [[5,'Bob',None,1.0]]

    Kevin> DSV.exportCSV produces

    Kevin> '5,Bob,None,1.0'

I'm not so sure that mapping None to "None" on output is such a good idea.
It's not reversible in all situations, and it hurts portability to other
systems (e.g., does Excel have a concept of None? What happens if you have a
text field which just happens to contain "None"?).  I think we need to limit
the data which can be output to strings, Unicode strings (if we use an
encoded stream), floats and ints.  Anything else should raise TypeError.
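
A minimal sketch of that restriction, written for present-day Python 3
(where str already covers Unicode strings).  The name stringify_row is
hypothetical, and quoting and delimiters are deliberately left out; this only
shows the type check, not how any eventual module would do it.

```python
def stringify_row(row):
    """Convert one row to a list of field strings, rejecting unsupported types."""
    fields = []
    for value in row:
        # bool is a subclass of int; rejecting it is an assumption of this
        # sketch, since "True" has the same round-trip problem as "None".
        if isinstance(value, bool) or not isinstance(value, (str, int, float)):
            raise TypeError("cannot write %r (type %s) to CSV"
                            % (value, type(value).__name__))
        fields.append(str(value))
    return fields
```

With this, stringify_row([5, 'Bob', 1.0]) gives three strings, while a row
containing None raises TypeError instead of silently emitting "None".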

    Kevin> I'm still not sure about some of the database CSV handling
    Kevin> issues; often it seems they want a string field to be quoted
    Kevin> regardless of whether it contains a comma or newlines, but numbers
    Kevin> and empty fields should not be quoted. It is certainly nice to be
    Kevin> able to import a file that contains

    Kevin> 5,"Bob",,1.0\r\n

    Kevin> and not need to do any further translation. Excel appears to
    Kevin> interpret quoted numbers and unquoted numbers as numeric fields
    Kevin> when importing.

I like my CSV files to be fully quoted (even fields which may contain
numbers), largely because it makes later (dangerous) matching using regular
expressions simpler.  Otherwise I wind up having to make all the quotes in
the regular expressions optional.  It just complicates things.
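
To illustrate the difference, here is a rough sketch in Python.  The two
sample lines, the pattern names and the helper first_two_fields are all made
up for the example, and the patterns ignore embedded quotes and commas, which
is exactly why this kind of matching is dangerous.

```python
import re

# Every field is quoted, so the quotes can be written literally.
FULLY_QUOTED = re.compile(r'^"(\d+)","([^"]*)"')

# Quotes may or may not be present, so each one has to be optional.
MIXED = re.compile(r'^"?(\d+)"?,"?([^",]*)"?')

full_line = '"5","Bob","","1.0"'
mixed_line = '5,"Bob",,1.0'


def first_two_fields(line, pattern):
    """Return the first two fields as a tuple, or None if the line doesn't match."""
    match = pattern.match(line)
    return match.groups() if match else None
```

The strict pattern only works on the fully quoted line; the optional-quote
pattern handles both lines but is harder to read and easier to get wrong.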

    Kevin> Just trying to be anal-retentive here to make sure all the issues
    Kevin> are covered ;-)

I hear ya.

I just did a little fiddling in Excel 2000 with some simple values.  When I
save as CSV, it doesn't give me the option to change the delimiter or quote
character.  Nor could I figure out how to embed a newline in a cell.  It
certainly doesn't seem as flexible as Gnumeric in this regard.  Can someone
provide me with some hints?

Attached is a slight modification of the proto-PEP.  Really all that's
changed is the list of issues has grown.

Thx,

Skip

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/octet-stream
Size: 7138 bytes
Desc: not available
Url : http://mail.python.org/pipermail/csv/attachments/20030128/ce8a1d53/attachment.obj 

