First Cut at CSV PEP

Dave Cole djc at object-craft.com.au
Wed Jan 29 00:59:44 CET 2003


>>>>> "Skip" == Skip Montanaro <skip at pobox.com> writes:

Kevin> Probably need to specify that input and output deals with
Kevin> string representations, but there are some differences:

Kevin> [[5,'Bob',None,1.0]]

Kevin> DSV.exportCSV produces

Kevin> '5,Bob,None,1.0'

Skip> I'm not so sure this mapping None to "None" on output is such a
Skip> good idea because it's not reversible in all situations and
Skip> hurts portability to other systems (e.g., does Excel have a
Skip> concept of None? what happens if you have a text field which
Skip> just happens to contain "None"?).

I think that None should always be written as a zero-length field, and
always read as the field value 'None'.
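
A minimal sketch of that round trip, assuming the csv module found in
present-day Python's standard library (it did not exist when this message
was written). The sample rows are made up for illustration. It shows that
None goes out as an empty field, that the empty field does not come back
as None, and that a text field containing "None" stays plain text:

```python
import csv
import io

# Illustrative rows: one holds a real None, the other the text "None"
# and a genuinely empty string.
rows = [[5, "Bob", None, 1.0], [6, "None", "", 2.5]]

buf = io.StringIO()
csv.writer(buf).writerows(rows)
text = buf.getvalue()

# Reading back yields strings only; None and "" are indistinguishable.
parsed = list(csv.reader(io.StringIO(text)))

print(repr(text))
print(parsed)
```

Both None and the empty string end up as the same zero-length field, so
the mapping is lossy on the way back in, which is the reversibility
concern Skip raises above.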

Skip> I think we need to limit the data which can be output to
Skip> strings, Unicode strings (if we use an encoded stream), floats
Skip> and ints.  Anything else should raise TypeError.

Is there any merit in having the writer handle non-string data by
producing an empty field for None, and the result of PyObject_Str()
for all other values?
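
The two policies under discussion can be sketched side by side in pure
Python. Both function names are hypothetical and exist only for this
illustration; str() stands in for PyObject_Str(), and the strict variant
is one possible reading of Skip's "anything else should raise TypeError":

```python
def field_to_text(value):
    """Lenient policy: empty field for None, str() for everything else."""
    if value is None:
        return ""
    return str(value)


def strict_field_to_text(value):
    """Strict policy: accept only strings, ints and floats."""
    if not isinstance(value, (str, int, float)):
        raise TypeError(
            "cannot write %s as a CSV field" % type(value).__name__
        )
    return str(value)


row = [5, "Bob", None, 1.0]
lenient = [field_to_text(v) for v in row]

try:
    strict = [strict_field_to_text(v) for v in row]
except TypeError as exc:
    strict = None
    strict_error = str(exc)
```

With the lenient policy the row becomes four strings with an empty third
field; with the strict policy the None in the same row is rejected.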

Skip> I like my CSV files to be fully quoted (even fields which may
Skip> contain numbers), largely because it makes later (dangerous)
Skip> matching using regular expressions simpler.  Otherwise I wind up
Skip> having to make all the quotes in the regular expressions
Skip> optional.  It just complicates things.

That raises another implementation issue.  If you export from Excel,
does it always quote fields?  If not, then the default dialect
behaviour should not unconditionally quote fields.

We could/should support mandatoryquote as a writer option.
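
For comparison, a short sketch of what such an option might look like in
use. It assumes the csv module in present-day Python, where
csv.QUOTE_MINIMAL and csv.QUOTE_ALL select the behaviour; treating
QUOTE_ALL as the equivalent of the mandatoryquote idea is an assumption
made here, and render() is a hypothetical helper:

```python
import csv
import io


def render(row, quoting):
    """Write one row with the given quoting mode and return the text."""
    buf = io.StringIO()
    csv.writer(buf, quoting=quoting).writerow(row)
    return buf.getvalue()


row = [5, "Bob", "a,b", 1.0]

# Quote only the fields that need it (here, the one with a comma).
minimal = render(row, csv.QUOTE_MINIMAL)

# Quote every field, numbers included, as Skip prefers.
full = render(row, csv.QUOTE_ALL)

print(repr(minimal))
print(repr(full))
```

Keeping minimal quoting as the default and making full quoting opt-in
leaves both preferences available to the caller.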

I am going to spend some time tonight seeing if I can fold all of my
ideas into the PEP so you can all poke holes in it.

- Dave

-- 
http://www.object-craft.com.au



