First Cut at CSV PEP

Tue Jan 28 05:56:39 CET 2003

>>>>> "Skip" == Skip Montanaro <skip at pobox.com> writes:

Skip> I'm ready to toddle off to bed, so I'm stopping here for
Skip> tonight.  Attached is what I've come up with so far in the way
Skip> of a PEP.  Feel free to flesh out, rewrite or add new sections.
Skip> After a brief amount of cycling, I'll check it into CVS.

I only have one issue with the PEP as it stands.  It is still aiming
too low.  One of the things that we support in our parser is the
ability to handle CSV without quote characters.

        field1,field2,field3\, field3,field4

One of our customers has data like the above.  To handle this we would
need something like the following:

    # Use the 'raw' dialect to get access to all tweakables.
    writer(fileobj,
           dialect='raw', quotechar=None, delimiter=',', escapechar='\\')

I think that we need some way to handle a potentially different set of
options on each dialect.
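
As a minimal sketch of the quote-free case, the snippet below reads
and re-writes the sample line using the csv module found in current
Python.  That module postdates this message, so the spellings used
(quoting=csv.QUOTE_NONE, escapechar) are today's, not necessarily the
interface being proposed here.  It has no 'raw' dialect, so the
options are passed directly.

```python
import csv
import io

# The sample line from above: no quote characters, and a backslash
# escaping the delimiter inside the third field.
sample = "field1,field2,field3\\, field3,field4\n"

# Reading: turn quoting off and name the escape character.
reader = csv.reader(
    io.StringIO(sample),
    delimiter=",",
    quoting=csv.QUOTE_NONE,
    escapechar="\\",
)
rows = list(reader)

# Writing the parsed row back with the same settings re-escapes the
# embedded delimiter instead of quoting the field.
out = io.StringIO()
writer = csv.writer(
    out,
    delimiter=",",
    quoting=csv.QUOTE_NONE,
    escapechar="\\",
    lineterminator="\n",
)
writer.writerow(rows[0])
round_tripped = out.getvalue()
```

The third field comes back as a single value containing the comma,
and writing it out again reproduces the original line.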

When you export CSV from Excel, can you use a delimiter other than a
comma?  Can you change the quotechar?

Should the wrapper protect you from yourself so that when you select
the Excel dialect you are limited to the options available within
Excel?

Maybe the dialect should not limit you; it should just provide the
correct defaults.
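
For illustration, this is how "defaults, not limits" looks with the
csv module in current Python (again, a later API than anything
discussed in this thread): the 'excel' dialect supplies a comma
delimiter, and a keyword argument overrides just that one setting.

```python
import csv
import io

# The 'excel' dialect defaults to a comma delimiter.
excel_default_delimiter = csv.get_dialect("excel").delimiter

# Select the dialect for its defaults, then override one setting.
data = io.StringIO("a;b;c\r\n")
rows = list(csv.reader(data, dialect="excel", delimiter=";"))
```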

Since we are going to have one parsing engine in an extension module
below the Python layer, we are probably going to evolve more tweakable
settings in the parser over time.  It would be nice if we could hide
new tweakables from application code by associating default values
with dialect names in the Python layer.  We should not be exposing the
low-level parser interface to user code if it can be avoided.
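
A rough sketch of that Python-layer idea follows.  Every name in it
(DIALECT_DEFAULTS, resolve_options, the option keys) is hypothetical
and chosen only for illustration; it is not the PEP's interface.  It
shows a table of per-dialect defaults, caller overrides layered on
top, and a new tweakable being added without touching calling code.

```python
# Hypothetical table: dialect name -> default low-level settings.
DIALECT_DEFAULTS = {
    "excel": {"delimiter": ",", "quotechar": '"', "escapechar": None},
    "raw": {"delimiter": ",", "quotechar": None, "escapechar": None},
}


def resolve_options(dialect, **overrides):
    """Return a dialect's defaults with any caller overrides applied."""
    if dialect not in DIALECT_DEFAULTS:
        raise ValueError(f"unknown dialect: {dialect!r}")
    options = dict(DIALECT_DEFAULTS[dialect])  # copy, keep table intact
    unknown = set(overrides) - set(options)
    if unknown:
        raise TypeError(f"unknown options: {sorted(unknown)}")
    options.update(overrides)
    return options


# Application code names a dialect and overrides only what it needs.
raw_opts = resolve_options("raw", escapechar="\\")

# Later the parser grows a new tweakable (an assumed example name).
# Only the defaults table changes; existing callers are unaffected.
for defaults in DIALECT_DEFAULTS.values():
    defaults.setdefault("skipinitialspace", False)

excel_opts = resolve_options("excel")
```

The resolved dictionary is what the Python layer would hand to the
extension module, so user code never sees the low-level settings it
did not ask about.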

- Dave

-- 
http://www.object-craft.com.au