[CSV] Re: First Cut at CSV PEP

Skip Montanaro skip at pobox.com
Wed Jan 29 03:21:20 CET 2003


    Dave> The point of the 'raw' dialect is to expose the full capabilities
    Dave> of the raw parser.  Maybe we should use None rather than 'raw'.

Nah, "raw" won't mean anything to anyone.  Make "excel2000" the default.
The point of the dialect names is that they should mean something to
someone.  That generally means application names, not something lile "raw".
I think it also means you only have variants associated with applications
which normally provide few choices.  We can probably all come close to
specifying what the parameter settings are for "excel2000", but what about
"gnumeric"?  As I write this I'm looking at a Gnumeric "Save As" wizard.
The user can choose line termination (LF is the default), delimiter (comma
is the default), quoting style (automatic (default), always, never), and the
quote character (" is the default).  Even though the wizard presents
sensible defaults, I'm less enthusiastic about creating a "gnumeric"
variant, precisely because it won't necessarily mean much.

    Cliff> I think it is an option to save as a TSV file (IIRC), which is
    Cliff> the same as a CSV file, but with tabs.

    Dave> Hmm...  What would be the best way to handle Excel TSV.  Maybe a
    Dave> new dialect 'excel-tsv'?

Any of:

    reader = csv.reader(file("some.csv"), variant="excel2000-tsv")

or

    reader = csv.reader(file("some.csv"), variant="excel2000",
                        delimiter='\t')

or (assuming "excel2000" is the default), just:

    reader = csv.reader(file("some.csv"), delimiter='\t')

    Dave> I am not saying that the wrapper should absolutely prevent someone
    Dave> from using options not available in the application.  If you want to
    Dave> break the dialect then maybe it should be a two step process.

    Dave>     csvwriter = csv.writer(file("newnastiness.csv", "w"),
    Dave>                            dialect='excel2000')
    Dave>     csvwriter.setparams(delimiter='"')

That seems cumbersome.  I think we have to give our users both some credit
(for brains) and some flexibility.  It seems gratuitous (and unPythonic) to
specify some parameters in the constructor and some in a later method.

All this dialect stuff will be handled at the Python level, right?

Skip



More information about the Csv mailing list