[CSV] RE: First Cut at CSV PEP

Kevin Altis altis at semi-retired.com
Wed Jan 29 01:39:34 CET 2003


> From: Dave Cole
>
> >>>>> "Kevin" == Kevin Altis <altis at semi-retired.com> writes:
>
> Kevin> The big issue with the MS/Excel CSV format is that MS doesn't
> Kevin> appear to escape any characters or support import of escaped
> Kevin> characters. A field that contains characters that you might
> Kevin> normally escape (including a comma if that is the separator)
> Kevin> is instead enclosed in double quotes by default and then any
> Kevin> double quotes in the field are doubled.
>
> I thought that we were trying to build a CSV parser which would deal
> with different dialects, not just what Excel does.  Am I wrong making
> that assumption?
>
> If we were to only target Excel our task would be much easier.
>
> I think that we should be trying to come up with an engine wrapped by
> a friendly API which can be made more powerful over time in order to
> parse more and more dialects.

Agreed, we should certainly support more than just Excel. I think I
understand the dialects idea now. Last night what rubbed me the wrong way
was specifying the dialect and then also allowing the delimiter, quote
character, etc. to be specified in the same line. I like the idea of
choosing a dialect and then changing its properties in separate calls.
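
To make the Excel quoting convention quoted above concrete, here is a
minimal sketch. It assumes the csv module found in today's Python standard
library, which postdates this thread and is not the draft API under
discussion; the helper name excel_row is made up for the example. It names
a dialect and optionally overrides individual properties in the same call.

```python
import csv
import io


def excel_row(fields, **overrides):
    """Write one row with the 'excel' dialect and return the raw text.

    `overrides` are per-call formatting properties (e.g. delimiter=";")
    applied on top of the named dialect.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, dialect="excel", **overrides)
    writer.writerow(fields)
    return buf.getvalue()


# A field containing the separator is enclosed in double quotes, and
# double quotes inside a field are doubled; nothing is backslash-escaped.
print(repr(excel_row(["plain", "a,b", 'say "hi"'])))
# -> 'plain,"a,b","say ""hi"""\r\n'

# Same dialect, with the delimiter changed for this one call.
print(repr(excel_row(["a;b", "c,d"], delimiter=";")))
# -> '"a;b";c,d\r\n'
```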

I suppose there is a good reason that each dialect isn't just a subclass.
If so, the reasoning for using dialects instead of subclasses of a parser
might be called out in the PEP. I can go with it either way.
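
One way the two ideas can coexist is sketched below, again assuming the csv
module in today's Python standard library rather than the draft under
discussion here. In that module a dialect is a small class of formatting
attributes, so a variant is a subclass of a dialect, not of the parser. The
class name SemicolonExcel and the registered name "semicolon-excel" are
hypothetical.

```python
import csv


class SemicolonExcel(csv.excel):
    """Hypothetical variant: Excel quoting rules, ';' as the separator."""
    delimiter = ";"


# Registering the class lets callers refer to the dialect by name.
csv.register_dialect("semicolon-excel", SemicolonExcel)

# csv.reader accepts any iterable of lines, so a list of strings works.
rows = list(csv.reader(['a;"b;c";d'], dialect="semicolon-excel"))
print(rows)
# -> [['a', 'b;c', 'd']]
```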

I would be tempted to rename what is currently Excel2000 to MSCSV or
ExcelCSV.

ka



