[CSV] Re: First Cut at CSV PEP
Dave Cole
djc at object-craft.com.au
Wed Jan 29 02:21:42 CET 2003
>>>>> "Cliff" == Cliff Wells <LogiplexSoftware at earthlink.net> writes:
Cliff> On Tue, 2003-01-28 at 16:47, Dave Cole wrote:
>> >>>>> "Cliff" == Cliff Wells <LogiplexSoftware at earthlink.net>
>> writes:
>>
>> >> Instead of limiting the tweakable options by raising an
>> >> exception we could have an interface which allowed the user to
>> >> query the options normally associated with a dialect.
>> >>
>> >> >> Hmm... What would be the best way to handle Excel TSV.
>> >> >> Maybe a new dialect 'excel-tsv'?
>>
Cliff> So are we leaning towards dialects being done as simple
Cliff> classes? Will 'excel-tsv' simply be defined as
>>
Cliff> class excel_tsv(excel_2000):
Cliff>     delimiter = '\t'
>>
Cliff> with a dictionary for lookup:
>>
Cliff> settings = {
Cliff>     'excel-tsv': excel_tsv,
Cliff>     'excel-2000': excel_2000,
Cliff> }
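
To make the class-based idea above concrete, here is a minimal sketch of dialects as simple classes with a name-to-class lookup. The attribute values given to excel_2000 and the get_dialect helper are assumptions added for illustration; nothing in the thread has settled on them.

```python
# Hypothetical sketch: dialects as plain classes whose class attributes
# hold the settings. The values on excel_2000 are illustrative guesses.
class excel_2000:
    delimiter = ','
    quotechar = '"'
    lineterminator = '\r\n'


class excel_tsv(excel_2000):
    # Inherit every setting from excel_2000, overriding only the delimiter.
    delimiter = '\t'


# Registry mapping dialect names to their classes.
settings = {
    'excel-tsv': excel_tsv,
    'excel-2000': excel_2000,
}


def get_dialect(name):
    """Return the dialect class registered under name (KeyError if unknown)."""
    return settings[name]
```

With this layout a new dialect only states how it differs from its parent, and anything it does not override is inherited.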
>> Dunno yet.
>>
>> Here we go again with a potentially bad idea...
>>
>> I think that there are two things we need to have for each dialect;
>> a set of low level parser configuration, and a set of user
>> tweakables (which correspond to options presented by the
>> application). The set of user tweakables may not necessarily map
>> one-to-one with low level parser configuration items.
Cliff> Can you give examples? I suppose you are referring to things
Cliff> like CR/LF translation and spaces around quotes as being
Cliff> low-level parser configurations and things like delimiters
Cliff> being user-tweakable?
I do not have access to the software at the moment, but not long ago I
used a program called TOAD, which is a GUI client for fiddling around
with Oracle. One of the things you could do after executing a query
was export the results to a file. I seem to recall that the export
dialog had a number of options which did not map cleanly onto just one
of the settings we would place in our writer/reader.
I will see if I can get a screen shot of the dialog...
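
Until then, here is a rough sketch of what a user tweakable that does not map one-to-one onto a parser setting might look like. Every name in it (EXCEL_DEFAULTS, apply_tweakables, quote_strings, unix_line_endings) is made up for illustration and is not taken from TOAD or from any existing code.

```python
# Hypothetical low-level parser configuration for one dialect.
EXCEL_DEFAULTS = {
    'delimiter': ',',
    'quotechar': '"',
    'quoting': 'minimal',
    'lineterminator': '\r\n',
}


def apply_tweakables(low_level, quote_strings=None, unix_line_endings=None):
    """Translate user-facing options into low-level parser settings.

    Options left as None keep the dialect's own value.
    """
    config = dict(low_level)  # copy, so the dialect defaults stay untouched
    if quote_strings is not None:
        # A single checkbox in the dialog drives two low-level settings.
        config['quotechar'] = '"' if quote_strings else None
        config['quoting'] = 'all' if quote_strings else 'none'
    if unix_line_endings is not None:
        config['lineterminator'] = '\n' if unix_line_endings else '\r\n'
    return config
```

Here one user option changes both the quote character and the quoting policy, which is the kind of mismatch I have in mind.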
Cliff> Maybe. Currently the sniffing code in DSV just makes a best
Cliff> guess regarding delimiters, text qualifiers and headers.
Cliff> Certainly the dialects could be used to improve its guess (most
Cliff> likely when the sniffed results are ambiguous or fail).
Cliff> Using dialects on import is of less importance if sniffing code
Cliff> is used. They are two different approaches to the same
Cliff> problem. If the user specifies the file as Excel compatible,
Cliff> then sniffing seems rather redundant. Further, if the file is
Cliff> sniffed and the format discovered, it doesn't seem important
Cliff> which dialect it matches, as long as we are able to use the
Cliff> sniffed parameters to parse it.
The sniffer is definitely your area of expertise. I am just making
stuff up as I go :-)
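
For illustration, the "sniff, then parse with whatever was sniffed" flow Cliff describes can be sketched as below. This is an assumption-laden stand-in: it uses csv.Sniffer from the csv module in today's Python standard library rather than the DSV sniffing code, and the sample data is invented.

```python
import csv
import io

# Invented sample data in an unknown, semicolon-delimited format.
sample = "name;age\nalice;30\nbob;25\n"

# Guess the format, restricting the candidates to a few likely delimiters.
dialect = csv.Sniffer().sniff(sample, delimiters=";,\t")

# Parse with the sniffed parameters; no named dialect is involved.
rows = list(csv.reader(io.StringIO(sample), dialect))
```

The sniffed result is used directly to drive the reader, so it does not matter which named dialect, if any, it happens to match.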
- Dave
--
http://www.object-craft.com.au