[CSV] Re: First Cut at CSV PEP

Dave Cole djc at object-craft.com.au
Wed Jan 29 01:24:17 CET 2003


>>>>> "Cliff" == Cliff Wells <LogiplexSoftware at earthlink.net> writes:

Cliff> On Tue, 2003-01-28 at 15:28, Dave Cole wrote:
>> I suppose that exporting should raise an exception if you specify
>> any variation on the dialect in the writer function.
>> 
>> csvwriter = csv.writer(file("newnastiness.csv", "w"), dialect='excel2000', delimiter='"')
>> 
>> That should raise an exception.

Cliff> I still don't see a good reason for this.  The programmer asked
Cliff> for it, let her do it.  I don't see a problem with letting the
Cliff> programmer shoot herself in the foot, as long as the gun
Cliff> doesn't start out pointing at it.

>> This probably shouldn't raise an exception though:
>> 
>> csvwriter = csv.writer(file("newnastiness.csv", "w"), dialect='excel2000')
>> csvwriter.setparams(delimiter='"')

Cliff> While this provides a workaround, it also seems a bit
Cliff> non-obvious why this should work when passing delimiter as an
Cliff> argument raises an exception.  I'm not dead-set against it, it's
Cliff> JMHO.

I think you are right - it is a bad idea in retrospect.
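
A rough sketch of the permissive behaviour, where keyword arguments
simply override the named dialect instead of raising. It assumes the csv
module as it exists in today's Python standard library, which is not
necessarily the API being designed in this thread. The 'excel' dialect
name and the write_rows helper are stand-ins chosen for illustration.

```python
import csv
import io


def write_rows(rows, dialect="excel", **overrides):
    """Write rows with a named dialect; keyword overrides win over the dialect."""
    buf = io.StringIO()
    # Any formatting keyword (e.g. delimiter) replaces the dialect's value.
    writer = csv.writer(buf, dialect=dialect, **overrides)
    writer.writerows(rows)
    return buf.getvalue()


rows = [["a", "b c"], ["1", "2"]]
plain = write_rows(rows)                      # dialect defaults only
overridden = write_rows(rows, delimiter=";")  # same dialect, one override
```

Here the programmer gets exactly what she asked for: the dialect supplies
the defaults and the explicit argument takes precedence.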

Kevin> The CR, CR/LF, and LF line endings probably have something to
Kevin> do with saving in Mac format, but it may also do some 8-bit
Kevin> character translation.
>> Should we be trying to handle unicode?  I think we should since
>> Python is now unicode capable.

Cliff> What issues is unicode support going to raise?

The low level parser (C code) is probably going to need to handle
unicode.

>> If it is not a newline, then it is not a newline.

Cliff> This seems like a particularly intractable problem.  If a file
Cliff> can't decide what sort of newlines it is going to use, then I'm
Cliff> not convinced it's the parser's problem.

Cliff> So the question becomes whether to raise or pass through.  The
Cliff> two things to consider in this case are:

Cliff> 1) The data might be correct, in which case it should be passed
Cliff>    through.
Cliff> 2) The target for the data might be someone's mission-critical
Cliff>    SQL server and we don't want to help them mung up their data.
Cliff>    An exception would seem appropriate.

Cliff> Frankly, I think I lean towards an exception on this one.
Cliff> There are enough text-processing tools available (dos2unix and
Cliff> kin) that someone should be able to pre-process a CSV file that
Cliff> is raising exceptions and get it into a form acceptable to the
Cliff> parser.  A little work up front is far more acceptable than
Cliff> putting out a fire on someone's database.

Should the reader have an option which turns on universal newline
mode?  This would allow for both behaviours - if a non-conforming
newline is encountered while not in universal newline mode then an
exception would be raised.

According to Andrew's previous message, the Excel97 dialect would have
universal newline mode turned on by default.
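
To make the two behaviours concrete, here is a minimal sketch of how
such an option might work. The names split_records, NewlineError,
lineterminator and universal_newlines are all hypothetical, and the
sketch deliberately ignores quoted fields that contain newlines; it only
shows the strict versus universal split of records.

```python
class NewlineError(ValueError):
    """Raised when a record contains a newline that is not the expected one."""


def split_records(text, lineterminator="\r\n", universal_newlines=False):
    """Split text into records (hypothetical helper; quoting is not handled)."""
    if universal_newlines:
        # Treat CR/LF, a lone CR, and a lone LF all as record terminators.
        records = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    else:
        records = text.split(lineterminator)
    # A trailing terminator leaves one empty record at the end; drop it.
    if records and records[-1] == "":
        records.pop()
    if not universal_newlines:
        for number, record in enumerate(records, start=1):
            # Any leftover CR or LF is a newline the dialect did not expect.
            if "\r" in record or "\n" in record:
                raise NewlineError(
                    "non-conforming newline in record %d" % number
                )
    return records


strict = split_records("a,b\r\nc,d\r\n")
mixed = split_records("a,b\nc,d\r", universal_newlines=True)
```

With the option off, a file that mixes line endings is rejected rather
than silently passed through to whatever is consuming the data.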

- Dave

-- 
http://www.object-craft.com.au



