[Csv] Sniffer empty delimiter
skip at pobox.com
Thu Dec 29 01:07:50 CET 2005
Python 2.4.2 (#67, Sep 28 2005, 12:41:11) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import csv
>>> d = csv.Sniffer().sniff('a|b|c|d|e', ['\t', ','])
>>> d.delimiter
''
>>> d = csv.Sniffer().sniff('a|b|c|d|e')
>>> d.delimiter
'a'
Both of these seem wrong to me at some level. I tend to agree with you that
if delimiter detection fails it should raise an exception, certainly if the
delimiters argument defines a set of characters from which the actual
delimiter must be chosen (does it?). The second has to be considered a bug,
doesn't it?
John> (1) IMHO it should *NEVER* return an alphabetic or numeric
John> character as the delimiter.
Probably a good rule of thumb.
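
Until the sniffer enforces something like that itself, a caller could apply
the rule on their own. A minimal sketch, not part of the csv module:
check_delimiter and sniff_strict are hypothetical names, and "alphabetic or
numeric" is assumed to mean whatever str.isalnum() accepts.

```python
import csv

def check_delimiter(delimiter):
    """Return delimiter unchanged, or raise csv.Error if it looks bogus."""
    # Hypothetical helper, not a csv module function.
    if not delimiter:
        raise csv.Error("sniffer found no delimiter")
    if delimiter.isalnum():
        # Rule of thumb (1): never accept a letter or digit.
        raise csv.Error("implausible delimiter: %r" % delimiter)
    return delimiter

def sniff_strict(sample, delimiters=None):
    """Sniff as usual, then reject empty or alphanumeric delimiters."""
    dialect = csv.Sniffer().sniff(sample, delimiters)
    check_delimiter(dialect.delimiter)
    return dialect
```

With this, both sessions above would end in csv.Error instead of handing back
'' or 'a', assuming sniff() still returns those values.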
John> (2) If there is insufficient sample to determine the dialect's
John> attributes, then it shouldn't pluck them out of the air, with
John> no indication to the caller that there might be a problem. IOW
John> I don't like the "remedies" of "return standard delimiter" and
John> "return first delimiter". It should raise csv.Error; the
John> discerning caller can then take appropriate action.
If I have a CSV file that happens to only have one column and I'm using the
sniffer (presumably because I have an app that processes somewhat arbitrary
CSV files) I'd hate for it to fail in that one case. For that case maybe we
can define an optional default arg that is a single character. If all other
tests fail, the default is returned.
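
Roughly what I have in mind, written as a wrapper rather than a patch. This is
only a sketch: sniff_with_default and its default parameter are hypothetical,
and basing the fallback dialect on csv.excel is my assumption. Without a
default, the caller gets the csv.Error John asks for.

```python
import csv

def sniff_with_default(sample, delimiters=None, default=None):
    """Sniff sample; use default if no plausible delimiter turns up."""
    if default is not None and len(default) != 1:
        raise ValueError("default must be a single character")
    try:
        dialect = csv.Sniffer().sniff(sample, delimiters)
        found = dialect.delimiter
    except csv.Error:
        # The sniffer itself gave up.
        dialect, found = None, ''
    if found and not found.isalnum():
        return dialect
    if default is None:
        raise csv.Error("could not determine delimiter")

    # Assumption: everything except the delimiter follows csv.excel.
    class fallback(csv.excel):
        delimiter = default
    return fallback
```

A one-column file then comes back with the caller's default, while a caller
who passes no default still finds out that the sniffer had nothing to go on.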
John> (3) Some documentation on how the 2nd arg is used would be a good
John> idea, as would be an explanation of the relationship with the
John> undocumented "preferred" attribute:
Agreed. I seem to recall you're the author. Got some text? <wink>
>>> csv.Sniffer().preferred
[',', '\t', ';', ' ', ':']
John> (4) Too late to change now, but having a class with no args to its
John> constructor and only one other method has a whiff of some
John> other language :-)
It's not too late to add an optional preferred arg to the constructor.
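
Something like the following, shown as a subclass so it can be tried without
touching the module. A sketch only: PreferredSniffer is a hypothetical name,
and it relies on the undocumented preferred attribute shown above staying an
instance attribute set in Sniffer.__init__.

```python
import csv

class PreferredSniffer(csv.Sniffer):
    """Sniffer whose 'preferred' delimiter list can be set at construction."""

    def __init__(self, preferred=None):
        csv.Sniffer.__init__(self)
        if preferred is not None:
            # Copy so later changes to the caller's list don't leak in.
            self.preferred = list(preferred)

# Usage: PreferredSniffer(['|', ';']).sniff(sample)
```

Called with no argument it behaves exactly like the existing class, so nothing
that uses csv.Sniffer() today would have to change.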
John> (5) But the doco is not correct, there are 2 non-constructor
John> methods:
Yeah, I already noticed and fixed that. That was easy. ;-)
Skip