[Csv] Sniffer empty delimiter

skip at pobox.com
Thu Dec 29 01:07:50 CET 2005


    Python 2.4.2 (#67, Sep 28 2005, 12:41:11) [MSC v.1310 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import csv
    >>> d = csv.Sniffer().sniff('a|b|c|d|e', ['\t', ','])
    >>> d.delimiter
    ''
    >>> d = csv.Sniffer().sniff('a|b|c|d|e')
    >>> d.delimiter
    'a'

Both of these seem wrong to me at some level.  I tend to agree with you that
if delimiter detection fails it should raise an exception, certainly if the
delimiters argument defines a set of characters from which the actual
delimiter must be chosen (does it?).  The second has to be considered a bug,
doesn't it?

    John> (1) IMHO it should *NEVER* return an alphabetic or numeric
    John>     character as the delimiter.

Probably a good rule of thumb.
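
To make that rule of thumb concrete, here is a minimal sketch of a check a
caller could apply to whatever the sniffer hands back.  The name
check_delimiter is hypothetical and is not part of the csv module; the only
assumption is that an empty or alphanumeric delimiter should be treated as a
failure.

```python
import csv


def check_delimiter(delimiter):
    """Return the delimiter unchanged, or raise csv.Error if it looks bogus.

    Hypothetical helper, not part of the csv module.
    """
    if not delimiter:
        # The sniffer found nothing usable (the '' case above).
        raise csv.Error("could not determine delimiter")
    if delimiter.isalnum():
        # Letters and digits are almost never real delimiters (the 'a' case).
        raise csv.Error("implausible delimiter: %r" % delimiter)
    return delimiter
```

With this in place, both results from the session above ('' and 'a') would
turn into csv.Error instead of being returned silently.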

    John> (2) If there is insufficient sample to determine the dialect's
    John>     attributes, then it shouldn't pluck them out of the air, with
    John>     no indication to the caller that there might be a problem. IOW
    John>     I don't like the "remedies" of "return standard delimiter" and
    John>     "return first delimiter". It should raise csv.Error; the
    John>     discerning caller can then take appropriate action.

If I have a CSV file that happens to have only one column and I'm using the
sniffer (presumably because I have an app that processes somewhat arbitrary
CSV files), I'd hate for it to fail in that one case.  For that case maybe we
can define an optional default arg that is a single character.  If all other
tests fail, the default is returned.
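
A rough sketch of how such a default might behave, written as a wrapper
rather than a change to the Sniffer class itself.  The function name
sniff_delimiter and its default parameter are assumptions made for
illustration; nothing like this exists in the csv module.

```python
import csv


def sniff_delimiter(sample, delimiters=None, default=None):
    """Sniff a delimiter, falling back to `default` if nothing plausible turns up.

    Hypothetical wrapper around csv.Sniffer; `default` is the optional
    single-character fallback discussed above.
    """
    try:
        delimiter = csv.Sniffer().sniff(sample, delimiters).delimiter
    except csv.Error:
        # Treat a sniffer failure the same as an empty result.
        delimiter = ''
    if delimiter and not delimiter.isalnum():
        return delimiter
    if default is not None:
        return default
    raise csv.Error("could not determine delimiter")
```

A caller handling one-column files would pass default=',' and never see an
error for them; a caller who leaves default unset gets csv.Error and can
decide what to do.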

    John> (3) Some documentation on how the 2nd arg is used would be a good
    John>     idea, as would be an explanation of the relationship with the
    John>     undocumented "preferred" attribute:

Agreed.  I seem to recall you're the author.  Got some text? <wink>

    >>> csv.Sniffer().preferred
    [',', '\t', ';', ' ', ':']

    John> (4) Too late to change now, but having a class with no args to its
    John>     constructor and only one other method has a whiff of some
    John>     other language :-)

It's not too late to add an optional preferred arg to the constructor.
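
Something along these lines, sketched here as a subclass so it can be tried
without touching the module.  PreferredSniffer is a hypothetical name; the
only thing it relies on is the existing preferred attribute shown above.

```python
import csv


class PreferredSniffer(csv.Sniffer):
    """Sniffer whose preferred-delimiter list can be set at construction.

    Hypothetical subclass for illustration; not part of the csv module.
    """

    def __init__(self, preferred=None):
        csv.Sniffer.__init__(self)
        if preferred is not None:
            # Copy the caller's sequence so later mutation does not leak in.
            self.preferred = list(preferred)
```

Called with no argument it behaves like the stock Sniffer, so existing code
would not notice the difference.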

    John> (5) But the doco is not correct, there are 2 non-constructor
    John>     methods:

Yeah, I already noticed and fixed that.  That was easy. ;-)

Skip


