[Csv] What's our status?

Cliff Wells LogiplexSoftware at earthlink.net
Wed Feb 26 23:10:48 CET 2003


On Wed, 2003-02-26 at 09:32, Cliff Wells wrote:

> I'm working on csvutils.py right now.  The guessDelimiter() function
> from DSV isn't really the best for our purposes, as it expects a fairly
> fixed number of columns and we're allowing for variable columns per row.
> Also, allowing spaces around delimiters is going to throw off
> guessQuoteChar().  I've got some ideas for fixing guessQuoteChar(), but
> guessDelimiter() is going to need an entirely new approach (which I think
> I have an idea for =)

Okay, here's my status:

1) I can sniff the quotechar.
2) I can sniff the delimiter IF:
    a) there is a quotechar [the delimiter is determined from its
       position relative to the quotechar],
       or
    b) the data is regular, that is, the number of columns doesn't vary
       much from record to record [based on the number of occurrences of
       the delimiter in each record, to grossly simplify things].  This is
       the method DSV uses.
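
For anyone following along, the two cases above can be sketched roughly as
below.  This is only an illustration of the general idea, not the code in
csvutils.py or DSV: the function names sniff_by_quotes() and
sniff_by_frequency(), the candidate delimiter list, and the scoring rule
are all assumptions made for the example.

```python
import re
from collections import Counter


def sniff_by_quotes(sample, quotechars='"\''):
    """Case (a): vote for the character that directly follows a quoted field.

    Returns a (quotechar, delimiter) pair, or None if nothing is quoted.
    """
    votes = Counter()
    for q in quotechars:
        esc = re.escape(q)
        # A quoted field, then one non-word, non-newline, non-quote character.
        pattern = re.compile(esc + r'[^' + esc + r'\n]*' + esc
                             + r'([^\w\n' + esc + r'])')
        for match in pattern.finditer(sample):
            votes[(q, match.group(1))] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


def sniff_by_frequency(sample, candidates=',;\t|:'):
    """Case (b): prefer the character whose per-line count is most consistent.

    Returns (delimiter, score), where score is the fraction of lines that
    share the most common non-zero count.  The candidate list is an
    assumption for this sketch.
    """
    lines = [ln for ln in sample.splitlines() if ln]
    best, best_score = None, 0.0
    for ch in candidates:
        counts = Counter(ln.count(ch) for ln in lines)
        counts.pop(0, None)  # lines without the character tell us nothing
        if not counts:
            continue
        _, n_lines = counts.most_common(1)[0]
        score = n_lines / len(lines)
        if score > best_score:
            best, best_score = ch, score
    return best, best_score


quoted = '"a b","c d","e"\n"f","g h","i"\n'
regular = 'a;b;c\nd;e;f\ng;h;i\n'
print(sniff_by_quotes(quoted))
print(sniff_by_frequency(regular))
```

The frequency version only works when one count dominates, which is
exactly the "regular data" assumption in (b).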

However, I have so far been unable to come up with a way to determine
the delimiter for the following data:

all,work,and,no,play,makes,jack,a,dull,boy
all,work,and,no,play,makes,jack,a,dull
boy
all,work,and,no,play,makes,jack,a
dull,boy
all,work,and,no,play,makes,jack
a,dull,boy
all,work,and,no,play,makes
jack,a,dull,boy
all,work,and,no,play
makes,jack,a,dull,boy
all,work,and,no
play,makes,jack,a,dull,boy
all,work,and
no,play,makes,jack,a,dull,boy
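
To make the problem concrete, here is a quick count of commas per record
for the sample above.  It is just a throwaway check (the SAMPLE name is
made up for the example), but it shows that no single count covers more
than a couple of records, so a consistency-based guess has nothing to
latch onto.

```python
from collections import Counter

SAMPLE = """\
all,work,and,no,play,makes,jack,a,dull,boy
all,work,and,no,play,makes,jack,a,dull
boy
all,work,and,no,play,makes,jack,a
dull,boy
all,work,and,no,play,makes,jack
a,dull,boy
all,work,and,no,play,makes
jack,a,dull,boy
all,work,and,no,play
makes,jack,a,dull,boy
all,work,and,no
play,makes,jack,a,dull,boy
all,work,and
no,play,makes,jack,a,dull,boy
"""

# Number of commas in each record, in file order.
counts = [line.count(",") for line in SAMPLE.splitlines()]

# How many records share each comma count.
histogram = Counter(counts)
_, n_lines = histogram.most_common(1)[0]

print(counts)
print("most common comma count covers %d of %d records" % (n_lines, len(counts)))
```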

Anyone have a suggestion?  All work and no play makes jack a dull boy.


-- 
Cliff Wells, Software Engineer
Logiplex Corporation (www.logiplex.net)
(503) 978-6726 x308  (800) 735-0555 x308


