[Csv] What's our status?
Cliff Wells
LogiplexSoftware at earthlink.net
Wed Feb 26 23:10:48 CET 2003
On Wed, 2003-02-26 at 09:32, Cliff Wells wrote:
> I'm working on csvutils.py right now. The guessDelimiter() function
> from DSV isn't really the best for our purposes as it expects a fairly
> fixed number of columns and we're allowing for variable columns per row.
> Also, allowing spaces around delimiters is going to throw
> guessQuoteChar(). I've got some ideas for fixing guessQuoteChar() but
> guessDelimiter is going to need an entirely new approach (which I think
> I have an idea for =)
Okay, here's my status:
1) I can sniff the quotechar.
2) I can sniff the delimiter IF:
   a) there is a quotechar [determine delimiter based on relation to
      quotechar].
   or
   b) the data is regular, that is, the number of columns doesn't vary
      a lot from record to record [based upon number of occurrences of
      delimiter in each record, to grossly simplify things]. This is
      the method DSV uses.
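
The two cases above can be sketched roughly as follows. This is only an illustration, not the code in csvutils.py or DSV: the function names, the candidate list, and the 0.9 consistency threshold are all assumptions made for the example.

```python
import re
from collections import Counter

CANDIDATES = ",;\t|:"  # assumed candidate delimiters, purely illustrative


def guess_delimiter_from_quotes(text, quotechar='"', candidates=CANDIDATES):
    """Case (a): vote for candidate characters adjacent to a quoted field."""
    q = re.escape(quotechar)
    c = "[" + re.escape(candidates) + "]"
    # A quoted field that does not span lines, e.g. "b c".
    quoted = q + "[^" + q + r"\n]*" + q
    # Candidate immediately after a quoted field ...
    votes = Counter(re.findall(quoted + "(" + c + ")", text))
    # ... and candidate immediately before one.
    votes.update(re.findall("(" + c + ")" + quoted, text))
    return votes.most_common(1)[0][0] if votes else None


def guess_delimiter_by_frequency(lines, candidates=CANDIDATES, threshold=0.9):
    """Case (b): pick the character whose per-line count is most consistent."""
    best, best_score = None, 0.0
    for ch in candidates:
        counts = Counter(line.count(ch) for line in lines)
        counts.pop(0, None)  # lines without the character prove nothing
        if not counts:
            continue
        # Fraction of all lines that share the most common non-zero count.
        score = counts.most_common(1)[0][1] / len(lines)
        if score > best_score:
            best, best_score = ch, score
    # Only trust the guess when nearly every line agrees (assumed threshold).
    return best if best_score >= threshold else None
```

For example, guess_delimiter_by_frequency(["a,b,c", "d,e,f", "g,h,i"]) picks the comma because every line contains exactly two of them, while input whose comma count differs on every line yields None.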
However, for the following I am so far unable to come up with a way to
determine the delimiter:
all,work,and,no,play,makes,jack,a,dull,boy
all,work,and,no,play,makes,jack,a,dull
boy
all,work,and,no,play,makes,jack,a
dull,boy
all,work,and,no,play,makes,jack
a,dull,boy
all,work,and,no,play,makes
jack,a,dull,boy
all,work,and,no,play
makes,jack,a,dull,boy
all,work,and,no
play,makes,jack,a,dull,boy
all,work,and
no,play,makes,jack,a,dull,boy
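
To make the problem concrete, here is a quick count of commas per line for the sample above. It assumes each physical line is treated as one record, which is the only view a sniffer has before it knows the format; the variable names are just for illustration.

```python
from collections import Counter

sample = """\
all,work,and,no,play,makes,jack,a,dull,boy
all,work,and,no,play,makes,jack,a,dull
boy
all,work,and,no,play,makes,jack,a
dull,boy
all,work,and,no,play,makes,jack
a,dull,boy
all,work,and,no,play,makes
jack,a,dull,boy
all,work,and,no,play
makes,jack,a,dull,boy
all,work,and,no
play,makes,jack,a,dull,boy
all,work,and
no,play,makes,jack,a,dull,boy
"""

lines = sample.splitlines()

# Number of commas on each physical line, in order.
comma_counts = [line.count(",") for line in lines]

# How many lines share each comma count.
histogram = Counter(comma_counts)

# Largest number of lines that agree on any single count.
best_agreement = max(histogram.values())

print(comma_counts)
print(best_agreement, "of", len(lines), "lines agree on a count")
```

No single count covers more than 2 of the 15 lines, so a test that looks for a consistent number of delimiters per record has nothing to latch onto here.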
Anyone have a suggestion? All work and no play makes jack a dull boy.
--
Cliff Wells, Software Engineer
Logiplex Corporation (www.logiplex.net)
(503) 978-6726 x308 (800) 735-0555 x308