[Csv] Devil in the details, including the small one between delimiters and quotechars

Skip Montanaro skip at pobox.com
Wed Jan 29 18:31:31 CET 2003


    Cliff> Now consider

    Cliff> 1,"not quoted" ,"quoted"

    Cliff> Is the second field quoted or not?  If it is, do we discard the
    Cliff> extraneous whitespace following it or raise an exception?

Well, there's always the "be flexible in what you accept, strict in what
you generate" school of thought.  In the above, that would suggest the
list returned would be

    ['1', 'not quoted', 'quoted']

It seems like a minor formatting glitch.  How about a warning?  Or a "strict"
flag for the parser?
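
As a rough illustration of the "strict" flag idea, here is a minimal sketch
using the csv module from the present-day Python standard library. That is
an assumption on my part: the module and its `strict` dialect parameter
postdate this message, so this shows one way the question was eventually
answered, not what was agreed in this thread.

```python
import csv

line = '1,"not quoted" ,"quoted"'

# Lenient (default) parsing: the stray space after the closing quote
# is accepted and appended to the field instead of being rejected.
lenient = next(csv.reader([line]))
print(lenient)

# Strict parsing: any character other than a delimiter or line end
# after the closing quote raises csv.Error.
try:
    next(csv.reader([line], strict=True))
    strict_error = None
except csv.Error as exc:
    strict_error = exc
print("strict mode raised:", strict_error)
```

Note that the lenient result keeps the trailing space inside the second
field ('not quoted ') rather than discarding it, so it is not quite the
list suggested above.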

    Cliff> Worse, consider this

    Cliff> "quoted", "not quoted, but this ""field"" has delimiters and quotes"

That depends on the setting of skipinitialspace.  If false, you get

    ['quoted', ' "not quoted', ' but this ""field"" has delimiters and quotes"']

If true, I think you get

    ['quoted', 'not quoted, but this "field" has delimiters and quotes']
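
Both readings can be checked with a short sketch against the present-day
csv module, assuming its `skipinitialspace` parameter corresponds to the
setting discussed here (the module as shipped postdates this message).

```python
import csv

line = '"quoted", "not quoted, but this ""field"" has delimiters and quotes"'

# Without skipping, the space after the delimiter starts an unquoted
# field, so the quote characters that follow are kept literally and the
# embedded comma splits the text into two fields.
as_is = next(csv.reader([line], skipinitialspace=False))
print(as_is)

# With skipping, the space is dropped, the field is parsed as quoted,
# and the doubled quotes collapse to single ones.
skipped = next(csv.reader([line], skipinitialspace=True))
print(skipped)
```

In this sketch the first call yields three fields and the second yields
two, matching the two lists above.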

    Cliff> How should this parse?  I say free exceptions for everyone.

    Cliff> While we're on the topic, I heard back from my DSV user who had
    Cliff> mentioned this corner case of spaces between delimiters and
    Cliff> quotes and he admitted that the files were created by hand, by
    Cliff> him (figures), he seems to recall some now forgotten application
    Cliff> that may have done this but wasn't sure.  His memory was vague on
    Cliff> whether he saw it on a PC or in a barn eating hay.

Don't you just love customers with concrete requirements? ;-)

    Cliff> I propose space between delimiters and quotes raise an exception
    Cliff> and let's be done with it.  I don't think this really affects
    Cliff> Excel compatibility since Excel will never generate this type of
    Cliff> file and doesn't require it for import.  It's true that some
    Cliff> files that Excel would import (probably incorrectly) won't import
    Cliff> in CSV, but I think that's outside the scope of Excel
    Cliff> compatibility.

Sounds good to me.

    Cliff> Anyway, I know no one has said "On your mark, get set" yet, but I
    Cliff> can't think without code sitting in front of me, breaking worse
    Cliff> with every keystroke, so in addition to creating some test cases,
    Cliff> I've hacked up a very preliminary CSV module so we have something
    Cliff> to play with.  I was up til 6am so if there's anything odd, I
    Cliff> blame it on lack of sleep and the feverish optimism and glossing
    Cliff> of detail that comes with it.

Perhaps you and Dave were in a race but didn't know it? ;-)

Skip
_______________________________________________
Csv mailing list
Csv at mail.mojam.com
http://manatee.mojam.com/mailman/listinfo/csv


