[Csv] Devil in the details, including the small one between delimiters and quotechars
Skip Montanaro
skip at pobox.com
Wed Jan 29 18:31:31 CET 2003
Cliff> Now consider
Cliff> 1,"not quoted" ,"quoted"
Cliff> Is the second field quoted or not? If it is, do we discard the
Cliff> extraneous whitespace following it or raise an exception?
Well, there's always the "be flexible in what you accept, strict in what
you generate" school of thought. In the above, that would suggest the
list returned would be
['1', 'not quoted', 'quoted']
It seems like a minor formatting glitch. How about a warning? Or a "strict"
flag for the parser?
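For reference, here is a minimal sketch of how this line behaves in the csv module that later shipped in Python's standard library. That module is an assumption here: it is a later implementation, not something settled in this thread. Its reader accepts a `strict` flag. In the default lenient mode the stray space is kept as part of the field rather than discarded, which differs slightly from the list suggested above.

```python
import csv

line = '1,"not quoted" ,"quoted"'

# Default (lenient) parsing: the space after the closing quote is
# appended to the field instead of being dropped.
lenient = next(csv.reader([line]))
print(lenient)

# strict=True rejects the same input with csv.Error.
try:
    next(csv.reader([line], strict=True))
    strict_error = None
except csv.Error as exc:
    strict_error = exc
print("strict mode raised:", strict_error)
```

The lenient result keeps the trailing space ('not quoted '), so a caller who wants the cleaned-up value still has to strip it.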
Cliff> Worse, consider this
Cliff> "quoted", "not quoted, but this ""field"" has delimiters and quotes"
Depends on the setting of skipinitialspaces. If False, you get
['quoted', ' "not quoted', ' but this ""field"" has delimiters and quotes"']
If True, I think you get
['quoted', 'not quoted, but this "field" has delimiters and quotes']
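The two outcomes can be checked against the csv module in today's Python standard library, where the option is spelled `skipinitialspace`. This is a sketch against that later implementation, not against the code discussed in this thread.

```python
import csv

line = '"quoted", "not quoted, but this ""field"" has delimiters and quotes"'

# Without skipinitialspace the leading space makes the second field
# unquoted, so its quotes are literal and the inner comma splits it.
keep_space = next(csv.reader([line]))

# With skipinitialspace=True the space after the delimiter is ignored
# and the field is parsed as a normal quoted field.
skip_space = next(csv.reader([line], skipinitialspace=True))

print(keep_space)
print(skip_space)
```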
Cliff> How should this parse? I say free exceptions for everyone.
Cliff> While we're on the topic, I heard back from my DSV user who had
Cliff> mentioned this corner case of spaces between delimiters and
Cliff> quotes and he admitted that the files were created by hand, by
Cliff> him (figures), he seems to recall some now forgotten application
Cliff> that may have done this but wasn't sure. His memory was vague on
Cliff> whether he saw it on a PC or in a barn eating hay.
Don't you just love customers with concrete requirements? ;-)
Cliff> I propose space between delimiters and quotes raise an exception
Cliff> and let's be done with it. I don't think this really affects
Cliff> Excel compatibility since Excel will never generate this type of
Cliff> file and doesn't require it for import. It's true that some
Cliff> files that Excel would import (probably incorrectly) won't import
Cliff> in CSV, but I think that's outside the scope of Excel
Cliff> compatibility.
Sounds good to me.
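A rough sketch of what "raise an exception" could look like as a pre-check on a single line. The names `find_stray_space` and `check_line` are hypothetical and belong to no existing module. The sketch assumes one record per line, a single-character delimiter and quotechar, and doubled quotes as the escape.

```python
def find_stray_space(line, delimiter=",", quotechar='"'):
    """Return the index of a space sitting between a delimiter and a
    quote (on either side of a quoted field), or -1 if there is none."""
    in_quotes = False
    at_field_start = True      # no non-space character seen in this field yet
    leading_space = -1         # index of the first leading space, if any
    i = 0
    while i < len(line):
        c = line[i]
        if in_quotes:
            if c == quotechar:
                if line[i + 1:i + 2] == quotechar:
                    i += 1     # doubled quote: stay inside the field
                else:
                    in_quotes = False
                    rest = line[i + 1:]
                    stripped = rest.lstrip(" ")
                    # closing quote, then spaces, then a delimiter
                    if len(stripped) < len(rest) and stripped[:1] == delimiter:
                        return i + 1
        elif c == delimiter:
            at_field_start = True
            leading_space = -1
        elif at_field_start:
            if c == " ":
                if leading_space < 0:
                    leading_space = i
            else:
                if c == quotechar:
                    if leading_space >= 0:
                        return leading_space   # delimiter, spaces, then quote
                    in_quotes = True
                at_field_start = False
        i += 1
    return -1


def check_line(line):
    """Raise ValueError if the line has space between a delimiter and a quote."""
    pos = find_stray_space(line)
    if pos >= 0:
        raise ValueError("space between delimiter and quote at column %d" % pos)
    return line
```

Both of Cliff's examples are rejected by this check, while a line such as `1,"a, ""b""",c` passes, because spaces inside a quoted field are not touched.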
Cliff> Anyway, I know no one has said "On your mark, get set" yet, but I
Cliff> can't think without code sitting in front of me, breaking worse
Cliff> with every keystroke, so in addition to creating some test cases,
Cliff> I've hacked up a very preliminary CSV module so we have something
Cliff> to play with. I was up til 6am so if there's anything odd, I
Cliff> blame it on lack of sleep and the feverish optimism and glossing
Cliff> of detail that comes with it.
Perhaps you and Dave were in a race but didn't know it? ;-)
Skip
_______________________________________________
Csv mailing list
Csv at mail.mojam.com
http://manatee.mojam.com/mailman/listinfo/csv