DSVWizard.py
Cliff Wells
LogiplexSoftware at earthlink.net
Mon Jan 27 21:05:49 CET 2003
On Sun, 2003-01-26 at 21:08, Dave Cole wrote:
> > > DSV: ['Test 1', 'Fred said "hey!", and left the room', '']
> > > csv: ['Test 1', ' "Fred said ""hey!""', ' and left the room"', ' ""']
> >
> > IMO, Dave's is incorrect in this one (unless he has specific reasons
> > otherwise).
>
> Andrew (who has been included on the Cc) has tested the behaviour of
> Excel (such as it is) and we do the same thing as Excel. As to
> whether Excel is doing the right thing, that is a different question
> entirely.
Okay. So the default behavior would be to *not* treat the quotes as
text qualifiers in the following:
data, "data", data
unless the user specifies otherwise.
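
To make that concrete, here is a minimal sketch. It assumes the csv module as it ships in the current Python standard library, which may not behave the same as the version under discussion here. The skipinitialspace parameter stands in for "the user specifies otherwise".

```python
import csv

line = 'data, "data", data'

# Default: the space after the comma starts an unquoted field,
# so the quotes that follow are kept as literal characters.
default_fields = next(csv.reader([line]))

# Opt-in: skipinitialspace=True drops the space after each delimiter,
# so the quote now opens the field and acts as a text qualifier.
qualified_fields = next(csv.reader([line], skipinitialspace=True))

print(default_fields)    # ['data', ' "data"', ' data']
print(qualified_fields)  # ['data', 'data', 'data']
```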
> One of the people we have done work for has some very nasty "CSV" data
> to parse. We have been trying to work out what to do to the CSV
> module to handle some of the silly things he sees without breaking the
> Excel compatibility.
Having "variants" as Skip mentioned (and I think you did as well) would
solve this.
I'm also a bit curious as to the "Treat consecutive delimiters as one"
option in Excel. I had planned to add support for that in DSV but never
got around to it. Does csv have such an option? Is this really ever
useful? I've never had anyone request that I enable that option in DSV,
despite the fact that there's even a checkbox (disabled) for it in the
GUI.
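
For what it's worth, here is a rough sketch of what such an option might do. As far as I know, the csv module in the current Python standard library has no equivalent parameter. The helper name split_merging_delimiters is made up for illustration, and it ignores quoting entirely, so it is not a substitute for a real parser.

```python
import re

def split_merging_delimiters(line, delimiter=","):
    """Split line on runs of delimiter, treating each run as one separator.

    Hypothetical helper for illustration only; quoted fields are not handled.
    """
    return re.split(re.escape(delimiter) + "+", line)

print(split_merging_delimiters("a,,b,,,c"))                  # ['a', 'b', 'c']
print(split_merging_delimiters("name    size   date", " "))  # ['name', 'size', 'date']
```

The second call hints at where the option might earn its keep: space-padded columns, where a run of delimiters is padding rather than a series of empty fields. With commas it silently discards empty fields, which may explain why nobody has asked for it.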
>
> > The original line (from the csv file) is:
> >
> > Test 1, "Fred said ""hey!"", and left the room", ""
> >
> > The "" at the end is an empty, quoted field. Maybe someone should
> > run this through Excel to see what it claims (I'd be willing to
> > accept Dave's interpretation if Excel does it this way, although I'd
> > still feel it was incorrect). I handled this case specifically at a
> > user's request.
>
> Andrew, can you run that exact line through Excel?
>
> > > 10
> > > DSV: ['Test 9', 'no spaces around this', ' but single spaces around this ']
> > > csv: ['Test 9', ' "no spaces around this" ', ' but single spaces around this ']
> > > 12
> > > DSV: ['Test 11', 'has no spaces around anything', 'because the data is quoted']
> > > csv: [' "Test 11" ', ' "has no spaces around anything" ', ' "because the data is quoted" ']
> > >
> > > All the three lines have white space immediately following
> > > separating commas. DSV appears to skip over this white space,
> > > while csv treats it as part of the field contents.
>
> I am fairly sure that is what Excel does.
You're probably correct, but I'd like to be 100% certain on this.
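
Until someone confirms it with Excel, the two interpretations can at least be written down as a pair of "variants". The sketch below again assumes the csv module from the current Python standard library. The dialect name "skip_space" is invented for this example and is not part of any library.

```python
import csv

# The original line from the test file, as quoted above.
line = 'Test 1, "Fred said ""hey!"", and left the room", ""'

# "skip_space" is a made-up dialect name for this sketch: it differs
# from the default dialect only in ignoring whitespace after a delimiter.
csv.register_dialect("skip_space", skipinitialspace=True)

# Default dialect: whitespace after the comma is part of the field,
# so the quotes never get a chance to act as text qualifiers.
excel_like = next(csv.reader([line]))

# Variant: whitespace after the comma is skipped, so quoted fields
# are recognised and the doubled quotes collapse to single ones.
dsv_like = next(csv.reader([line], dialect="skip_space"))

print(excel_like)  # ['Test 1', ' "Fred said ""hey!""', ' and left the room"', ' ""']
print(dsv_like)    # ['Test 1', 'Fred said "hey!", and left the room', '']
```

The first result matches the csv output quoted at the top of this message and the second matches the DSV output, so the whole disagreement comes down to how whitespace after the separator is treated.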
> Pity there is no real specification for CSV.
Actually, it's only the V part of CSV that's poorly defined <wink>.
Maybe CSV should stand for "comma separated vagueness".
Speaking of names, since Kevin is correct in that people will look for
CSV since that's the common term, we could just define C to stand for
"character" rather than "comma", since this will be a general-purpose
importer.
--
Cliff Wells, Software Engineer
Logiplex Corporation (www.logiplex.net)
(503) 978-6726 x308 (800) 735-0555 x308