First Cut at CSV PEP
Cliff Wells
LogiplexSoftware at earthlink.net
Tue Jan 28 23:21:29 CET 2003
On Tue, 2003-01-28 at 13:55, Skip Montanaro wrote:
> Kevin> Probably need to specify that input and output deals with string
> Kevin> representations, but there are some differences:
>
> Kevin> [[5,'Bob',None,1.0]]
>
> Kevin> DSV.exportCSV produces
>
> Kevin> '5,Bob,None,1.0'
>
> I'm not so sure this mapping None to "None" on output is such a good idea
Not unless bugs are good ideas ;) Apparently the export stuff in DSV
isn't very widely used, since this went unnoticed. It is incorrect behavior.
> because it's not reversible in all situations and hurts portability to other
> systems (e.g., does Excel have a concept of None? what happens if you have a
> text field which just happens to contain "None"?). I think we need to limit
> the data which can be output to strings, Unicode strings (if we use an
> encoded stream), floats and ints. Anything else should raise TypeError.
Or be converted to a reasonable string alternative, e.g. None -> ''
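
The two policies discussed here (raise TypeError for unsupported types, or map None to an empty field) can be sketched as below. This is only an illustration, not DSV's code: it uses the csv module that later shipped in Python's standard library, and the names `to_field` and `export_row` are hypothetical.

```python
import csv
import io

def to_field(value, none_as=""):
    """Return a value that is safe to export, or raise TypeError.

    Hypothetical helper: None becomes `none_as` (empty by default);
    strings, ints and floats pass through; anything else is rejected.
    """
    if value is None:
        return none_as
    if isinstance(value, (str, int, float)):
        return value
    raise TypeError("cannot export %r as a CSV field" % (value,))

def export_row(row):
    """Serialize one row to CSV text with \r\n line endings."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\r\n")
    writer.writerow([to_field(v) for v in row])
    return buf.getvalue()

print(repr(export_row([5, "Bob", None, 1.0])))
```

With this mapping the row [5, 'Bob', None, 1.0] comes out with an empty third field rather than the literal text None, so a text field that really contains "None" stays distinguishable.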
> Kevin> I'm still not sure about some of the database CSV handling
> Kevin> issues, often it seems they want a string field to be quoted
> Kevin> regardless of whether it contains a comma or newlines, but number
> Kevin> and empty field should not be quoted. It is certainly nice to be
> Kevin> able to import a file that contains
>
> Kevin> 5,"Bob",,1.0\r\n
>
> Kevin> and not need to do any further translation. Excel appears to
> Kevin> interpret quoted numbers and unquoted numbers as numeric fields
> Kevin> when importing.
>
> I like my CSV files to be fully quoted (even fields which may contain
> numbers), largely because it makes later (dangerous) matching using regular
> expressions simpler. Otherwise I wind up having to make all the quotes in
> the regular expressions optional. It just complicates things.
Excel only quotes when necessary during export. However, it doesn't
care on import which style is used. Allowing the programmer to specify
the style in this regard would be a good thing.
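
As a rough sketch of what a programmer-selectable quoting style could look like, the standard library's csv module (which postdates this thread) exposes quoting constants on the writer. The helper names `dump` and `load` are made up for the example.

```python
import csv
import io

def dump(row, quoting):
    """Serialize one row using the given csv quoting constant."""
    buf = io.StringIO()
    csv.writer(buf, quoting=quoting, lineterminator="\r\n").writerow(row)
    return buf.getvalue()

def load(text):
    """Parse CSV text back into a list of rows (all fields as strings)."""
    return list(csv.reader(io.StringIO(text, newline="")))

row = [5, "Bob", 1.0]
minimal = dump(row, csv.QUOTE_MINIMAL)        # quote only when required
quote_all = dump(row, csv.QUOTE_ALL)          # quote every field
nonnumeric = dump(row, csv.QUOTE_NONNUMERIC)  # quote strings, not numbers

for text in (minimal, quote_all, nonnumeric):
    print(repr(text))

# On import the style does not matter: both parse to the same fields.
print(load(minimal) == load(quote_all))
```

QUOTE_MINIMAL corresponds to the Excel export behavior described above, QUOTE_ALL to the fully quoted style, and QUOTE_NONNUMERIC to the database-style convention of quoting strings but not numbers.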
> Kevin> Just trying to be anal-retentive here to make sure all the issues
> Kevin> are covered ;-)
>
> I hear ya.
>
> I just did a little fiddling in Excel 2000 with some simple values. When I
> save as CSV, it doesn't give me the option to change the delimiter or quote
> character. Nor could I figure out how to embed a newline in a cell. It
> certainly doesn't seem as flexible as Gnumeric in this regard. Can someone
> provide me with some hints?
Don't save as CSV; save as TSV, which is the same, but with tabs rather
than commas. I don't know whether it allows specifying the quote
character.
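
For comparison, here is a small sketch of the same two knobs (delimiter and quote character) on the Python side, using the standard library csv module rather than anything Excel offers. The function name `dump_tsv` and the sample data are assumptions for illustration.

```python
import csv
import io

def dump_tsv(row, quotechar='"'):
    """Serialize one row as tab-separated text with a chosen quote character."""
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        delimiter="\t",          # tabs instead of commas
        quotechar=quotechar,     # used only where a field needs quoting
        lineterminator="\r\n",
    )
    writer.writerow(row)
    return buf.getvalue()

print(repr(dump_tsv([5, "Bob", 1.0])))
# A field containing the delimiter is quoted with the chosen character.
print(repr(dump_tsv([5, "Bob\tSmith", 1.0], quotechar="'")))
```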
IIRC, you can embed a newline in a cell by entering " in a cell to mark
it as a string value; I think you can then just hit enter (or
perhaps ctrl+enter).
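
Whatever the keystroke, a cell with an embedded newline ends up in the file as a quoted field that spans two physical lines. A minimal sketch of reading such a field with the standard library csv module follows; the sample data is invented.

```python
import csv
import io

# One logical record; the quoted second field contains a newline.
data = '5,"Bob\nSmith",,1.0\r\n'

# newline="" leaves line endings untouched so the reader can handle them.
rows = list(csv.reader(io.StringIO(data, newline="")))
print(rows)
```

The reader returns a single row of four fields, with the newline preserved inside the second field and the empty third field read as an empty string.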
--
Cliff Wells, Software Engineer
Logiplex Corporation (www.logiplex.net)
(503) 978-6726 x308 (800) 735-0555 x308