[Csv] Re: PEP 305 - CSV File API
Skip Montanaro
skip at pobox.com
Tue Feb 4 05:00:20 CET 2003
Carlos> The problem is, almost all my intermediate files have both
Carlos> 'date' and 'float' columns. This is highly common in business,
Carlos> especially if you are looking at sales figures and stuff like
Carlos> that.
Carlos> To compound my problem, Python writes floats with a period (.)
Carlos> as a decimal separator. However, my copy of Excel is configured
Carlos> for the Brazilian locale, and it expects a comma (,) as the
Carlos> decimal separator.
Can't you simply set the locale in your scripts so Python and Excel agree?
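
A minimal sketch of what setting the locale could look like, using the standard `locale` module. The helper name `format_float_for_locale`, the locale name `pt_BR.UTF-8`, and the four-digit precision are assumptions chosen for illustration; whether that locale is installed depends on the system, so the sketch falls back to swapping the separator by hand.

```python
import locale


def format_float_for_locale(value, locale_name="pt_BR.UTF-8", digits=4):
    """Format a float with the decimal separator of the given locale.

    Hypothetical helper: the locale name is an assumption and may not be
    installed on every system.
    """
    saved = locale.setlocale(locale.LC_NUMERIC)  # remember the current setting
    try:
        locale.setlocale(locale.LC_NUMERIC, locale_name)
        return locale.format_string(f"%.{digits}f", value)
    except locale.Error:
        # Locale not available: fall back to a comma, which is what a
        # Brazilian locale would use as the decimal separator.
        return f"{value:.{digits}f}".replace(".", ",")
    finally:
        locale.setlocale(locale.LC_NUMERIC, saved)  # restore the old setting


formatted = format_float_for_locale(3.1416)
```

The value is formatted to a string before it reaches the CSV writer, so this only helps if the delimiter is something other than a comma.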
Carlos> Now for the real issue. If I convert my floats to strings
Carlos> *before* writing the CSV file, it will end up quoted (for
Carlos> example, '3,1416') - assuming that the CSV library will work as
Carlos> Skip said. This is not what I would expect, and in fact, it's
Carlos> not what anyone working with different locale settings would
Carlos> say.
It would only be quoted if you had comma as the delimiter or had set the
quoting parameter to QUOTE_ALWAYS. What delimiter do you use in your CSV
files?
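
The quoting rule can be checked with a short sketch against the `csv` module as it exists in the standard library today, where the constant is spelled `csv.QUOTE_ALL`. The helper `write_row` and the sample row are assumptions made up for the example.

```python
import csv
import io


def write_row(row, **fmt):
    """Write one row to an in-memory buffer and return the resulting text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n", **fmt).writerow(row)
    return buf.getvalue()


row = ["2003-02-04", "3,1416"]

# Semicolon delimiter, default (minimal) quoting: the comma is left alone.
semicolon = write_row(row, delimiter=";")

# Comma delimiter: the field containing a comma has to be quoted.
comma = write_row(row, delimiter=",")

# Quote-everything mode: every field is quoted regardless of content.
quote_all = write_row(row, delimiter=";", quoting=csv.QUOTE_ALL)
```

With a semicolon delimiter and default quoting the float string is written bare; it is quoted only in the other two cases.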
Carlos> Last, even if Python just wrote floats with the 'right' decimal
Carlos> separator - comma, in my case - there still would be other
Carlos> software packages that would expect to get periods.
How would you like us to handle this? Sounds like a case of being "damned if
we do, damned if we don't".
Carlos> Or worse, I could try to send my data files to people in other
Carlos> countries that would be unable to read it. In any event, there
Carlos> is no automatic solution, but the ability to quickly adjust the
Carlos> CSV library to get the correct behavior would be highly useful.
We have to come back to the fundamental issue that CSV files as commonly
understood contain no data type information. It's possible that type
information could be passed in during write operations which would govern
the way the data is formatted when written. (We've discussed it, but it's
not likely to be in the first release.)
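
One way the idea of passing formatting information at write time might look is sketched below. This is not an API of the csv module: `write_typed_rows`, `comma_float`, and the per-column `formatters` list are hypothetical names invented for this illustration, layered on top of the ordinary `csv.writer`.

```python
import csv
import io
from datetime import date


def comma_float(value):
    """Hypothetical formatter: four decimals with a comma as separator."""
    return f"{value:.4f}".replace(".", ",")


def write_typed_rows(fileobj, rows, formatters, **fmt):
    """Apply one formatter per column, then hand the strings to csv.writer."""
    writer = csv.writer(fileobj, **fmt)
    for row in rows:
        writer.writerow([fmt_one(v) for fmt_one, v in zip(formatters, row)])


buf = io.StringIO()
write_typed_rows(
    buf,
    [(date(2003, 2, 4), 3.1416)],
    formatters=[date.isoformat, comma_float],  # column 0: date, column 1: float
    delimiter=";",
    lineterminator="\n",
)
typed_output = buf.getvalue()
```

The formatting decision stays with the caller, who knows both the column types and the target locale; the writer itself still sees only strings.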
Even if we solve the formatting issue, once the data is written out to the
file, if you ship it out of your locale, no information remains in the file
to indicate that 3,1416 is a number instead of a string containing digits
and a comma. Similarly, if you choose to write dates out in an ambiguous
format, at the receiving end, the reader won't be able to tell what date
"02/03/03" represents.
Skip
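
The loss of type information described above can be illustrated with a small sketch using the standard `csv` and `datetime` modules. The sample line and the three candidate date formats are assumptions picked for the example.

```python
import csv
import io
from datetime import datetime

# A one-line file as it might arrive from another locale.
raw = io.StringIO("02/03/03;3,1416\n")
date_text, number_text = next(csv.reader(raw, delimiter=";"))

# The reader hands back plain strings; nothing marks the second field
# as a number rather than text containing digits and a comma.
number_type = type(number_text)

# The same date text parses cleanly under several conventions,
# each giving a different calendar date.
candidates = {
    fmt: datetime.strptime(date_text, fmt).date()
    for fmt in ("%m/%d/%y", "%d/%m/%y", "%y/%m/%d")
}
```

All three parses succeed, so the receiving end cannot pick the right one without knowing the convention used by the sender.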