[ python-Bugs-1532483 ] the csv module writes files that Excel sees as SYLK files
SourceForge.net
noreply at sourceforge.net
Wed Aug 2 15:28:58 CEST 2006
Bugs item #1532483, was opened at 2006-08-01 09:52
Message generated for change (Settings changed) made by montanaro
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1532483&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: None
>Status: Pending
Resolution: None
Priority: 5
Submitted By: Vincent Povirk (madewokherd)
Assigned to: Skip Montanaro (montanaro)
Summary: the csv module writes files that Excel sees as SYLK files
Initial Comment:
I'm using Python version 2.4.3.
Apparently, when Excel 2003 reads a file, it looks for
the identifying string "ID" at the beginning of the
file. If it finds this string, it assumes it's reading
a SYLK file (see
http://netghost.narod.ru/gff/graphics/summary/micsylk.htm
for some information on SYLK).
The csv module will generate a file that starts with ID
if the first field it writes starts with ID and does
not need to be quoted. When Excel tries to open the
file, the following message pops up:
"Excel has detected that 'test.csv' is a SYLK file, but
cannot load it. Either the file has errors or it is not
a SYLK file format. Click OK to try to open the file in
a different format."
Excel can read the file after clicking OK. Excel
actually has the same problem with CSV files it has
written.
Even so, when using the 'excel' dialect, csv should
write files that Excel can open without any problems.
It could do this by quoting the first field in the file
if it begins with "ID". Unfortunately, csv's Dialect
class does not make this possible. I'm currently
working around it by using QUOTE_NONNUMERIC.
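The condition described above can be reproduced without Excel. The following is a minimal sketch, assuming a modern Python 3 interpreter (the report concerns 2.4.3) and a hypothetical helper named csv_text that is not part of the csv module. It shows that the default 'excel' dialect leaves a plain first field unquoted, so the output begins with the letters ID, while a first field that needs quoting begins with a quote character instead.

```python
import csv
import io


def csv_text(rows, **fmtparams):
    """Hypothetical helper: render rows with the 'excel' dialect as a string."""
    buf = io.StringIO()
    csv.writer(buf, dialect="excel", **fmtparams).writerows(rows)
    return buf.getvalue()


# The first field is plain text, so QUOTE_MINIMAL (the default) leaves it bare.
plain = csv_text([["ID", "FOO"], [1, "bar"]])
print(repr(plain))                # 'ID,FOO\r\n1,bar\r\n'
print(plain.startswith("ID"))     # True: the pattern Excel 2003 reportedly trips on

# A first field containing the delimiter must be quoted, so the file starts with '"'.
quoted = csv_text([["ID,x", "FOO"]])
print(repr(quoted))               # '"ID,x",FOO\r\n'
print(quoted.startswith("ID"))    # False
```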
----------------------------------------------------------------------
>Comment By: Skip Montanaro (montanaro)
Date: 2006-08-02 08:28
Message:
Logged In: YES
user_id=44345
Vincent,
A simple workaround would be to define a fully quoting dialect:
import csv
class quoted_excel(csv.excel):
    quoting = csv.QUOTE_NONNUMERIC  # or csv.QUOTE_ALL
That would cause your generated CSV files to start with a
quote character, e.g.:
"ID","FOO"
1,"bar"
2,"bAz"
Try that and see if it makes Excel 2003 happy.
Skip
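For reference, here is a short sketch of the quoting dialect in use, assuming Python 3 and an in-memory buffer in place of a real file. The sample rows are the ones from the example output above. Whether this satisfies Excel 2003 is not something the sketch can check.

```python
import csv
import io


class quoted_excel(csv.excel):
    # Quote every non-numeric field, so a leading text field always
    # starts with a quote character rather than the letters ID.
    quoting = csv.QUOTE_NONNUMERIC


buf = io.StringIO()
writer = csv.writer(buf, dialect=quoted_excel)
writer.writerows([["ID", "FOO"], [1, "bar"], [2, "bAz"]])

output = buf.getvalue()
print(output)
```

When writing to a real file in Python 3, open it with newline="" so the writer's own line terminator is preserved.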
----------------------------------------------------------------------
Comment By: Vincent Povirk (madewokherd)
Date: 2006-08-01 13:01
Message:
Logged In: YES
user_id=553355
Thanks for your response.
Yes, it's definitely a bug in Excel 2003 (as it also
complains about files it saved). I do not have a later
version of Excel to test.
Microsoft has a page about this issue that seems to say 2003
is the last version with that problem:
http://support.microsoft.com/kb/323626/
Their solution is worse than the problem. I'd be interested
in seeing how a later version behaves.
I know that if the first cell is quoted, Excel will open it
without complaining. I think the best solution would be to
quote the first cell if it starts with ID by introducing a
new QUOTE_ constant. I don't know how that part of the code
works (I'm too lazy to read things that aren't written in
Python); maybe it's more reasonable to quote any field
starting with ID. I don't know of any other uses that would
break, but I'm not in touch with many csv users.
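The "quote the first cell if it starts with ID" idea can be approximated outside the csv module. The following is only a sketch, assuming Python 3, the default 'excel' dialect (comma delimiter, "\r\n" line terminator, QUOTE_MINIMAL), and a hypothetical function name write_rows_sylk_safe. It is not a proposed patch and does not add a new QUOTE_ constant.

```python
import csv
import io


def write_rows_sylk_safe(out, rows):
    """Write rows with the 'excel' dialect, quoting the first field of the
    output when it would otherwise begin with the letters ID."""
    writer = csv.writer(out, dialect="excel")
    for index, row in enumerate(rows):
        if index != 0:
            writer.writerow(row)
            continue
        # Render the first row separately so its text can be inspected.
        buf = io.StringIO()
        csv.writer(buf, dialect="excel").writerow(row)
        text = buf.getvalue()
        if text.startswith("ID"):
            # The field was left unquoted, so it holds no comma, quote, or
            # newline: it ends at the first comma or at the line terminator.
            end = text.find(",")
            if end == -1:
                end = len(text) - len("\r\n")
            text = '"' + text[:end] + '"' + text[end:]
        out.write(text)


out = io.StringIO()
write_rows_sylk_safe(out, [["ID", "FOO"], [1, "bar"], [2, "bAz"]])
print(repr(out.getvalue()))   # '"ID",FOO\r\n1,bar\r\n2,bAz\r\n'
```

Only the first field of the first row is touched, so every other field keeps the dialect's normal minimal quoting.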
----------------------------------------------------------------------
Comment By: Skip Montanaro (montanaro)
Date: 2006-08-01 11:22
Message:
Logged In: YES
user_id=44345
Seems like a shortcoming in Excel 2003 to me, not a problem
with the csv module. Still, if you can suggest a change
that won't break many other uses of the csv module's output,
I'll consider it.
Have you tried the same test with a later version of Excel?
Skip
----------------------------------------------------------------------