Bug in example code for the CSV library

Good day, Python maintainers,

I recently copied and pasted the example code at the bottom of the csv module documentation. This code is designed to let you read and write Unicode CSV files (UTF-8 by default). However, it fails on quite minimal examples, in a way likely to stump beginners.

I assume you have the code from the bottom of the csv documentation page in a file called CSVunicode.py that is on your path. (I didn't bother attaching it: lots of email systems strip anything like a Python attachment off an email anyway.)

The problem, of course, is that there is no guarantee that each element of the iterable passed to the writerow method has an "encode" method: the elements are quite likely to be integers, and calling s.encode on an integer raises an AttributeError.

I don't have a simple suggestion for how to fix this. You could wrap "s" in str() on the line that throws the AttributeError (or check hasattr(s, 'encode') before encoding), but converting every value to a string breaks the csv.QUOTE_* quoting options, which differentiate between numeric row values and their stringified variants. To my eyes, unfortunately, the whole issue suggests the wrong level of abstraction.

I'd be glad to help if I can be of any use as a writer (I used to work as a technical writer) or as a developer; Python means a lot to me.

Yours,
Kyle Gorman
Center for Spoken Language Understanding, Oregon Health & Science University
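
To make the quoting concern above concrete, here is a minimal sketch. It is written for Python 3 with io.StringIO, which is an assumption made for illustration: the documentation recipe under discussion targets Python 2, and the helper name write_row is hypothetical. It only shows how wrapping every value in str() changes what QUOTE_NONNUMERIC emits.

```python
import csv
import io


def write_row(row, stringify):
    """Write one row with QUOTE_NONNUMERIC and return the CSV text.

    Hypothetical helper for illustration only; not part of the csv module
    or of the documentation recipe.
    """
    buf = io.StringIO(newline="")
    writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC)
    # stringify=True mimics wrapping every value in str() before writing.
    writer.writerow([str(v) for v in row] if stringify else row)
    return buf.getvalue()


row = ["spam", 1, 2.5]
as_is = write_row(row, stringify=False)       # numbers are left unquoted
stringified = write_row(row, stringify=True)  # numbers are quoted like text

print(as_is, stringified, sep="")
```

With the values left alone, only "spam" is quoted; once everything has passed through str(), the writer can no longer tell 1 from "1" and quotes all three fields.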

I now have a bit of a solution: the offending line can be rewritten as

    self.writer.writerow([s.encode('utf-8') if hasattr(s, 'encode') else s for s in row])

In other words, encode s as UTF-8 if it is encodable; otherwise keep it as is, since it is probably numeric. That should do for most use cases: as far as I can tell, it only breaks if s.__str__ returns non-ASCII Unicode code points. I confirmed that this works nicely with quoting=QUOTE_NONNUMERIC.

Kyle

On Dec 4, 2012, at 9:01 AM, Kyle Gorman <gormanky@ohsu.edu> wrote:
