[Tutor] Urgent: unicode problems writing CSV file

Alex Hall ahall at autodist.com
Wed Jun 8 14:12:40 EDT 2016

Thanks for all the responses, everyone; what you all said makes sense. I
also understand what you mean by the tone of an "urgent" message coming
across as demanding.

On Wed, Jun 8, 2016 at 1:19 PM, Tim Golden <mail at timgolden.me.uk> wrote:

> On 08/06/2016 14:54, Alex Hall wrote:
> > All,
> > I'm working on a project that writes CSV files, and I have to get it done
> > very soon. I've done this before, but I'm suddenly hitting a problem with
> > unicode conversions. I'm trying to write data, but getting the standard
> > cannot encode character: ordinal not in range(128)
> >
> > I've tried
> > str(info).encode("utf8")
> > str(info).decode("utf8")
> > unicode(info, "utf8")
> > csvFile = open("myFile.csv", "wb", encoding="utf-8") # invalid keyword argument
> >
> > What else can I do? As I said, I really have to get this working soon, but
> > I'm stuck on this stupid unicode thing. Any ideas will be great. Thanks.
> >
> This is a little tricky. I assume that you're on Python 2.x (since
> open() isn't taking an encoding). Deep in the bowels of the CSV module's
> C implementation is code which converts every item in the row it's
> receiving to a string (essentially [str(x) for x in row]), which
> will assume ascii: there's no opportunity to specify an encoding.
> For things whose __str__ returns something ascii-ish, that's fine. But
> if your data does or is likely to contain non-ascii data, you'll need to
> preprocess it. How you do it, and how general-purpose that approach is
> will depend on your data. For the purposes of discussion, let's assume
> your data looks like this:
> unicode, int, int
> Then your encoder could do this:
> def encoder_of_rows(row):
>   return [row[0].encode("utf-8"), str(row[1]), str(row[2])]
> and your csv processor could do this:
> import csv
> rows = [...]
> with open("filename.csv", "wb") as f:
>   writer = csv.writer(f)
>   writer.writerows([encoder_of_rows(row) for row in rows])
> but it could be more (or less) complex than that depending on your data
> and how much you know about it.
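
For rows whose column types aren't known ahead of time, the same idea can be
applied per cell rather than per column. What follows is only a rough sketch,
not anything from Tim's message: the names to_csv_cell and write_rows are made
up for illustration, and it assumes every non-text value has a sensible str().
On Python 2 it encodes unicode values to UTF-8 byte strings before the csv
module sees them; on Python 3 it leaves text alone and lets open() do the
encoding.

```python
import csv
import sys

PY2 = sys.version_info[0] == 2


def to_csv_cell(value):
    """Return value in the form csv.writer accepts on this Python version."""
    if PY2:
        # Python 2: csv wants byte strings, so encode unicode explicitly.
        if isinstance(value, unicode):  # noqa: F821 (name exists only on Python 2)
            return value.encode("utf-8")
        return str(value)
    # Python 3: csv wants text; the file object does the encoding.
    return value if isinstance(value, str) else str(value)


def write_rows(path, rows):
    """Write rows (an iterable of sequences) to path as UTF-8 CSV."""
    if PY2:
        f = open(path, "wb")
    else:
        f = open(path, "w", newline="", encoding="utf-8")
    with f:
        writer = csv.writer(f)
        writer.writerows([to_csv_cell(v) for v in row] for row in rows)
```

Calling write_rows("filename.csv", rows) would then take the place of the
explicit open()/writer lines in the quoted example.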
