CSV readers and UTF-8 files

mk mrkafk at gmail.com
Thu Feb 19 09:21:14 EST 2009


Hello everyone,

Is it just me, or do csv.reader/csv.DictReader not work correctly with 
UTF-8 files in Python 2.6.1 (Windows)?

That is, when I pass a plain file object for a UTF-8 file to a csv 
reader, I get the fields back as plain strings ('str'). Since the data 
has already been mangled by then, I can't get the non-ASCII characters 
back.

When I do:

     import codecs
     import csv

     # codecs.open decodes the file, so the reader is fed unicode lines
     csvfo = codecs.open(csvfname, 'rb', 'utf-8')
     dl = csv.excel
     dl.delimiter = ';'
     #rd = csv.DictReader(csvfo, dialect=dl)
     rd = csv.reader(csvfo, dialect=dl)

..I get plain strings as well (type(field) gives <type 'str'>), on top 
of this error:

Traceback (most recent call last):
   File "C:/Python26/converter3.py", line 99, in <module>
     fill_sqla(session,columnlist,rd)
   File "C:/Python26/converter3.py", line 73, in fill_sqla
     for row in rd:
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0144' in 
position 74: ordinal not in range(128)

..when doing:

     for row in rd:
...
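
A possible explanation, going by the csv module documentation for 
Python 2.x: the module does not support Unicode input, so when the 
reader is fed unicode lines it tries to turn them back into byte strings 
with the default 'ascii' codec, which would produce exactly this 
UnicodeEncodeError. One workaround might be to give csv the raw bytes 
and decode each field afterwards. The following is only a minimal sketch 
of that idea, not a tested fix for converter3.py: read_utf8_rows is a 
made-up helper name, the ';' default just mirrors the delimiter above, 
and the Python 3 branch is there only so the same snippet runs on both 
major versions.

```python
import csv
import io
import sys


def read_utf8_rows(path, delimiter=';'):
    """Hypothetical helper: return the rows of a UTF-8 CSV file
    as lists of text (unicode) fields."""
    if sys.version_info[0] >= 3:
        # Python 3: csv works on text, so decode while reading the file.
        with io.open(path, 'r', encoding='utf-8', newline='') as f:
            return [row for row in csv.reader(f, delimiter=delimiter)]
    # Python 2: csv only handles byte strings, so feed it the raw bytes
    # and decode each field after the row has been split.
    with open(path, 'rb') as f:
        return [[field.decode('utf-8') for field in row]
                for row in csv.reader(f, delimiter=delimiter)]
```

With something like this, `for row in read_utf8_rows(csvfname):` should 
yield unicode fields instead of failing inside the reader, assuming the 
file really is valid UTF-8.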

Regards,
mk



