Annoying unicode / str conversion problem
Hans Müller
heintest at web.de
Mon Jan 26 15:21:01 EST 2009
Hi Python experts,
at the moment I'm struggling with an annoying problem in conjunction with MySQL.
I'm fetching rows from a database, which the MySQL driver returns as a list of tuples.
The default encoding of the database is UTF-8.
Unfortunately, the database contains rows in different encodings, and there is also a
blob column.
In the application I search for duplicate entries in the database with this code:
    hash = {}
    cursor.execute("select * from table")
    rows = cursor.fetchall()
    for row in rows:
        key = "|".join([str(x) for x in row])  # <- here the problem arises
        if key in hash:
            print "found double entry"
        else:
            hash[key] = row  # remember the row so that a later duplicate is detected
This code works as expected with Python 2.5.2.
With 2.5.1 it fails with this error:
    key = "|".join(str(x) for x in row)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u017e' in position 3: ordinal not in range(128)
When I replace the str() call with unicode(), I get this error as soon as a blob column
is processed:
    key = "|".join(unicode(x) for x in row)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xfc in position 119: ordinal not in range(128)
Please help: how can I convert ANY column data to a string that is usable as a key in a
dictionary? The purpose of the dictionary is to find identical rows in some database
tables. Perhaps computing an MD5 hash of the column data would also be an option?
Thanks a lot in advance,
Hans.
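
A possible way to build such a key is sketched below. It is only an illustration and not a tested fix for the setup described above. The idea is to use repr() instead of str() or unicode(): in Python 2, repr() escapes non-ASCII characters in both str and unicode values, so no implicit ASCII encode or decode should be triggered, and the result can be fed to an MD5 hash as suggested in the question. The names row_key and find_duplicates are hypothetical helpers, and the sketch assumes that a given column always comes back with the same Python type (in Python 2, repr('a') and repr(u'a') differ, so the same text returned once as str and once as unicode would not match).

```python
import hashlib
import sys


def row_key(row):
    """Return an MD5 hex digest identifying the contents of one row.

    Hypothetical helper: it relies on repr() giving equal output for
    equal column values of the same type.
    """
    # repr() yields an escaped representation, so mixed encodings and
    # raw blob bytes do not go through the implicit 'ascii' codec.
    joined = "|".join([repr(value) for value in row])
    # On Python 3, repr() returns text, which hashlib cannot hash directly.
    if sys.version_info[0] >= 3:
        joined = joined.encode("utf-8")
    return hashlib.md5(joined).hexdigest()


def find_duplicates(rows):
    """Return every row whose contents already appeared earlier in rows."""
    seen = {}
    duplicates = []
    for row in rows:
        key = row_key(row)
        if key in seen:
            duplicates.append(row)
        else:
            seen[key] = row
    return duplicates
```

With a cursor as in the code above, this could be called as find_duplicates(cursor.fetchall()). Storing the digest instead of the joined string keeps the dictionary keys short even for large blobs. If every value in a row is hashable, using tuple(row) directly as the dictionary key may be a simpler alternative that avoids string conversion altogether.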