unicode and hashlib
Jeff H
dundeemt at gmail.com
Fri Nov 28 11:11:03 EST 2008
hashlib.md5 does not appear to accept unicode strings that contain
non-ASCII characters:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa6' in position 1650: ordinal not in range(128)
After googling, I found the BDFL and others on the Py3K list discussing
the problems of hashing anything other than bytes (i.e. buffers):
http://www.mail-archive.com/python-3000@python.org/msg09824.html
So what is the canonical way to hash unicode?
* convert the unicode string to the locale's encoding
* hash the result in the current locale
???
But what if the locale's encoding has ordinals outside the ASCII range
(128 and above)?
Is this just a problem with md5 hashes that I would not encounter using
a different method? For example, should I just use the built-in hash
function?
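
As a rough check of that question, the sketch below (Python 3, where
hashlib raises TypeError for str input instead of attempting the
implicit ASCII conversion seen in the traceback above) tries several
hashlib algorithms with un-encoded text. The sample string and the
choice of algorithms are assumptions made for illustration.

```python
import hashlib

text = "broken bar: \xa6"

# Try several algorithms with un-encoded text; in Python 3 each one
# refuses str input, so the issue is not specific to md5.
rejected = []
for name in ("md5", "sha1", "sha256"):
    try:
        hashlib.new(name, text)
    except TypeError:
        rejected.append(name)
print(rejected)

# Encoding first works the same way for any of them.
digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
print(digest)

# The built-in hash() accepts str directly, but for strings its value
# is randomized per process by default, so it is meant for dict and set
# lookups rather than as a digest to store or compare across runs.
print(hash(text))
```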