the unicode saga continues...
Ethan Furman
ethan at stoneleaf.us
Sat Nov 14 00:33:53 EST 2009
So I've added unicode support to my dbf package, but I also have some
rather large programs that aren't ready to make the switch over yet. So
as a workaround I added a (rather lame) option to convert the
unicode-ified data that was decoded from the dbf table back into an
encoded format.
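A minimal sketch of what such a workaround might look like. The helper
name encode_record and the dict-shaped record are hypothetical stand-ins
for illustration, not the dbf package's actual API:

```python
# Hypothetical helper: NOT part of the dbf package's API.
def encode_record(record, codec):
    """Return a copy of record with every unicode value encoded to bytes."""
    text_type = type(u'')  # unicode on Python 2, str on Python 3
    result = {}
    for name, value in record.items():
        if isinstance(value, text_type):
            result[name] = value.encode(codec)
        else:
            # Non-text values (numbers, dates, ...) pass through unchanged.
            result[name] = value
    return result

# Assumed sample data; the field names are made up for this example.
row = {'name': u'Mar\xeda', 'age': 42}
encoded_row = encode_record(row, 'cp1252')
```

The codec passed in is the "option" in question: it has to match whatever
the older programs expect to receive.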
Here's the fun part: in figuring out what the option should be for use
with my system, I tried some tests...
Python 2.5.4 (r254:67916, Dec 23 2008, 15:10:54) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print u'\xed'
í
>>> print u'\xed'.encode('cp437')
í
>>> print u'\xed'.encode('cp850')
í
>>> print u'\xed'.encode('cp1252')
φ
>>> import locale
>>> locale.getdefaultlocale()
('en_US', 'cp1252')
My confusion lies in the discrepancy between my apparent code page
(cp1252) and the output: u'\xed' is definitely an i with an acute accent,
yet when I encode it with cp1252 and print it, I get an o with a line
through it.
Can anybody clue me in to what's going on here?
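One way to narrow this down is to look at the raw bytes each codec
produces and at how those bytes render under a given code page. The sketch
below assumes the console is displaying bytes as cp437; that is a guess,
since the session above only shows what locale reports, not the console's
own code page:

```python
ch = u'\xed'  # LATIN SMALL LETTER I WITH ACUTE

# Raw bytes produced by each codec tried in the session.
encoded = {}
for codec in ('cp437', 'cp850', 'cp1252'):
    encoded[codec] = ch.encode(codec)

# Assumption: the console interprets whatever bytes it receives as cp437.
rendered = {}
for codec, raw in encoded.items():
    rendered[codec] = raw.decode('cp437')

for codec in ('cp437', 'cp850', 'cp1252'):
    print(codec + ' ' + repr(encoded[codec]) + ' ' + repr(rendered[codec]))
```

Under that assumption, cp437 and cp850 both map the character to byte
0xA1, while cp1252 maps it to 0xED, which cp437 displays as a Greek small
phi. That would match the output I'm seeing.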
~Ethan~