Puzzled by code pages

Adam Tauno Williams awilliam at whitemice.org
Fri May 14 20:38:43 EDT 2010


On Fri, 2010-05-14 at 20:27 -0400, Adam Tauno Williams wrote:
> I'm trying to process OpenStep plist files in Python.  I have a parser
> which works, but only for strict ASCII.  However plist files may contain
> accented characters - equivalent to ISO-8859-2 (I believe).  For example
> I read in the line:
> 
> >>> handle = open('file.txt', 'rb')
> >>> data = handle.read()
> >>> handle.close()
> >>> data
> '    "skyp4_filelist_10201/localit\xc3\xa0 termali_sortfield" = NSFileName;\n'
> 
> What is the correct way to re-encode this data into UTF-8 so I can use
> unicode strings, and then write the output back to ISO8859-?

Typical: 30 seconds after giving up and posting a message, I find the
problem.

Buried in the parser is a str(...) call.  Replacing that with
unicode(...) fixed it; the OpenStep plist parser now works with Italian
plists.
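
For anyone landing here with the same question, a minimal sketch of the
decode/encode round trip follows. It rests on a few assumptions: the bytes
\xc3\xa0 in the sample line are the UTF-8 encoding of U+00E0 (a with grave
accent), so the input is treated as UTF-8; the helper name reencode is
hypothetical; and "iso-8859-1" is only an illustrative target, since the
original question leaves the output encoding unspecified. The sketch uses
Python 3 syntax; in Python 2 the same .decode()/.encode() calls apply, with
unicode objects playing the role of Python 3's str.

```python
# Sample line from the post, as raw bytes read from the file.
raw = b'    "skyp4_filelist_10201/localit\xc3\xa0 termali_sortfield" = NSFileName;\n'

# Assumption: the input is UTF-8 (\xc3\xa0 decodes to U+00E0).
text = raw.decode("utf-8")


def reencode(text, target_encoding):
    """Hypothetical helper: encode a text string into the given byte encoding."""
    return text.encode(target_encoding)


# Illustrative target only; pick whichever encoding the consumer expects.
latin1_bytes = reencode(text, "iso-8859-1")

# Encoding back to UTF-8 reproduces the original bytes unchanged.
utf8_bytes = reencode(text, "utf-8")
```

Keeping everything as text inside the parser and converting only at the file
boundary avoids the implicit ASCII conversion that a stray str(...) call
triggers in Python 2.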
