Python parsing iTunes XML/COM

Jerry Hill malaclypse2 at
Thu Jul 31 23:45:48 CEST 2008

On Thu, Jul 31, 2008 at 9:44 AM, william tanksley <wtanksleyjr at> wrote:
> I'm using a file, a file that's correctly encoded as UTF-8, and it
> returns some text elements that are raw bytes (undecoded). I have to
> manually decode them.

I can't reproduce this behavior.  Here's a simple test case:

C:\Program Files\Python25>python -V
Python 2.5.2

C:\Program Files\Python25>more
import xml.etree.cElementTree as ET

xml_string = """<?xml version="1.0" encoding="UTF-8"?>
<character title="GREEK SMALL LETTER PI">\xcf\x80</character>"""

outfile = open('sample.xml', 'wb')
outfile.write(xml_string)
outfile.close()

tree = ET.parse('sample.xml')
root = tree.getroot()
print type(root.text)
print repr(root.text)
print root.text

C:\Program Files\Python25>python
<type 'unicode'>

That seems to work as expected.  I wrote out a UTF-8 encoded
bytestring with a proper XML encoding declaration.  When I parsed the
file with cElementTree, it returned unicode data.  Does this same
program work for you?  If so, maybe you need to show us more of your
code to see where things are going wrong.
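
For anyone trying this on Python 3, here is a rough sketch of the same
check.  It is not the script from the session above: it assumes the
standard xml.etree.ElementTree module (cElementTree no longer exists
there), writes to a temporary file instead of sample.xml, and the
helper name parse_sample is made up for illustration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# UTF-8 bytes for GREEK SMALL LETTER PI, inside a document that
# declares its encoding as UTF-8.
xml_bytes = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<character title="GREEK SMALL LETTER PI">\xcf\x80</character>'
)


def parse_sample(data):
    """Hypothetical helper: write data to a temp file, parse it,
    and return the text of the root element."""
    fd, path = tempfile.mkstemp(suffix=".xml")
    try:
        # Binary mode, so the bytes land on disk exactly as given.
        with os.fdopen(fd, "wb") as outfile:
            outfile.write(data)
        return ET.parse(path).getroot().text
    finally:
        os.remove(path)


text = parse_sample(xml_bytes)
print(type(text))   # the parser should hand back decoded text, not bytes
print(ascii(text))  # ascii() avoids console encoding problems
```

If the element text comes back as raw bytes with a script like this,
the difference is probably somewhere in how the file is read or
produced rather than in the parser itself.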

