Python parsing iTunes XML/COM
malaclypse2 at gmail.com
Thu Jul 31 23:45:48 CEST 2008
On Thu, Jul 31, 2008 at 9:44 AM, william tanksley <wtanksleyjr at gmail.com> wrote:
> I'm using a file, a file that's correctly encoded as UTF-8, and it
> returns some text elements that are raw bytes (undecoded). I have to
> manually decode them.
I can't reproduce this behavior. Here's a simple test case:
C:\Program Files\Python25>python -V
C:\Program Files\Python25>more t.py
import xml.etree.cElementTree as ET
xml_string = """<?xml version="1.0" encoding="UTF-8"?>
<character title="GREEK SMALL LETTER PI">\xcf\x80</character>"""
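# Write the UTF-8 bytestring to disk, then close the file so it is flushed.
outfile = open('sample.xml', 'wb')
outfile.write(xml_string)
outfile.close()
# Parse the file back in and show the type of the element text.
tree = ET.parse('sample.xml')
root = tree.getroot()
print repr(root.text)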
C:\Program Files\Python25>python t.py
That seems to work as expected. I wrote out a UTF-8 encoded
bytestring with a proper XML encoding declaration. When I parsed the
file with cElementTree, it returned unicode data. Does this same
program work for you? If so, maybe you need to show us more of your
code to see where things are going wrong.
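
For anyone trying the same check on Python 3, here is a rough equivalent of the test above. It is only a sketch under a few assumptions: it uses xml.etree.ElementTree (cElementTree was removed in Python 3.9), it writes to a temporary directory instead of the current one, and parse_sample is a hypothetical helper name of my own, not part of any library.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# UTF-8 bytes for the document; \xcf\x80 encodes GREEK SMALL LETTER PI.
xml_bytes = (b'<?xml version="1.0" encoding="UTF-8"?>\n'
             b'<character title="GREEK SMALL LETTER PI">\xcf\x80</character>')


def parse_sample(data):
    """Hypothetical helper: write data to a temp file, parse it, return the root."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, 'sample.xml')
        with open(path, 'wb') as outfile:
            outfile.write(data)
        # ET.parse reads the whole file and closes it before returning.
        return ET.parse(path).getroot()


root = parse_sample(xml_bytes)
# Show the type and an ASCII-safe form of the decoded element text.
print(type(root.text).__name__, ascii(root.text))
```

If the element text comes back as a decoded string here too, the parser is honoring the encoding declaration, and the problem is more likely in how the data is handled after parsing.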