Python parsing iTunes XML/COM

John Machin sjmachin at
Thu Jul 31 10:43:58 CEST 2008

On Jul 31, 12:58 am, william tanksley <wtanksle... at> wrote:
> Thank you for the response. Here's some more info, including a little
> that you didn't ask me for but which might be useful.
> John Machin <sjmac... at> wrote:
> > william tanksley <wtanksle... at> wrote:
> > > To ask another way: how do I convert from a file:// URL to a local
> > > path in a standard way, so that filepaths from two different sources
> > > will work the same way in a dictionary?
> > > The problems occur when the filenames have non-ascii characters in
> > > them -- I suspect that the URLs are having some encoding placed on
> > > them that Python's decoder doesn't know about.
> > # track_id = url2pathname(urlparse(track_id).path)
> > print repr(track_id)
> > parse_result = urlparse(track_id).path
> > print repr(parse_result)
> > track_id_replacement = url2pathname(parse_result)
> > print repr(track_id_replacement)
> The "important" value here is track_id_replacement; it contains the
> data that's throwing me. It appears that some UTF-8 characters are
> being read as multiple bytes by ElementTree rather than being decoded
> into Unicode.

Appearances can be deceptive. You present no evidence.

> Could this be a bug in ElementTree's Unicode support?

It could, yes, but the probability is extremely low.
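
For what it's worth, a \xc2\xa0 pair is exactly what you would expect to see if the URL carried a no-break space as percent-escaped UTF-8 and something unquoted it without decoding it. A minimal sketch, written for Python 3 (the code quoted above is Python 2), and assuming the URL held the character as %C2%A0, which nothing posted so far confirms:

```python
from urllib.parse import unquote_to_bytes

# Hypothetical URL fragment; the %C2%A0 escape is an assumption,
# not something taken from the actual iTunes XML file.
escaped = "Buffett%20Time%20-%20Annual%20Shareholders%C2%A0L.mp3"

raw = unquote_to_bytes(escaped)  # undo the percent-escapes only: still bytes
text = raw.decode("utf-8")       # bytes -> Unicode text

print(repr(raw))   # contains the two bytes \xc2\xa0
print(repr(text))  # contains the single character \xa0
```

If that is what is happening, the step that helps is decode("utf-8"), not encode("utf-8"), and ElementTree is not involved at all.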

> If so, can I work around it?
> Here's one example. The others are similar -- they have the same
> things that look like problems to me.
> "Buffett Time - Annual Shareholders\xc2\xa0L.mp3"
> Note some problems here:
> 1. This isn't Unicode; it's missing the u"" (I printed using repr).
> 2. It's got the UTF-8 bytes there in the middle.
> I tried doing track_id.encode("utf-8"), but it doesn't seem to make
> any difference at all.
> Of course, my ultimate goal is to compare the track_id to the track_id
> I get from iTunes' COM interface, including hashing to the same value
> for dict lookups.
> > and copy/paste the results into your next posting.
> In addition to the above results,

*WHAT* results? I don't see any repr() output, just your
interpretation of what you think you saw!
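
On the dict lookup goal: keys taken from the two sources will only match if both are the same type and hold the same characters. The sketch below (Python 3 again) assumes the COM interface hands back Unicode text and the XML side holds percent-escaped UTF-8; path_key is a hypothetical helper and the filenames are made up for illustration.

```python
from urllib.parse import unquote_to_bytes


def path_key(escaped_path):
    """Hypothetical helper: percent-escaped UTF-8 path -> Unicode dict key."""
    return unquote_to_bytes(escaped_path).decode("utf-8")


# Made-up stand-in for a path returned by the COM interface,
# assumed here to be Unicode text already.
com_path = "Buffett Time - Annual Shareholders\u00a0L.mp3"

# Key built from the (assumed) percent-escaped form found in the XML.
xml_escaped = "Buffett%20Time%20-%20Annual%20Shareholders%C2%A0L.mp3"
tracks = {path_key(xml_escaped): "from XML"}

print(com_path in tracks)                  # True: same type, same characters
print(com_path.encode("utf-8") in tracks)  # False: bytes never equal text
```

Whether this matches your data depends on what the repr() output actually shows.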
