Python parsing iTunes XML/COM

Jerry Hill malaclypse2 at gmail.com
Wed Jul 30 11:10:25 EDT 2008


On Wed, Jul 30, 2008 at 10:58 AM, william tanksley
<wtanksleyjr at gmail.com> wrote:
> Here's one example. The others are similar -- they have the same
> things that look like problems to me.
>
> "Buffett Time - Annual Shareholders\xc2\xa0L.mp3"
>
> Note some problems here:
>
> 1. This isn't Unicode; it's missing the u"" (I printed using repr).
> 2. It's got the UTF-8 bytes there in the middle.
>
> I tried doing track_id.encode("utf-8"), but it doesn't seem to make
> any difference at all.

I don't have anything to say about your iTunes problems, but encode()
is the wrong method to turn a byte string into a unicode string; it
goes the other way, from unicode to bytes. Instead, use decode(), like
this:

>>> track_id = "Buffett Time - Annual Shareholders\xc2\xa0L.mp3"
>>> utrack_id = track_id.decode('utf-8')
>>> type(utrack_id)
<type 'unicode'>
>>> print utrack_id
Buffett Time - Annual Shareholders L.mp3
>>> print repr(utrack_id)
u'Buffett Time - Annual Shareholders\xa0L.mp3'
>>>
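
For what it's worth, here is the same idea as a small script instead of an
interactive session. It is only a sketch under a couple of assumptions:
to_text is a hypothetical helper name (not part of iTunes or the standard
library), and the b"" prefix assumes Python 2.6 or later (it also runs
unchanged on Python 3).

```python
# Hypothetical helper, not from the original post: decode byte strings,
# and pass values that are already text through unchanged.
def to_text(value, encoding="utf-8"):
    if isinstance(value, bytes):
        # bytes -> text: this is the decode() direction
        return value.decode(encoding)
    return value

# The byte string from the example above, with the UTF-8 encoded
# no-break space (\xc2\xa0) in the middle.
raw = b"Buffett Time - Annual Shareholders\xc2\xa0L.mp3"

# Decoding collapses the two bytes \xc2\xa0 into one character, U+00A0.
text = to_text(raw)

# text -> bytes: this is the encode() direction, and it gives back
# the original bytes.
round_trip = text.encode("utf-8")
```

The isinstance check is there so the helper can be called on values that
have already been decoded without decoding them a second time.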

-- 
Jerry


