right curly quote and unicode

TiNo tinodb at gmail.com
Wed Oct 18 00:34:13 CEST 2006

Hi all,

I am trying to compare my iTunes Library XML to the actual files on my
As the XML file is in UTF-8 encoding, I decided to do the comparison of the
filenames in that encoding.
It all works, except with one file. It is named 'The Chemical
Brothers-Elektrobank-04 - Don't Stop the Rock (Electronic Battle Weapon
Version).mp3'. It goes wrong with the apostrophe in Don't. That is actually
not an apostrophe but a right curly quote (it looks like char 180: ´).
In the iTunes library it is encoded as: Don%E2%80%99t

I do some conversions with both the library path names and the folder
path names. Here is the code:
(in the comments I display how the Don't part looks. I got this using print
#Once I have the filenames from the library I clean them using the following
code (as filenames are in the format '

filename = urlparse.urlparse(filename)[2][1:]  # u'Don%E2%80%99t'
filename = urllib.unquote(filename) # u'Don\xe2\x80\x99t'
filename = os.path.normpath(filename) # u'Don\xe2\x80\x99t'

(Side question about the first line: does anybody know a more elegant way to
do this?)
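
For reference, a minimal sketch of the same library-side cleanup, assuming
Python 3 (the snippet above is Python 2), where urllib.parse takes over from
urlparse and urllib. The URL is a made-up example, not the real library
entry. It also touches on the side question: the path can be read through the
named attribute .path instead of index [2]. Decoding the percent-escapes as
UTF-8 yields U+2019, the right single quotation mark.

```python
import unicodedata
from urllib.parse import unquote, urlsplit

# Hypothetical library entry; the real path is not shown in this post.
location = "file://localhost/C:/Music/Don%E2%80%99t%20Stop.mp3"

# .path is the named equivalent of indexing the parse result with [2].
path = urlsplit(location).path[1:]   # drop the leading '/'
decoded = unquote(path)              # percent-escapes are decoded as UTF-8 by default

# Pick out the character that sits between "Don" and "t".
quote_char = decoded[len("C:/Music/Don")]
print(repr(decoded))
print(hex(ord(quote_char)), unicodedata.name(quote_char))
```

In Python 3 the result is ordinary text, so the quote is a single character
rather than the three separate code points \xe2 \x80 \x99 shown in the
comments above.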

I get the files in my music folder with the os.walk method and then
I do:

filename = os.path.normpath(os.path.join(root,name))  # 'Don\x92t'
filename = unicode(filename,'latin1') # u'Don\x92t'
filename = filename.encode('utf-8') # 'Don\xc2\x92t'
filename = unicode(filename,'latin1') # u'Don\xc2\x92t'
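
A small sketch of what may be happening to the \x92 byte, written for
Python 3. It assumes the file system hands back names in Windows code page
1252; the post does not state the platform, so that is a guess based on the
byte value. In cp1252 the byte 0x92 is the right single quotation mark
(U+2019), while latin1 maps it to the control character U+0092. The two
encodings agree on the accented letters, which would explain why the other
songs match.

```python
raw = b"Don\x92t"   # the bytes shown in the first comment above

as_latin1 = raw.decode("latin-1")   # 0x92 becomes U+0092, a C1 control character
as_cp1252 = raw.decode("cp1252")    # 0x92 becomes U+2019, the right curly quote

# What Don%E2%80%99t denotes once its bytes are decoded as UTF-8.
library_side = "Don\u2019t"

print(as_latin1 == library_side)
print(as_cp1252 == library_side)

# Accented letters such as e-acute (0xE9) and c-cedilla (0xE7) decode
# identically under both encodings.
accented = b"\xe9\xe7"
print(accented.decode("latin-1") == accented.decode("cp1252"))
```

If the cp1252 assumption holds, decoding the folder names with 'cp1252'
instead of 'latin1' would be the one change to try first.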


I think the folder part is a bit weird with the unicode, encode, unicode
conversions, but it works for all the other songs, some of which contain
latin1 characters like accented e and a, and c with a cedilla.
Only this weird quote/apostrophe thingy is not working. What am I doing wrong?