regular expressions, unicode and XML

Justin Ezequiel justin.mailinglists at
Fri Jan 27 09:20:25 CET 2006

>> when I replace it end up with nothing: i.e., just a "" character in my
>> file.

how are you viewing the contents of your file?
are you printing it out to stdout?
are you opening your file in a non-unicode aware editor?
try print repr(data) after re.sub so that you see what you actually
have in data
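
something like the sketch below shows the idea. the sample string and the pattern are made up for illustration (I have not seen your data or your regex), and it is written for Python 3, where ascii() gives the escaped view that repr() gives for unicode strings in Python 2:

```python
import re

# Hypothetical sample: an XML fragment with one non-ASCII character (e-acute).
data = "<name>Jos\u00e9</name>"

# Hypothetical substitution: swap the character for a numeric character reference.
result = re.sub("\u00e9", "&#233;", data)

# ascii() escapes every non-ASCII character, so what is printed does not
# depend on how the terminal or editor renders those characters.
print(ascii(data))
print(ascii(result))
```

if the escaped output shows the text you expected, the substitution worked and the problem is in how the file is being displayed.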

btw, from where did you get your XML files?

