tolidtm at gmail.com
Mon Dec 17 23:33:48 CET 2012
>> Just realize that once you start using 'ignore' you're going to also
>> ignore discrepancies that are real. For example, maybe your terminal is
>> actually something other than either latin-1 or utf-8.
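
A minimal sketch of the risk described above, assuming an illustrative string that contains the trademark sign (U+2122), which latin-1 cannot represent. The variable names are hypothetical. With 'ignore' the character simply disappears, while the default 'strict' handler raises an error that makes the mismatch visible.

```python
# -*- coding: utf-8 -*-
# Illustrative sample text (an assumption, not taken from the original data).
text = u"Intel\u00ae Core\u2122"

# 'strict' is the default error handler: it raises on U+2122,
# so the discrepancy cannot go unnoticed.
try:
    text.encode("latin-1")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# 'ignore' silently drops the character that latin-1 cannot encode.
ignored = text.encode("latin-1", "ignore")
```

After this runs, `ignored` no longer contains any trace of the trademark sign, which is exactly the kind of real discrepancy that 'ignore' hides.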
> If you need to see such discrepancies, you can do
> print src.decode("utf-8").encode("latin-1", "xmlcharrefreplace")
> That would produce something like:
> processeurs Intel® Core&#8482; de 3ème génération av
> that is, the problem characters are displayed in &#...; notation.
> That is ugly, but sometimes it's the only way to see what character
> you really have.
> Notice that the number you get is in decimal, where the \u....
> notation uses hex:
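
A minimal sketch of the decimal-versus-hex point, assuming an illustrative string with the trademark sign (U+2122). The variable names are hypothetical and not from the original post. The `xmlcharrefreplace` handler writes the code point in decimal, while the `\u` escape for the same character uses hexadecimal.

```python
# -*- coding: utf-8 -*-
# Illustrative sample text (an assumption, not taken from the original data).
text = u"Intel\u00ae Core\u2122"

# Characters that latin-1 cannot encode become &#NNNN; with NNNN in decimal.
encoded = text.encode("latin-1", "xmlcharrefreplace")

# The same code point written both ways.
code_point = ord(u"\u2122")
decimal_form = "&#%d;" % code_point    # decimal, as in the character reference
hex_form = "\\u%04x" % code_point      # hexadecimal, as in the \u escape
```

Here 8482 in decimal and 2122 in hexadecimal name the same character, so a number from the `&#...;` output has to be converted to hex before looking it up as a `\u....` escape.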
Thanks guys, my issue is now solved - the problem came from my PuTTY
client: it was on latin1 by default, and changing it to utf-8 fixed it.