Encoding sniffer?
Diez B. Roggisch
deets at nospam.web.de
Thu Jan 5 15:56:43 EST 2006
> print try_encodings(text, ['ascii', 'utf-8', 'iso8859_1', 'cp1252', 'macroman'])
I've fallen into that trap before: it won't get past iso8859_1.
The reason is that an eight-bit encoding usually has all 256 code points
assigned. There are exceptions, but you have to be lucky to have
a string that contains a byte value left unassigned in one of them, which is
highly unlikely.
AFAIK iso-8859-1 has all code points taken, so decoding with it never fails
and you'll never get beyond it in your example.
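
To make that concrete, here is a minimal sketch. The quoted post doesn't show how try_encodings is implemented, so the version below is a hypothetical stand-in that assumes it simply returns the first encoding that decodes without an error. It is written against Python 3 bytes objects rather than the Python 2 strings of the original.

```python
def try_encodings(data, encodings):
    """Hypothetical stand-in for the function in the quoted post.

    Assumption: it returns the name of the first encoding that
    decodes `data` without raising, or None if every one fails.
    """
    for name in encodings:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # this encoding rejected the bytes, try the next one
        return name
    return None


candidates = ['ascii', 'utf-8', 'iso8859_1', 'cp1252', 'macroman']

# Every possible byte value, 0x00 through 0xFF.
all_bytes = bytes(range(256))

# iso8859_1 maps each of the 256 byte values to a character,
# so decoding cannot raise, whatever the input is.
assert len(all_bytes.decode('iso8859_1')) == 256

# Curly quotes as cp1252 writes them (0x93 and 0x94).
cp1252_text = b'\x93quoted\x94'

print(try_encodings(b'plain text', candidates))
print(try_encodings(all_bytes, candidates))
print(try_encodings(cp1252_text, candidates))
```

Pure ASCII input is reported as ascii, but both of the other inputs come back as iso8859_1, including the one that was really cp1252. The entries listed after iso8859_1 are never tried.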
Regards,
Diez