Detect character encoding
The new guy
not at interesting.com
Mon Dec 5 23:59:49 EST 2005
Michal wrote:
> Hello,
> is there any way to detect string encoding in Python?
>
> I need to process several files. Each of them could be encoded in
> different charset (iso-8859-2, cp1250, etc). I want to detect it, and
> encode it to utf-8 (with string function encode).
Well, I can't help with how to detect it in Python itself. My first guess,
though, would be to have a look at the source code of the "file" utility.
Here is an example of what it does:
# ls
de.i18n en.i18n
# file *
de.i18n: ISO-8859 text, with very long lines
en.i18n: ISO-8859 English text, with very long lines
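On the Python side, one rough approach (not what "file" does, just a trial-decoding heuristic) is to try a short list of candidate encodings in order and keep the first one that decodes without error, then re-encode as UTF-8. The sketch below assumes Python 3 and bytes input. The function names and the candidate list are made up for illustration. Note that this can only rule encodings out: bytes written as iso-8859-2 will usually also decode as cp1250 without error, so the two cannot be reliably told apart this way.

```python
# Hypothetical helper names; the candidate list and its order are assumptions.
CANDIDATES = ("utf-8", "cp1250", "iso-8859-2")


def guess_and_decode(raw, candidates=CANDIDATES):
    """Return (encoding, text) for the first candidate that decodes raw cleanly."""
    for enc in candidates:
        try:
            return enc, raw.decode(enc)
        except UnicodeDecodeError:
            continue  # this candidate is ruled out, try the next one
    raise ValueError("none of the candidate encodings fit the data")


def to_utf8(raw, candidates=CANDIDATES):
    """Decode raw with the first fitting candidate and re-encode it as UTF-8."""
    _, text = guess_and_decode(raw, candidates)
    return text.encode("utf-8")
```

For a file, read it in binary mode (open(path, "rb").read()) and pass the result to to_utf8(). Putting utf-8 first is deliberate: it is the strictest of the three, so it rarely matches by accident.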
cheers