Detect character encoding

Scott David Daniels scott.daniels at acm.org
Sun Dec 4 10:02:25 EST 2005


Michal wrote:
> Hello,
> is there any way to detect a string's encoding in Python?
> 
> I need to process several files. Each of them could be encoded in a
> different charset (iso-8859-2, cp1250, etc). I want to detect it, and
> encode it to utf-8 (with the string method encode).
> 
> Thank you for any answer
> Regards
> Michal

The two ways to detect a string's encoding are:
   (1) know the encoding ahead of time
   (2) guess correctly

This is the whole point of Unicode -- an encoding that works for _lots_
of languages.
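
For option (2), here is a minimal sketch of one way to guess, assuming
the file contents are available as bytes and the set of plausible
encodings is small and known. The names guess_decode and to_utf8 and the
candidate order are illustrative assumptions, not a standard API. It
simply takes the first candidate that decodes without error, which is
not the same as being right.

```python
def guess_decode(data, candidates=("utf-8", "cp1250", "iso-8859-2")):
    """Return (text, encoding_name) for the first candidate that decodes.

    This is a guess: a clean decode does not prove the encoding is
    correct, only that the bytes are valid in that encoding.
    """
    for name in candidates:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            continue  # bytes are invalid in this encoding, try the next
    raise ValueError("none of the candidate encodings fit the data")


def to_utf8(data, candidates=("utf-8", "cp1250", "iso-8859-2")):
    """Re-encode bytes of a guessed encoding as UTF-8 bytes."""
    text, _ = guess_decode(data, candidates)
    return text.encode("utf-8")
```

Order matters. UTF-8 goes first because it is strict and rarely accepts
text written in another encoding. iso-8859-2 goes last because Python's
codec for it accepts every byte value, so it never fails. Text that is
really iso-8859-2 will usually decode under cp1250 too, with some wrong
letters, which is why knowing the encoding ahead of time is better.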

--Scott David Daniels
scott.daniels at acm.org