[Python-Dev] Encoding detection in the standard library?

skip at pobox.com skip at pobox.com
Mon Apr 21 19:00:54 CEST 2008


    Michael> The only approach I know of is a heuristic based approach. e.g.

    Michael> http://www.voidspace.org.uk/python/articles/guessing_encoding.shtml

    Michael> (Which was 'borrowed' from docutils in the first place.)

Yes, I implemented a heuristic approach for the Musi-Cal web server.  I was
able to rely on domain knowledge to guess correctly almost all the time.
The heuristic was that almost all form submissions came from the US and the
rest which didn't came from Western Europe.  Python could never embed such a
narrow-focused heuristic into its core distribution.

Skip



More information about the Python-Dev mailing list