[Python-Dev] Encoding detection in the standard library?
wolever at cs.toronto.edu
Tue Apr 22 03:41:51 CEST 2008
On 21-Apr-08, at 5:31 PM, Martin v. Löwis wrote:
>> This is useful when you get a hunk of data which _should_ be some
>> sort of intelligible text from the Big Scary Internet (say, a posted
>> web form or email message), and you want to do something useful with
>> it (say, search the content).
> I don't think that should be part of the standard library. People
> will mistake what it tells them for certain.
As Oleg mentioned, if the method is called something like
'guess_encoding', I think we could live with clear consciences.
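
A minimal sketch of what such a helper might look like, purely for illustration. Only the name `guess_encoding` comes from this thread; the `candidates` parameter, the BOM table, and the Latin-1 fallback are assumptions, not a proposed standard-library API. The result is a guess and nothing more:

```python
import codecs

# Hypothetical helper: only the name comes from the thread; the behaviour
# below is an assumption made for illustration.
# UTF-32 BOMs are checked before UTF-16 because the UTF-32-LE BOM starts
# with the same two bytes as the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data, candidates=("ascii", "utf-8")):
    """Return the name of a codec that can decode `data` (a guess only)."""
    # A byte order mark is the strongest hint available.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # Otherwise take the first strict candidate that decodes cleanly.
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    # Latin-1 maps every byte to a character, so it never fails to decode;
    # that makes it a last resort, not evidence that the data is Latin-1.
    return "latin-1"
```

A caller would then write something like `text = data.decode(guess_encoding(data))` and treat the outcome as provisional.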
IMO, encoding estimation is something that many web programs will
have to deal with, so it might as well be built in; I would rather
have the option to run `text=input.encode('guess')` (or something
similar) than rely on an external dependency or, worse yet, use a hand-