[Python-Dev] What does a double coding cookie mean?

Guido van Rossum guido at python.org
Thu Mar 17 10:55:51 EDT 2016


On Thu, Mar 17, 2016 at 5:04 AM, Serhiy Storchaka <storchaka at gmail.com> wrote:
>> Should we recommend that everyone use tokenize.detect_encoding()?
>
> Likely. However the interface of tokenize.detect_encoding() is not very
> simple.

I just found that out yesterday. You have to give it a readline()
function, which is cumbersome if all you have is a (byte) string and
you don't want to split it on lines just yet. And detect_encoding()
itself raises SyntaxError when the encoding isn't right. I wish
there were a lower-level helper that just took a line and told you
what the encoding in it was, if any. Then the rest of the logic can be
handled by the caller (including the logic of trying up to two lines).
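
To make the above concrete, here is a rough sketch, not a proposal for
the stdlib. detect_from_bytes() shows the wrapping needed today to feed
a bytes object to tokenize.detect_encoding(). cookie_in_line() and
cookie_in_first_two_lines() are hypothetical names for the kind of
lower-level helper and caller-side logic described above; they assume
the cookie pattern given in PEP 263 and deliberately skip BOM handling
and codec validation.

```python
import io
import re
import tokenize

# Cookie pattern from PEP 263, applied to a single line of bytes.
COOKIE_RE = re.compile(rb'^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)')
# A line that is empty or holds only a comment (assumed rule: only such
# a first line allows the second line to carry the cookie).
BLANK_RE = re.compile(rb'^[ \t\f]*(?:[#\r\n]|$)')


def detect_from_bytes(source):
    """Wrap a bytes object so tokenize.detect_encoding() can read it.

    May raise SyntaxError, e.g. for an unknown encoding name.
    """
    encoding, _lines = tokenize.detect_encoding(io.BytesIO(source).readline)
    return encoding


def cookie_in_line(line):
    """Hypothetical helper: encoding named in one line, or None."""
    match = COOKIE_RE.match(line)
    return match.group(1).decode('ascii') if match else None


def cookie_in_first_two_lines(source):
    """Hypothetical caller logic: look at no more than two lines."""
    for line in source.split(b'\n', 2)[:2]:
        found = cookie_in_line(line)
        if found is not None:
            return found
        if not BLANK_RE.match(line):
            # Real code on this line, so a later cookie does not count.
            return None
    return None
```

Unlike detect_encoding(), the helper returns the name as written in the
cookie and never raises; whether the codec exists is left to the caller.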

-- 
--Guido van Rossum (python.org/~guido)

