[Python-Dev] What does a double coding cookie mean?
mal at egenix.com
Fri Mar 18 16:05:27 EDT 2016
On 17.03.2016 15:55, Guido van Rossum wrote:
> On Thu, Mar 17, 2016 at 5:04 AM, Serhiy Storchaka <storchaka at gmail.com> wrote:
>>> Should we recommend that everyone use tokenize.detect_encoding()?
>> Likely. However the interface of tokenize.detect_encoding() is not very
> I just found that out yesterday. You have to give it a readline()
> function, which is cumbersome if all you have is a (byte) string and
> you don't want to split it on lines just yet. And the readline()
> function raises SyntaxError when the encoding isn't right. I wish
> there were a lower-level helper that just took a line and told you
> what the encoding in it was, if any. Then the rest of the logic can be
> handled by the caller (including the logic of trying up to two lines).
I've uploaded the code I posted yesterday, modified to address
some of the issues it had to github:
I'm pretty sure the two-lines read can be optimized away and
put straight into the regular expression used for matching.
Professional Python Services directly from the Experts (#1, Mar 18 2016)
>>> Python Projects, Coaching and Consulting ... http://www.egenix.com/
>>> Python Database Interfaces ... http://products.egenix.com/
>>> Plone/Zope Database Interfaces ... http://zope.egenix.com/
2016-03-07: Released eGenix pyOpenSSL 0.13.14 ... http://egenix.com/go89
2016-02-19: Released eGenix PyRun 2.1.2 ... http://egenix.com/go88
::: We implement business ideas - efficiently in both time and costs :::
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
More information about the Python-Dev