[issue23297] Clarify error when ‘tokenize.detect_encoding’ receives text
Berker Peksag <berker.peksag@gmail.com> added the comment:

The original problem has already been solved by making tokenize.generate_tokens() public in issue 12486. However, the same exception can be raised when tokenize.open() is used with tokenize.tokenize(), because tokenize.open() returns a text stream: https://github.com/python/cpython/blob/da63b321f63b697f75e7ab2f88f55d907f56c...

hello.py
--------

def say_hello():
    print("Hello, World!")

say_hello()

text.py
-------

import tokenize

with tokenize.open('hello.py') as f:
    token_gen = tokenize.tokenize(f.readline)
    for token in token_gen:
        print(token)

When we pass f.readline to tokenize.tokenize(), the second call to detect_encoding() fails because f.readline() returns str. In Lib/test/test_tokenize.py, tokenize.open() only seems to be tested for opening a file; its output is never passed to tokenize.tokenize(). Most of the tests pass either the readline() method of open(..., 'rb') or an io.BytesIO() object to tokenize.tokenize().

I will submit a documentation PR that suggests using tokenize.generate_tokens() with tokenize.open().

----------
assignee:  -> docs@python
components: +Documentation
nosy: +docs@python
versions: +Python 3.7, Python 3.8 -Python 3.5, Python 3.6

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue23297>
_______________________________________
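[For reference, a minimal sketch of the pairing that comment suggests, reusing the hello.py example above; this is an illustration, not the wording of the eventual documentation PR:]

import tokenize

# tokenize.open() detects the source encoding and returns a *text* stream,
# so its readline yields str; generate_tokens() expects exactly that.
with tokenize.open('hello.py') as f:
    for token in tokenize.generate_tokens(f.readline):
        print(token)

[Alternatively, as the existing tests do, one can open the file with open('hello.py', 'rb') and pass its readline to tokenize.tokenize(), which expects bytes.]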
Ben Finney <ben+python@benfinney.id.au> added the comment:

On 28-Apr-2019, Berker Peksag wrote:
> The original problem has already been solved by making tokenize.generate_tokens() public in issue 12486.
I don't understand how that would affect the resolution of this issue. Isn't the correct resolution here going to entail a correct implementation in ‘file.readline’?

----------
_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue23297>
_______________________________________
participants (2)
- Ben Finney
- Berker Peksag