[issue9969] tokenize: add support for tokenizing 'str' objects

Meador Inge report at bugs.python.org
Tue Sep 28 15:17:19 CEST 2010

New submission from Meador Inge <meadori at gmail.com>:

Currently, on 'py3k', tokenize accepts only 'bytes' input; passing a readline that returns 'str' fails:

>>> import io
>>> import tokenize
>>> tokenize.tokenize(io.StringIO("1+1").readline)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/minge/Code/python/py3k/Lib/tokenize.py", line 360, in tokenize
    encoding, consumed = detect_encoding(readline)
  File "/Users/minge/Code/python/py3k/Lib/tokenize.py", line 316, in detect_encoding
    if first.startswith(BOM_UTF8):
TypeError: Can't convert 'bytes' object to str implicitly
>>> tokenize.tokenize(io.BytesIO(b"1+1").readline)
<generator object _tokenize at 0x1007566e0>

In a discussion on python-dev (http://www.mail-archive.com/python-dev@python.org/msg52107.html), the general consensus was that adding support for tokenizing 'str' objects as well would be a good idea.
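
Until such support exists, one possible workaround is to encode the 'str' to bytes before handing it to tokenize.tokenize. The sketch below is only an illustration and not a proposed patch: the helper name tokenize_str is hypothetical, and it assumes the source is plain text with no coding cookie that would conflict with UTF-8.

```python
import io
import tokenize


def tokenize_str(source):
    """Hypothetical helper: tokenize a str by encoding it to UTF-8 first.

    Assumes the source carries no coding cookie naming another encoding,
    since detect_encoding() would honour the cookie rather than UTF-8.
    """
    readline = io.BytesIO(source.encode("utf-8")).readline
    # tokenize.tokenize() returns a generator; materialize it for inspection.
    return list(tokenize.tokenize(readline))


tokens = tokenize_str("1+1")
for tok in tokens:
    print(tokenize.tok_name[tok.type], repr(tok.string))
```

The first token produced this way is the ENCODING token that tokenize.tokenize always emits for bytes input, which a caller working with 'str' would probably want to skip.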

messages: 117516
nosy: meador.inge
priority: normal
severity: normal
stage: needs patch
status: open
title: tokenize: add support for tokenizing 'str' objects
type: feature request
versions: Python 3.2, Python 3.3

Python tracker <report at bugs.python.org>
