[New-bugs-announce] [issue16223] untokenize returns a string if no encoding token is recognized

Sun Oct 14 07:49:11 CEST 2012

New submission from Eric Snow:

If you pass an iterable of tokens and none of them is an ENCODING token, tokenize.untokenize() returns a string (str) rather than bytes.  This is contrary to what the docs say [1]:

   It returns bytes, encoded using the ENCODING token, which is the
   first token sequence output by tokenize().
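
A minimal reproduction sketch, assuming Python 3 (where tokenize.tokenize() takes a readline callable that returns bytes).  The sample source line and the variable names are only illustrative:

```python
import io
import tokenize

# Illustrative input; any valid source bytes would do.
source = b"x = 1\n"

# tokenize.tokenize() yields an ENCODING token first.
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))
assert tokens[0].type == tokenize.ENCODING

# With the ENCODING token present, untokenize() returns bytes.
with_encoding = tokenize.untokenize(tokens)

# With the ENCODING token dropped, untokenize() returns str instead.
without_encoding = tokenize.untokenize(tokens[1:])

print(type(with_encoding).__name__)
print(type(without_encoding).__name__)
```

The two results hold the same text; only the return type differs.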

Either the docs should be clarified or untokenize() should be fixed.  My vote is to fix it.  It could check that the first token is an ENCODING token and raise an exception if it is not.  Alternatively, it could fall back to using 'utf-8' by default.
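
To make the two options concrete, here is a rough sketch written as wrappers around the existing function rather than as a patch to tokenize itself.  The helper names untokenize_strict and untokenize_utf8_default are hypothetical, and ValueError is only an assumed choice of exception:

```python
import io
import tokenize


def untokenize_strict(tokens):
    """Hypothetical wrapper: require a leading ENCODING token."""
    tokens = list(tokens)
    # Each token is a tuple whose first item is the token type.
    if not tokens or tokens[0][0] != tokenize.ENCODING:
        raise ValueError("first token must be an ENCODING token")
    return tokenize.untokenize(tokens)


def untokenize_utf8_default(tokens, default="utf-8"):
    """Hypothetical wrapper: always return bytes, defaulting to UTF-8."""
    result = tokenize.untokenize(tokens)
    # untokenize() returns str when it saw no ENCODING token.
    if isinstance(result, str):
        result = result.encode(default)
    return result


# Illustrative input with the ENCODING token stripped off.
tokens = list(tokenize.tokenize(io.BytesIO(b"x = 1\n").readline))
no_encoding = tokens[1:]

print(type(untokenize_utf8_default(no_encoding)).__name__)
try:
    untokenize_strict(no_encoding)
except ValueError as exc:
    print("rejected:", exc)
```

The strict variant surfaces the problem to the caller, while the fallback variant keeps the return type consistent with the documentation.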

[1] http://docs.python.org/py3k/library/tokenize.html#tokenize.untokenize

----------
messages: 172850
nosy: eric.snow
priority: normal
severity: normal
stage: test needed
status: open
title: untokenize returns a string if no encoding token is recognized
type: behavior
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue16223>
_______________________________________