[Python-Dev] issue2180 and using 'tokenize' with Python 3 'str's
Steve Holden
steve at holdenweb.com
Tue Sep 28 05:45:45 CEST 2010
On 9/27/2010 11:27 PM, Benjamin Peterson wrote:
> 2010/9/27 Meador Inge <meadori at gmail.com>:
>> which, as seen in the trace, is because the 'detect_encoding' function in
>> 'Lib/tokenize.py' searches for 'BOM_UTF8' (a 'bytes' object) in 'first', the
>> string to tokenize (a 'str' object). It seems to me that strings should
>> still be able to be tokenized, but maybe I am missing something.
>> Is the implementation of 'detect_encoding' correct in how it attempts to
>> determine an encoding, or should I open an issue for this?
>
> Tokenize only works on bytes. You can open a feature request if you desire.
>
Working only on bytes does seem rather perverse.
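For anyone following along, here is a minimal sketch of the behaviour under discussion, assuming a current Python 3 release rather than the 2010 tree. The 'source' string and the variable names are made up for illustration. It shows 'detect_encoding' rejecting a 'str'-returning readline, 'tokenize.tokenize' working once the text is encoded to bytes, and 'tokenize.generate_tokens', which in current releases accepts a readline that returns 'str'.

```python
import io
import tokenize

# Illustrative input; any small piece of valid Python source would do.
source = "x = 1 + 2\n"

# detect_encoding() compares the first line against BOM_UTF8 (bytes),
# so a readline that returns str fails with a TypeError.
try:
    tokenize.detect_encoding(io.StringIO(source).readline)
    str_input_rejected = False
except TypeError:
    str_input_rejected = True

# tokenize.tokenize() wants a readline that returns bytes, so encode first.
byte_tokens = list(
    tokenize.tokenize(io.BytesIO(source.encode("utf-8")).readline)
)

# generate_tokens() takes a readline that returns str and does no
# encoding detection, so it emits no leading ENCODING token.
str_tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

for tok in str_tokens:
    print(tokenize.tok_name[tok.type], repr(tok.string))
```

Apart from the leading ENCODING token in the bytes case, the two token streams carry the same token strings.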
regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
DjangoCon US September 7-9, 2010 http://djangocon.us/
See Python Video! http://python.mirocommunity.org/
Holden Web LLC http://www.holdenweb.com/