[Python-Dev] issue2180 and using 'tokenize' with Python 3 'str's
fuzzyman at voidspace.org.uk
Tue Sep 28 13:55:25 CEST 2010
On 28 September 2010 12:29, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
> On 28/09/2010 12:19, Antoine Pitrou wrote:
>> On Mon, 27 Sep 2010 23:45:45 -0400
>> Steve Holden<steve at holdenweb.com> wrote:
>>> On 9/27/2010 11:27 PM, Benjamin Peterson wrote:
>>>> 2010/9/27 Meador Inge<meadori at gmail.com>:
>>>>> which, as seen in the trace, is because the 'detect_encoding' function in
>>>>> 'Lib/tokenize.py' searches for 'BOM_UTF8' (a 'bytes' object) in the string
>>>>> to tokenize 'first' (a 'str' object). It seems to me that strings should
>>>>> still be able to be tokenized, but maybe I am missing something.
>>>>> Is the implementation of 'detect_encoding' correct in how it attempts to
>>>>> determine an encoding or should I open an issue for this?
>>>> Tokenize only works on bytes. You can open a feature request if you want.
>>> Working only on bytes does seem rather perverse.
>> I agree, the morality of bytes objects could have been better :)
> The reason for working with bytes is that source data can only be
> correctly decoded to text once the encoding is known. The encoding is
> determined by reading the encoding cookie.
> I certainly wouldn't be opposed to an API that accepts a string as well.
Ah, and to explain the design decision when tokenize was ported to py3k -
the Python 2 APIs take the readline method of a file object (not a string).
For this to work correctly in Python 3 it *has* to be a file object open in
binary read mode in order to decode the source code correctly.
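To make the bytes requirement concrete, here is a minimal sketch. The sample source and variable names are made up for illustration, and io.BytesIO stands in for a file opened in binary read mode. The readline passed to detect_encoding and tokenize.tokenize has to return bytes so that the encoding cookie can be read before anything is decoded.

```python
import io
import tokenize

# Hypothetical sample source: an encoding cookie followed by one statement.
source_bytes = b"# -*- coding: latin-1 -*-\nx = 1\n"

# detect_encoding reads at most two lines through a bytes-returning readline
# and reports the encoding named by the cookie (or a BOM, or the default).
encoding, first_lines = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)

# tokenize.tokenize takes the same kind of readline; the first token it
# yields is an ENCODING token carrying the detected encoding.
tokens = list(tokenize.tokenize(io.BytesIO(source_bytes).readline))
names = [tok.string for tok in tokens if tok.type == tokenize.NAME]

# A readline that returns str is what triggered the original report:
# the bytes BOM_UTF8 gets compared against a str first line.
try:
    tokenize.detect_encoding(io.StringIO("x = 1\n").readline)
except TypeError as exc:
    print("str input rejected:", exc)
```

With a real file the equivalent would be open(path, 'rb').readline.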
A new API that takes a string would certainly be nice. The Python 2 API for
tokenize is 'interesting'...
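As a rough idea of what a string-accepting entry point could look like, here is a sketch. The helper name tokenize_str is hypothetical and is not a proposed API. It leans on tokenize.generate_tokens, which later CPython releases document as taking a readline that returns str. Since the text is already decoded, no encoding detection happens and no ENCODING token is produced.

```python
import io
import tokenize


def tokenize_str(source):
    """Hypothetical helper: tokenize source that is already decoded text."""
    # generate_tokens expects a readline callable that returns str lines.
    return list(tokenize.generate_tokens(io.StringIO(source).readline))


tokens = tokenize_str("x = 1\n")
names = [tok.string for tok in tokens if tok.type == tokenize.NAME]
```

An encoding cookie inside such a string is just a comment, because decoding has already been done by whoever produced the str.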
All the best,
> All the best,