Other notes

Bengt Richter bokr at oz.net
Thu Jan 6 22:09:45 CET 2005


On Thu, 06 Jan 2005 19:24:52 GMT, Andrew Dalke <dalke at dalkescientific.com> wrote:

>Me
>>>>> (BTW, it needs to be 1 .. 12 not 1..12 because 1. will be interpreted
>>>>> as the floating point value "1.0".)
>
>Steve Holden:
>> Indeed, but if ".." is defined as an acceptable token then there's 
>> nothing to stop a strict LL(1) parser from disambiguating the cases in 
>> question. "Token" is not the same thing as "character".
>
>Python's tokenizer is greedy and doesn't take part in the
>lookahead.  When it sees 1..12 the longest match is for "1."
>
But it does look ahead to recognize += (i.e., it doesn't generate two
successive, also-legal tokens '+' and '='), so it seems it should be
a simple fix.

 >>> import tokenize, StringIO
 >>> for t in tokenize.generate_tokens(StringIO.StringIO('a=b+c; a+=2; x..y').readline):print t
 ...
 (1, 'a', (1, 0), (1, 1), 'a=b+c; a+=2; x..y')
 (51, '=', (1, 1), (1, 2), 'a=b+c; a+=2; x..y')
 (1, 'b', (1, 2), (1, 3), 'a=b+c; a+=2; x..y')
 (51, '+', (1, 3), (1, 4), 'a=b+c; a+=2; x..y')
 (1, 'c', (1, 4), (1, 5), 'a=b+c; a+=2; x..y')
 (51, ';', (1, 5), (1, 6), 'a=b+c; a+=2; x..y')
 (1, 'a', (1, 7), (1, 8), 'a=b+c; a+=2; x..y')
 (51, '+=', (1, 8), (1, 10), 'a=b+c; a+=2; x..y')
 (2, '2', (1, 10), (1, 11), 'a=b+c; a+=2; x..y')
 (51, ';', (1, 11), (1, 12), 'a=b+c; a+=2; x..y')
 (1, 'x', (1, 13), (1, 14), 'a=b+c; a+=2; x..y')
 (51, '.', (1, 14), (1, 15), 'a=b+c; a+=2; x..y')
 (51, '.', (1, 15), (1, 16), 'a=b+c; a+=2; x..y')
 (1, 'y', (1, 16), (1, 17), 'a=b+c; a+=2; x..y')
 (0, '', (2, 0), (2, 0), '')
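
For comparison, here is a rough sketch of the same experiment applied to the 1..12 case under discussion. It assumes Python 3 (io.StringIO instead of the StringIO module, and named token tuples), and token_strings is just a helper name made up for this illustration, not anything in the tokenize module.

```python
import io
import tokenize


def token_strings(source):
    """Return (token type name, token text) pairs for names, numbers and operators."""
    keep = (tokenize.NAME, tokenize.NUMBER, tokenize.OP)
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
        if tok.type in keep  # skip NEWLINE and ENDMARKER bookkeeping tokens
    ]


# Compare the multi-character operator, the dotted name, and both spellings
# of the proposed range syntax.
for src in ('a+=2', 'x..y', '1..12', '1 .. 12'):
    print(repr(src), '->', token_strings(src))
```

With the standard tokenizer, 'a+=2' should come back with a single '+=' operator, 'x..y' with two separate '.' operators, and '1..12' as the two numbers '1.' and '.12', while '1 .. 12' gives '1', '.', '.', '12'. So the number rule claims the first dot before any operator rule gets a chance at it.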

Regards,
Bengt Richter


