[Python-Dev] Small tweak to tokenize.py?

Thu Nov 30 22:46:01 CET 2006

On 11/30/06, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 10:28 AM 11/30/2006 -0800, Guido van Rossum wrote:
> >Are you opposed to changing tokenize? If so, why (apart from
> >compatibility)?
>
> Nothing apart from compatibility.  I think you should have to explicitly
> request the new behavior(s), since tools (like detokenize) written to work
> around the old behavior might behave oddly with the change.

Can you test it with this new change (slightly different from before)?
It reports an NL pseudo-token whose text value is '\\\n' (or
'\\\r\n' if the line ends in \r\n).

@@ -370,6 +370,8 @@
                 elif initial in namechars:                 # ordinary name
                     yield (NAME, token, spos, epos, line)
                 elif initial == '\\':                      # continued stmt
+                    # This yield is new; needed for better idempotency:
+                    yield (NL, token, spos, (lnum, pos), line)
                     continued = 1
                 else:
                     if initial in '([{': parenlev = parenlev + 1
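
To check what a given tokenize module reports for a continued statement, a
minimal sketch along these lines may help. The helper name token_strings and
the sample source are illustrative assumptions, not part of the patch, and
whether the final line prints True depends on whether the tokenize in use
includes a change like the one above.

```python
import io
import tokenize


def token_strings(source):
    """Return (type name, token text) pairs for every token in source."""
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
    ]


# A statement continued onto a second line with a backslash.
source = "x = 1 + \\\n    2\n"
pairs = token_strings(source)

# True only if some token's text starts with the backslash, as the
# NL pseudo-token proposed above would.
has_backslash_token = any(text.startswith("\\") for _, text in pairs)

for name, text in pairs:
    print(name, repr(text))
print("backslash reported as a token:", has_backslash_token)
```

If no token carries the backslash, a tool that rebuilds source from the token
stream has to infer the continuation from the jump in line numbers between
adjacent tokens.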

> Mainly, though, I thought you might find the code useful, given the nature
> of your project.  (Although I suppose you've probably already written
> something similar.)

Indeed.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)