[Python-ideas] Transportable indent level markers. >>>===<<<

Ron Adam ron3200 at gmail.com
Thu Dec 15 06:09:17 CET 2011


On Thu, 2011-12-15 at 13:40 +1000, Nick Coghlan wrote:
> On Thu, Dec 15, 2011 at 1:19 PM, Ron Adam <ron3200 at gmail.com> wrote:
> >> You'd probably also want an explicit ";;" token to force a
> >> token.NEWLINE into the token stream.
> >
> > That isn't needed.  Any of these in the middle of a line will add a new
> > line and back up, so the next call to tok_get() will find it, and so on.
> 
> OK, take the way you're thinking (indent +1, indent 0, indent -1) and
> instead think in terms of starting a suite, terminating a statement
> and terminating a suite:
> 
> /// -> {:
> ;;; -> ;;
> \\\ -> :}
> 
> Now do you see why I'm saying you're needlessly complicating things?

The complicated part is in getting across a new idea. ;-)


> Suite delimiters and statement terminators (or separators) are the way
> full whitespace insensitivity is normally handled when designing a
> language syntax. There's no reason to get creative here when the
> standard terminology and conventions would work just fine.
> 
> For example, it shouldn't be difficult to create a variant of the
> tokenize module's tokeniser that adds the following rules:

I'm not referring to a whole new variant, just a small tweak to the
current one.


> {: -> emits OP(':'), NEWLINE, INDENT and increments the parenlevel
> ;; -> emits NEWLINE
> :} -> emits NEWLINE, DEDENT and decrements the parenlevel

This is the same idea with different spelling in this case.  I think
yours would be a bit harder to implement, but not all that hard to do.
The version I proposed would be even easier to do.

Unless you intend to enforce matching '{:' and ':}' pairs, in which case
it becomes a bit harder still.  I'm not sure that is needed, but it may
seem strange to users if you don't require them to match.  At what point
should it be a part of the grammar?
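
For concreteness, here is a minimal sketch of the three quoted rules
written as a plain string transformation rather than a tokenizer change.
The name decode_suites is made up for illustration and is not part of
the tokenize module.  The sketch assumes the markers never occur inside
string literals or comments, and it does enforce matching, which is one
possible answer to the question above, not a settled one.

```python
import re


def decode_suites(text, indent="    "):
    """Expand '{:', ';;' and ':}' into newlines and indentation.

    Hypothetical helper: a text-level approximation of the quoted
    rules, not a change to the real tokenizer.
    """
    level = 0
    lines = []
    current = ""

    def flush():
        # Emit the pending statement (if any) at the current level.
        nonlocal current
        stmt = current.strip()
        if stmt:
            lines.append(indent * level + stmt)
        current = ""

    # Split on the three markers, keeping the markers in the result.
    for part in re.split(r"(\{:|;;|:\})", text):
        if part == "{:":
            current = current.rstrip() + ":"  # '{:' stands in for ':'
            flush()                           # NEWLINE
            level += 1                        # INDENT
        elif part == ";;":
            flush()                           # NEWLINE
        elif part == ":}":
            flush()                           # NEWLINE
            if level == 0:
                raise ValueError("':}' without a matching '{:'")
            level -= 1                        # DEDENT
        else:
            current += part
    flush()
    if level != 0:
        raise ValueError("'{:' without a matching ':}'")
    return "\n".join(lines) + "\n"


print(decode_suites(
    "for x in items {: sum = op(x) ;; if sum > MAX {: break :} :}"
))
```

Dropping the two ValueError checks gives the non-matching variant; the
rest of the logic stays the same.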


OK, how about an example: how would you notate this bit of code so it
can be pasted into an existing file without having to re-edit it?

  ;;; for x in items:
  \\\ sum = op(x)
  ;;; if sum > MAX:
  \\\ break

Let's say you are going to paste it into several locations in a file,
and those locations may have different indent levels.  How would you
write that using your suggestion so you don't have to change it in any
way for it to work?

With my suggestion, that would just work no matter the source file
indent level. (*)  All you need is the line numbers to insert it at.  I
think with your idea, you would also need a placeholder of some type.

   * Assuming the required names are present. ;-)
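
A rough sketch of how such a snippet could be expanded at an arbitrary
indent level.  It reads the markers the way the example above uses them:
';;;' keeps the level of the previous line and '\\\' goes one level
deeper.  Treating '///' as one level shallower is an assumption, since
the example never dedents.  The names expand_markers and base_level are
hypothetical, and the real proposal would live in the tokenizer, not in
a helper like this.

```python
# Marker -> change in indent level relative to the previous line.
# "\\\\\\" is three backslashes; '///' as -1 is an assumption.
MARKERS = {"\\\\\\": +1, ";;;": 0, "///": -1}


def expand_markers(snippet, base_level=0, indent="    "):
    """Turn marker-prefixed lines into indented source.

    base_level is the indent level of the line the snippet is pasted
    after, so the same snippet works at any depth.
    """
    level = base_level
    out = []
    for raw in snippet.splitlines():
        line = raw.strip()
        if not line:
            continue
        marker, _, rest = line.partition(" ")
        if marker not in MARKERS:
            raise ValueError(f"line lacks an indent marker: {raw!r}")
        level += MARKERS[marker]
        if level < 0:
            raise ValueError(f"dedent below column zero: {raw!r}")
        if rest.strip():
            out.append(indent * level + rest.strip())
    return "\n".join(out) + "\n"


SNIPPET = r"""
;;; for x in items:
\\\ sum = op(x)
;;; if sum > MAX:
\\\ break
"""

print(expand_markers(SNIPPET))                # pasted at top level
print(expand_markers(SNIPPET, base_level=2))  # pasted two levels deep
```

The snippet text is identical in both calls; only the level of the
insertion point changes.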


> and a variant of the untokenizer() that looks ahead and *emits* those
> character sequences when applicable.

That shouldn't be a problem in either case.

> That should be enough to let you use Python code in whitespace
> insensitive environments without changing the semantics:

Right, except I'd be happier with something that doesn't alter the
visible characters, the colon included.

> Encoding for transport: tokenize() normally, untokenize() with suite delimiters
> Decoding from transport: tokenize() with suite delimiter support,
> untokenize() normally

I think one of the differences is I'm proposing a mode that doesn't
require the whole file to be in matching form.

So the "tokenize() with suite delimiter support" would just be the same
tokenize() that is normally used.  And the "untokenize() with suite
delimiters" would probably just be a file that you run on the source to
get a delimited version.  I'm not proposing any untokenize() support.
But I'm not against it either. <shrug>  
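
As a sketch of the encoding side, something along these lines could
produce a delimited version from ordinary source using the standard
tokenize module.  The function name encode_suites is hypothetical.
Comments are dropped, tokens are simply joined with spaces, and
multi-line strings and f-strings are not handled, so this is only an
illustration of the idea, not a real untokenize() replacement.

```python
import io
import tokenize


def encode_suites(source):
    """Flatten `source` onto one line using '{:', ';;' and ':}'."""
    skip = {tokenize.COMMENT, tokenize.NL, tokenize.ENDMARKER}
    pieces = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in skip:
            continue
        if tok.type == tokenize.NEWLINE:
            pieces.append(";;")
        elif tok.type == tokenize.INDENT:
            # A suite header ends with ':' NEWLINE; fold both into '{:'.
            if pieces[-2:] != [":", ";;"]:
                raise ValueError("INDENT without a preceding ':' header")
            pieces[-2:] = ["{:"]
        elif tok.type == tokenize.DEDENT:
            # ':}' already implies a NEWLINE, so drop a redundant ';;'.
            if pieces and pieces[-1] == ";;":
                pieces.pop()
            pieces.append(":}")
        else:
            pieces.append(tok.string)
    # A trailing statement terminator carries no information.
    if pieces and pieces[-1] == ";;":
        pieces.pop()
    return " ".join(pieces)


SOURCE = (
    "for x in items:\n"
    "    sum = op(x)\n"
    "    if sum > MAX:\n"
    "        break\n"
)
print(encode_suites(SOURCE))
```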

I also think indent markers are more consistent with how people think
when they are programming in Python.  They match how the blocks are
defined.

If this was a language where we already had braces, then I'd definitely
be thinking the other way.  :-)

Cheers,
   Ron
