[IPython-dev] Splitting inputs, cell inputs in an IPython client

Wed Jul 17 23:13:36 EDT 2013

We've recently noticed an issue when hosting IPython in PTVS - we split the 
inputs differently than IPython does, so when you have an input with multiple 
statements that build up a plot, we end up outputting the individual portions 
of the plot rather than one final plot at the end.

I've looked at how IPython splits its inputs and noticed a couple of things.  
First, there's this docstring in the InputSplitter code:

        Return whether a block of interactive input can accept more input.

        This method is meant to be used by line-oriented frontends, who need to
        guess whether a block is complete or not based solely on prior and
        current input lines.  The InputSplitter considers it has a complete
        interactive block and will not accept more input only when either a
        SyntaxError is raised, or *all* of the following are true:

        1. The input compiles to a complete statement.

        2. The indentation level is flush-left (because if we are indented,
           like inside a function definition or for loop, we need to keep
           reading new input).

        3. There is one extra line consisting only of whitespace.

        Because of condition #3, this method should be used only by
        *line-oriented* frontends, since it means that intermediate blank lines
        are not allowed in function definitions (or any other indented block).

        If the current input produces a syntax error, this method immediately
        returns False but does *not* raise the syntax error exception, as
        typically clients will want to send invalid syntax to an execution
        backend which might convert the invalid syntax into valid Python via
        one of the dynamic IPython mechanisms.
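
To make sure I'm reading those rules correctly, here is a rough sketch of them
using only the standard library's codeop module.  This is not IPython's
InputSplitter code; accepts_more is a name I made up, and I'm assuming the
frontend hands over the lines typed so far without trailing newlines, so that
an empty final string stands for the blank line in condition 3.

```python
import codeop

def accepts_more(lines):
    """Return True if a line-oriented frontend should keep reading input.

    `lines` holds the lines typed so far, without trailing newlines.
    (Hypothetical helper, not part of IPython.)
    """
    # Joining with "\n" means a trailing empty string becomes a trailing
    # newline, which is how codeop recognises the closing blank line.
    source = "\n".join(lines)
    try:
        code = codeop.compile_command(source, "<input>", "single")
    except (SyntaxError, OverflowError, ValueError):
        # Invalid syntax: stop reading and let the backend decide what to do.
        return False
    # None means "incomplete so far"; a code object means the block is done.
    return code is None
```

With this, accepts_more(["x = 1"]) is False right away, while a function
definition keeps accepting lines until a blank one is entered.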

Then there's this comment (asterisks added for emphasis):

        # If we already have complete input and we're flush left, the answer
        # depends.  In line mode, if there hasn't been any indentation,
        # that's it. If we've come back from some indentation, we need
        # the blank final line to finish.
        # In cell mode, we need to check how many blocks the input so far
        # compiles into, because if there's already more than one full
        # independent block of input, then the client has entered full
        # 'cell' mode and is feeding lines that each is complete.  In this
        # case we should then keep accepting. The Qt terminal-like console
        # does precisely this, *to provide the convenience of terminal-like
        # input of single expressions, but allowing the user (with a
        # separate keystroke) to switch to 'cell' mode and type multiple
        # expressions in one shot*.
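
As I understand the cell-mode check, it comes down to counting how many
top-level blocks the input so far compiles into.  A minimal sketch of that
idea with the stdlib ast module follows; count_blocks and in_cell_mode are
hypothetical names, and IPython's own implementation may well differ.

```python
import ast

def count_blocks(source):
    """Number of top-level statements in `source`, or 0 if it doesn't parse."""
    try:
        return len(ast.parse(source).body)
    except SyntaxError:
        return 0

def in_cell_mode(source):
    """True once the input already holds more than one complete block."""
    return count_blocks(source) > 1
```

So "x = 1\ny = 2\n" counts as two blocks and would keep accepting input, while
a single function definition counts as one.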

So my question is: what do you think is the best way to map this onto 
our REPL running in VS?  We already have logic for detecting whether a statement 
is complete.  It's similar, but we never allow multiple complete statements in 
a row (which I guess is 'cell' mode).  

We could go about supporting this in a couple of different ways.  Option #1 is 
to simply model this behavior within PTVS.  We already have a parser and we're 
already looking for complete statements, so we'd just need to tweak our behavior 
around rules #2 and #3 from the docstring.  

Or, as option #2, we could send the text across the wire to IPython and ask 
whether IPython thinks the input is complete.  That would leave IPython in 
charge, so if you evolve this behavior, or if there's some way for users to 
provide their own line splitter, we'd get the latest and greatest behavior.

Orthogonal to either of those choices is whether we should expose a button 
somewhere that lets you quickly switch between 'line' and 'cell' mode, or whether 
we should just default to cell mode in our REPL when running in IPython mode.  
And with approach #2 we'd need to send the mode change over somehow too.

Do you guys have any thoughts on what we should do here?  I think #1 is probably 
easier for us to implement and would result in a better user experience.  For 
example, with option 2, if the remote IPython process has become unresponsive or 
crashed, the user will see slow response times because we'd be querying for input 
completeness on the UI thread.  But option 2 might evolve more gracefully if this 
parsing ever changes.
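
One way to soften the responsiveness concern would be to combine the two
options: ask the remote side with a short timeout and fall back to a local
check if it doesn't answer.  The sketch below is only an illustration in
Python; ask_remote is a hypothetical callable standing in for whatever
"is this complete?" request we would send, and I'm not assuming IPython
exposes such a message today.

```python
import codeop
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

_pool = ThreadPoolExecutor(max_workers=1)

def local_is_complete(source):
    """Local fallback: stdlib approximation of 'the input is complete'."""
    try:
        return codeop.compile_command(source, "<input>", "single") is not None
    except (SyntaxError, OverflowError, ValueError):
        # Invalid syntax counts as complete so it can be sent for execution.
        return True

def is_complete(source, ask_remote, timeout=0.25):
    """Ask the remote side, but never wait longer than `timeout` seconds.

    `ask_remote` is a hypothetical callable taking the source text and
    returning True or False.
    """
    future = _pool.submit(ask_remote, source)
    try:
        return future.result(timeout=timeout)
    except (FutureTimeout, ConnectionError):
        # Remote is slow, hung, or gone: answer from the local parser instead.
        return local_is_complete(source)
```

The calling thread then blocks for at most the timeout, at the cost of
occasionally disagreeing with IPython when the fallback kicks in.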

Thoughts?