[IPython-dev] Prefiltering input - where and how?

Fernando Perez fperez.net at gmail.com
Sat Mar 12 16:57:28 EST 2011

Hi Thomas,

On Sat, Mar 12, 2011 at 10:45 AM, Thomas Kluyver <takowl at gmail.com> wrote:
> Hi all,
> Robert found that calling macros caused problems with my new history system,
> and suggested that they be expanded by prefiltering, so that the translated
> history stores the macro content each time it's called. My attempt to do
> that is here:
> https://github.com/takluyver/ipython/tree/expand-macro
> However, I ran into trouble with the execution model - it seems to compile
> the macro in "single" mode, so only the first line gets executed. So I
> peered into the execution code, and ended up somewhat confused. I'm hoping
> someone can clarify:

Ah, welcome to some of our prettiest code ;)

I'll do my best to explain, and any improvements on this front will be
very much welcome.  Things are somewhat cleaner than they used to be,
but I think with one more round of work we'll finally have a fairly
comprehensible system.
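
For reference, the "single" mode symptom described in the quoted message can
be reproduced outside IPython. This is only a minimal sketch: the macro text
and the helper name `run` are made up for illustration and are not IPython
code. The "single" mode of the built-in compile() is meant for one
interactive statement, so a multi-statement macro needs "exec" mode (or
statement-by-statement handling) instead. Depending on the Python version,
"single" either compiles only the first statement or rejects the source
outright.

```python
# Hypothetical two-line macro body, used only for this illustration.
macro_source = "a = 1\nb = 2\n"


def run(source, mode):
    """Compile and execute source in the given mode.

    Returns the resulting namespace, or None if compilation failed.
    """
    ns = {}
    try:
        code = compile(source, "<macro>", mode)
    except SyntaxError:
        # Newer interpreters refuse multi-statement input in "single" mode.
        return None
    exec(code, ns)
    ns.pop("__builtins__", None)  # drop the entry that exec() adds
    return ns


exec_ns = run(macro_source, "exec")      # both statements run
single_ns = run(macro_source, "single")  # at most the first statement runs

print("exec:  ", exec_ns)
print("single:", single_ns)
```

Either way, `b` is never defined by the "single" run, which matches the
observation that only the first line of the macro takes effect.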

> - How much preprocessing is (and should be) done by IPython.core.prefilter
> versus IPython.core.inputsplitter?

prefilter is the OLD code, going all the way back to IPython's
beginning, where all processing was done in-line as the user typed it.
inputsplitter was a recent attempt at rationalizing input handling
that has by and large succeeded (even if it could still use some
cleanup).  The idea in inputsplitter is to first apply a set of static
transformations that are independent of anything in the user's
namespace, and which can all be tested/validated in a standalone
fashion.  This includes all explicitly escaped magics and similar
things, even in multiline blocks.  The old prefilter code was purely
line-oriented and very stateful, hence very difficult to test in

> - As far as I can see in the code, only a single line statement actually
> gets passed to prefilter. Yet prefiltering still seems to be going on if I
> enter a multiline block - and I'm not quite sure how

Yes, all multiline blocks are transformed by inputsplitter.  If
there's a single-line input, it does get fed to prefilter, though,
since only single-line statements are allowed to execute things like
magics without prefixes.

The idea was: for single-line statements, we want the convenience of
things like 'run foo' without the '%' prefix.  But that requires
checking the current namespace, so that must be done by a method of
the actual engine that has the namespace, and that's what the old
prefilter code does.
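
A rough sketch of why this step needs the namespace is shown below. It is not
the actual prefilter code: `prefilter_line`, `magic_names` and the exact form
of the generated call are assumptions made for illustration. The point is
only that the same input text is rewritten or left alone depending on what
the user has defined.

```python
def prefilter_line(line, user_ns, magic_names):
    """Rewrite 'run foo' style input into a magic call, if appropriate.

    Hypothetical helper: user_ns is the user's namespace and magic_names
    is the set of known magic names.
    """
    parts = line.strip().split(None, 1)
    if not parts:
        return line
    head = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    if head not in magic_names:
        return line  # not a magic name at all
    if head in user_ns:
        return line  # a user variable shadows the magic
    if rest.startswith("="):
        return line  # looks like an assignment such as 'run = 5'
    # Assumed output form; the real generated call may differ.
    return "get_ipython().magic(%r)" % line.strip()


magics = {"run", "cd"}
print(prefilter_line("run foo", {}, magics))
print(prefilter_line("run foo", {"run": 1}, magics))
```

Because the second call sees a user variable named `run`, the line is left
untouched. That decision cannot be made by a purely static transformation.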

All multiline statements are *only* processed by the inputsplitter
code, which has a reasonable test suite and can be validated in
isolation, as it is completely specified in terms of static
transformations that do not depend on any namespace.
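
By contrast, a static transformation can be written as a pure function of the
text. The sketch below is an illustration under assumptions, not the
inputsplitter implementation: `transform_escaped` and `transform_block` are
hypothetical names, only the `%` escape is handled, and the generated call
form is assumed.

```python
def transform_escaped(line):
    """Rewrite a line starting with '%' into a magic call, keeping indent."""
    stripped = line.lstrip()
    if not stripped.startswith("%"):
        return line
    indent = line[: len(line) - len(stripped)]
    # Assumed output form; the real generated call may differ.
    return "%sget_ipython().magic(%r)" % (indent, stripped[1:])


def transform_block(block):
    """Apply the line transformation to every line of a multiline block."""
    return "\n".join(transform_escaped(l) for l in block.split("\n"))


print(transform_block("for i in range(2):\n    %time f(i)"))
```

Since no namespace is consulted, functions like these can be tested with
plain string comparisons. A real implementation would also have to avoid
rewriting lines that sit inside multiline strings, which this sketch ignores.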

> - Should prefiltering be in the frontend or the core? I'm guessing the core,
> but there's some prefiltering code in the terminal frontend, although I
> think it's dead, because:

Yes, prefiltering should be done in the core, because that's where the
actual namespace is.  For all we know, a frontend might not even be
Python (e.g. JavaScript in a browser), and we don't want to add
communication calls for prefiltering; I think that would be
unnecessary added complexity.

> - Why does TerminalInteractiveShell have two raw_input methods defined? I'm
> guessing the first one is older code: can I delete it to simplify matters?
> It's still there in version control if we need to refer to it.

Yup, the older one is leftover, feel free to clean it.

Ultimately we do want to:

- improve inputsplitter further (it's still a bit convoluted, though
at least well specified and tested).
- shrink prefilter to the absolute minimum.  In the interest of
caution, when we wrote the inputsplitter code we left most of the old
stuff in place, but much of that is likely now unused.  There should
only be a tiny amount of code that applies the dynamic transforms all
in one place and nothing else.

This code has been refactored but we've tiptoed around deleting
anything, because it's so much at the heart of the execution loop.
But it's been a few months since we've been using the new system
mostly without problems, so I think now it's OK to be a bit more
aggressive in removing anything you see as a good cleanup candidate.

I'm working this weekend on accumulated things for my day job, but
feel free to ping me if you need to meet on IRC/Skype for short
questions on any of this; I don't want to stall you out.


