[IPython-dev] cell magics

Sat Feb 16 17:07:04 EST 2013

On 2/16/13 12:47 AM, Fernando Perez wrote:
> Hi Jason,
>

>> PROPOSAL:
>>
>> Line and cell magics are normal python functions that take a string as
>> their first input.  The string is either the rest of the line, or if
>> there is no discernible string on the rest of the line, the string is
>> taken as the rest of the cell.  As an example:
>>
>> %timeit(runs=10) 2+3
>>
>> which gets translated to timeit("2+3", runs=10)
>>
>> A cell magic is exactly the same function, only it is invoked by putting
>> the string on the following lines, rather than on the same line, so the
>> string is taken as the remainder of the cell:
>>
>> %timeit(runs=10)
>> 2+3
>> 5+6
>>
>> which gets translated to timeit("2+3\n5+6", runs=10)
>>
>> Notice there is no distinction between line and cell magic
>> functions---the distinction entirely happens in how they are invoked
>> (whether the string is on the same line or on following lines).  Also,
>> magic options are passed in exactly the same way that arguments are
>> always passed to python functions, instead of having a non-pythonic,
>> bash-like syntax for options.
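
To make the translation concrete, here is a rough sketch of it in
python.  The translate function and its regular expression are my own
illustration, not anything that exists in IPython, and the pattern does
not handle nested parentheses in the arguments:

```python
import re

# Hypothetical pattern: "%name", an optional "(...)" argument list, then
# the rest of the first line.  Nested parentheses are not supported.
_HEADER = re.compile(
    r"%(?P<name>\w+)(?:\((?P<args>[^)]*)\))?[ \t]*(?P<rest>.*)"
)


def translate(cell):
    """Turn a %-invocation into the source of an ordinary function call."""
    first, _, following = cell.partition("\n")
    match = _HEADER.fullmatch(first)
    if match is None:
        raise ValueError("not a string decorator invocation: %r" % first)
    # A string on the same line means a line invocation; otherwise the
    # string is taken as the remainder of the cell.
    text = match["rest"] or following
    args = match["args"]
    call_args = repr(text) + (", " + args if args else "")
    return "%s(%s)" % (match["name"], call_args)


print(translate("%timeit(runs=10) 2+3"))
print(translate("%timeit(runs=10)\n2+3\n5+6"))
```

Both calls go through the same code path; the only thing that changes
is where the string comes from.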

I've spent all day (and the better part of a night :) writing and
rewriting this reply; hopefully it's clear.  As I mentioned before,
I'm still forming an opinion and working on seeing things clearly.
Special thanks go to William, whose private comments helped me see
more clearly what is a real difference between the two philosophies
and what is an implementation detail.

To summarize, I think the two positions are:

Current IPython:
1a. % is only a valid operator on registered string decorators
2a. Two different syntaxes for invoking line and cell string decorators
3a. optional arguments to string decorators should not have pythonic syntax

In our proposal, we say:

1b. % should be a valid operator on any string decorator
2b. The % cell/line invocation syntax should be unified
3b. optional arguments to string decorators should have pythonic syntax

Is that a fair characterization?

Let's take these in turn:

1. Where % is a valid operator
==============================

I think your argument boils down to "let's make % only apply to 
registered string decorators so that people will think in a certain way 
when they use %".  That may have held when you had tight control on 
exactly what was registered, but now with everyone and their mother 
implementing custom extensions doing who knows what, it's artificially 
restrictive to insist that % means "modify the editor or go beyond
python".  Thomas points out that the scope of % directives has already
grown massively beyond the original idea.  With custom extensions,
the scope of what is accomplished and how a user thinks when they use %
blows wide open.

With our proposal, %<tab> could still just complete with registered 
string decorators (with even the exact same registration system, so you 
can register things outside of the user's namespace).  The difference is 
that if I wanted to run a string decorator that wasn't in the registered 
namespace, but I had just defined in my user namespace, I wouldn't have 
to pollute the registered namespace for my one invocation.  I could just 
run %my_function directly.  I think this is particularly important 
because the registered % namespace is a flat list of names, so it's not 
a very organized namespace (which is exactly a problem you have with 
Sage's philosophy, ironically :).
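
As a rough sketch of the lookup I have in mind (the REGISTERED dict,
register, resolve and completions are all made-up names for
illustration, and checking the registered names before the user
namespace is my assumption about the order):

```python
# Hypothetical stand-in for the flat namespace of registered decorators.
REGISTERED = {}


def register(func):
    """Add func to the registered namespace and return it unchanged."""
    REGISTERED[func.__name__] = func
    return func


def resolve(name, user_ns):
    """Find the callable for %name: registered names, then user names."""
    if name in REGISTERED:
        return REGISTERED[name]
    candidate = user_ns.get(name)
    if callable(candidate):
        return candidate
    raise NameError("no string decorator named %r" % name)


def completions(prefix):
    """%<tab> still offers only the registered names."""
    return sorted(n for n in REGISTERED if n.startswith(prefix))


@register
def shout(text):
    return text.upper()


def my_function(text):  # defined by the user, never registered
    return text[::-1]


user_ns = {"my_function": my_function}
print(resolve("my_function", user_ns)("abc"))  # runs the unregistered one
print(completions("my"))  # my_function is not offered by completion
```

So %my_function works for my one invocation without adding anything to
the registered namespace, and tab completion stays as tidy as it is
now.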

2. Unification of user syntax for cell/line decorators
======================================================

You only address the fact that you have special ways to make it easy for 
a developer to define both a line and cell magic.  That's an 
implementation detail, and it's easier with our proposal anyway since 
everything is automatically both.  We see already that there are a 
number of decorators that could be both, but are only defined as one, 
like %ruby, for example.

But the real issue here is that IPython has two different syntaxes for 
the users to invoke string decorators.  That means that I, as a user, 
constantly have to remember if a command is one, or the other, or both, 
because the very first two characters are different between the two. 
For example, just now I typed %ru<tab>, and I see from the completions 
that ruby is only a cell decorator.  So now I have to backspace and put 
that extra % in there.  It also means that if a string decorator can be 
invoked as both, in order to change between the two, I have to not only 
put the string in a different place, but I also have to go back and 
modify the % part too.  If I forget to modify the %, I get very
confusing results.

With our proposal, these issues go away.  All string decorators are 
invoked with %.  The difference between a line or cell decorator is 
determined by whether or not there is a string on the line (just as the 
difference between

if blah: something

and

if blah:
     something
     something else

is (partly) just seen by looking to see if the block is on the same 
line).  So to change between cell and line string decorators, all I have 
to do is move the line.  For example:

%time some_function

Now I want to add a few more things to the time run.  All I have to do
is change where the string is:

%time
some_function
some_other_function

and the lack of a string after %time tells me that I should look below 
%time for the string.  I don't have to constantly keep adjusting the % 
character(s).

P.S. Actually, after thinking about it more, since a single % is 
ambiguous syntax, I think I would prefer all string decorators be 
invoked with %%, which is invalid python syntax.  Then we won't have 
this problem:

cd=5
a=4\
%cd

or this problem:

cd=4
a="time: %d"\
%cd
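
Both of those snippets are already valid python, because the backslash
joins the two lines and % is then read as the modulo or string
formatting operator.  A quick check (value_of_a is just a helper I made
up for this message):

```python
def value_of_a(source):
    """Run source and return the value bound to `a`, or the SyntaxError."""
    namespace = {}
    try:
        exec(compile(source, "<cell>", "exec"), namespace)
    except SyntaxError as err:
        return err
    return namespace["a"]


modulo = "cd=5\na=4\\\n%cd\n"                 # parsed as a = 4 % cd
formatting = 'cd=4\na="time: %d"\\\n%cd\n'    # parsed as a = "time: %d" % cd
doubled = "cd=5\na=4\\\n%%cd\n"               # %% cannot be an operator

print(value_of_a(modulo))
print(value_of_a(formatting))
print(type(value_of_a(doubled)).__name__)
```

The first two run without complaint, while the %% version fails to
compile, which is exactly why %% is safe to claim for string
decorators.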

3. Format of optional arguments
===============================

It seems that there are two main arguments for keeping the bash-like
option syntax:

* backwards compatibility (with IPP and previous IPython versions, as 
well as IDL and matlab apparently?).  But I'll point out that we can 
easily support this too, in almost exactly the same invocation:

%timeit('-r 5 -and -other -options') 2+3

* also, you made the argument that it should be command-line type syntax 
because it is visually distinctive from python.  I think this argument 
presupposes that:

(a) users are comfortable with bash-like syntax.  With the rising 
popularity of IPython, and especially with the rising popularity of the 
notebook, I think we're going to see more and more non-unix users that 
see the bash-like syntax as one more *new* thing to learn, rather than 
something that is already familiar in a different context.  In fact, I 
look at the %timeit syntax for example, and I have to try to remember 
what the options are each time.  Compare:

%timeit -r5 2+3

%timeit(runs=5) 2+3

I think it's pretty clear which statement is more readable.  This focus 
on readability over brevity (remember, "Readability counts") is part of 
why python is so good in general.

I look at the %R option syntax, and I absolutely *have* to go to the 
help to see what is going on.  Supporting a pythonic syntax I think will 
help users tremendously.

Of course, this readability stuff is just a convention.  You could write

%timeit --runs=5 2+3

but the point is that encouraging a bash-like syntax is casting a vote
for brevity over readability.  Considering that each extension will
implement its own short options, I think I vote for readability, so I
have a better chance of looking at a string decorator invocation and
having an idea of what it is doing.

(b) the bash-like syntax is just as powerful.  But it's certainly not. 
With a bash-like syntax that isn't valid python code, but just a string, 
I cannot pass in python objects directly.  I can do that if my 
parameters are valid python syntax.  This is a lesson IPython learned 
when it decided on a rich pyout/display_data/stream output system 
instead of just passing on a stdout string.  It makes sense to apply 
that same lesson here to the string decorator inputs.

(c) we need visually distinctive syntax because % directives are very
different from python code.  But as I mentioned above with custom user
extensions, I don't think this idea that % directives are command-line
sorts of things applies as much anymore because the field is wide open.
Regardless, if we *have* to have bash syntax for some extension, it's
easy to make an invocation like:

%script('-bg')
blah
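
To illustrate both the backwards-compatible form and point (b), here is
a small sketch.  The evaluate function, its -r flag and its parameter
names are hypothetical; it is only meant to show one function accepting
either a legacy option string or ordinary keyword arguments:

```python
import shlex


def evaluate(code, legacy_options="", runs=1, namespace=None):
    """Hypothetical string decorator: evaluate `code` `runs` times."""
    # Legacy form: the options arrive as one bash-like string ("-r 5").
    tokens = shlex.split(legacy_options)
    while tokens:
        flag = tokens.pop(0)
        if flag == "-r":
            runs = int(tokens.pop(0))  # every value arrives as text
        else:
            raise ValueError("unknown option: %s" % flag)
    namespace = {} if namespace is None else namespace
    return [eval(code, namespace) for _ in range(runs)]


# What %evaluate('-r 5') 2+3 would translate to:
print(evaluate("2+3", "-r 5"))

# What %evaluate(runs=2, namespace=ns) sum(data) would translate to;
# the dict is handed over as a live python object, not as a string.
ns = {"data": [1, 2, 3]}
print(evaluate("sum(data)", runs=2, namespace=ns))
```

The option string can only ever carry text that the function has to
parse again, whereas the keyword form can carry any object I already
have in my session.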

I think I just convinced myself that the new proposal really is a better 
plan.

Whew!  Okay, so...comments?

Thanks,

Jason