[IPython-dev] History

Fri Feb 18 11:54:40 EST 2011

Hi,

On Thu, Feb 17, 2011 at 2:31 PM, MinRK <benjaminrk at gmail.com> wrote:

>>> Hmm, what about concurrent sessions?  (Versus "previous" sessions which
>>> have been quit.)
>>
>> Interesting idea, but more complicated to implement. You'd have to somehow
>> have separate shells signal their presence. I think this is a separate
>> consideration - I'm looking at what we can get for free by restructuring the
>> history.
>
> I think the question here might be: what would the history look like at
> startup if I've been running several sessions simultaneously, and I start a
> new one?
> <start IP0,IP1,IP2>
> IP0: foo='bar'
> IP1: a=1
> IP2: b=2
> IP1: c=3
> <close IP0>
> IP2: d=4
> <start IP3>
> i.e. what is the 'previous' session?  Is it 'most recently closed' or 'most
> recently started' or 'most recently executed'?  You seemed to suggest above
> that IP3's readline history at startup should go d,c,b,a,foo (as I continue
> to press up) Is that correct?  I like this approach, but there's also value
> in keeping related commands (same session) adjacent to each other, so I
> don't think it's an obvious choice.  In most real cases, this probably
> doesn't make much difference.

This is really the key problem to think about, and I'm not sure I have
a full solution even in my head yet.  For now, a few thoughts...

In a sense, the history problem can be seen as similar to the evolution
of a git repository, where you start with the master branch (the
history of your first session ever).  If you close and open a single
session again (but no concurrent ones), then each of these operations
is akin to a new commit, just adding to the old history.  But if a
concurrent session is opened, this is like making a new branch.

However, from a user experience standpoint, we don't want to expose
this kind of complexity in everyday work: people shouldn't need to
think of doing git branch merges to manage their history :)

And in actual usage, you tend to think of your history as a linear
object running back in time.  But as we've seen clearly, it's not
linear, it's a DAG.  So the question is: how to best flatten it, so
that the user experience is automatic and shows a reasonable
linearization of the DAG, and yet it's useful and powerful (better
than today's behavior where a lot of clobbering happens from
concurrent sessions)?
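
To make the flattening question concrete, here is a minimal sketch of one
possible linearization: a plain merge of all sessions by time.  It assumes
each history line carries a timestamp (a simple counter here); the names
Entry and linearize are hypothetical and not existing IPython code.  It uses
the interleaved IP0/IP1/IP2 example quoted above.

```python
import heapq
from typing import NamedTuple


class Entry(NamedTuple):
    timestamp: int  # assumed: a logical clock shared by all sessions
    session: str
    source: str


def linearize(*sessions):
    """Merge per-session histories into one list, oldest entry first.

    Each input list must already be sorted by timestamp.
    """
    return list(heapq.merge(*sessions, key=lambda e: e.timestamp))


# The interleaved example from the quoted message.
ip0 = [Entry(1, "IP0", "foo='bar'")]
ip1 = [Entry(2, "IP1", "a=1"), Entry(4, "IP1", "c=3")]
ip2 = [Entry(3, "IP2", "b=2"), Entry(5, "IP2", "d=4")]

flat = linearize(ip0, ip1, ip2)

# What a new session would see when pressing "up" repeatedly.
up_arrow_order = [e.source for e in reversed(flat)]
print(up_arrow_order)
```

This gives the d, c, b, a, foo ordering, at the cost of interleaving commands
from unrelated sessions; grouping by session is the other obvious option.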

If we start keeping all the sessions around, though, after a while it
will get very unwieldy to manage this, as we'll probably need to use
uuids to tag them, and tracking multiple sessions like this will be a
pain.  But we can't fully linearize on every save operation, because
we'd lose the ability to track open concurrent sessions.

I have to run to a talk soon... Just as a sketch of how it could work:
there's a notion of a master session (to borrow git's terminology
deliberately); on opening, ipython marks it as in use (locks it).  If
a second session is opened (it sees the active one as locked), a
branch is made, tagged with a uuid and timestamp.  On startup, if
there are branches available they get merged into the current view,
but grouped together with most recent available first.  And only up to
5 branches are allowed to be 'in flight', so that the system doesn't
grow too complicated.  We may want to think about a few commands for
users to manage this manually if they want...
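
A rough in-memory sketch of the scheme above, just to pin down the moving
parts.  Everything here is an assumption made for illustration: the names
HistoryStore and Branch are hypothetical, the "lock" is a plain flag rather
than a real file lock, raising an error at the branch limit is only one
possible policy, and "grouped, most recent first" is read as "newest branch
first, master last".

```python
import time
import uuid
from dataclasses import dataclass, field

MAX_BRANCHES = 5  # at most 5 branches 'in flight'


@dataclass
class Branch:
    name: str       # 'master' or a uuid
    created: float  # timestamp used to order branches
    lines: list = field(default_factory=list)


class HistoryStore:
    def __init__(self):
        self.master = Branch("master", 0.0)
        self.master_locked = False
        self.branches = []

    def open_session(self, now=None):
        """Return the history a new session should write to."""
        now = time.time() if now is None else now
        if not self.master_locked:
            self.master_locked = True  # first session takes the master
            return self.master
        if len(self.branches) >= MAX_BRANCHES:
            # Assumed policy; the limit's behavior is left open above.
            raise RuntimeError("too many concurrent history branches")
        branch = Branch(uuid.uuid4().hex, now)
        self.branches.append(branch)
        return branch

    def close_session(self, branch):
        if branch is self.master:
            self.master_locked = False

    def startup_view(self):
        """History lines, newest first, with each branch kept together."""
        ordered = sorted(self.branches, key=lambda b: b.created, reverse=True)
        view = []
        for b in ordered + [self.master]:
            view.extend(reversed(b.lines))
        return view


store = HistoryStore()
s0 = store.open_session(now=1.0)  # gets the master
s1 = store.open_session(now=2.0)  # master is locked, so this is a branch
s2 = store.open_session(now=3.0)  # a second branch
s0.lines.append("foo='bar'")
s1.lines += ["a=1", "c=3"]
s2.lines += ["b=2", "d=4"]
print(store.startup_view())
```

Unlike a pure time-ordered merge, this keeps the commands of each session
adjacent to each other in the view a new session starts with.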

Profiles could allow long-lived branches: there's a 'master'
per-profile, but otherwise the same logic as above applies.

Incomplete ideas as food for thought, but I have to run now...

>>> Are you kidding me?  One session equals roughly 1400 lines for me. :-)
>>>
>>> That's a good question, how much history should be available from
>>> "previous" sessions.
>>
>> But how much of that do you access through readline? It can of course be
>> user configurable, but we should set a sensible default. 40 is on the low
>> side. I assume readline is fairly fast: maybe a couple of hundred is a good
>> limit?
>
> I would think a default should be something like 500 at the very least.  I
> use ctrl-r all the time, and basically expect that if I remember typing it,
> IPython should remember as well.  If you are just using arrow-keys, then
> more than a few dozen isn't particularly valuable, but reverse-i-search
> should find anything I remember well enough to look for.

Yes, Ctrl-r should have O(1000) lines of history at least available to
it.  It's easy to use, fast, and very powerful.  Many people live and
die by it; even if it's a somewhat more advanced readline feature,
for power users it's a very important one.
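
As a sketch of what such a default could look like, the snippet below keeps
only the most recent lines and hands them to Python's readline module when it
is available.  The function name load_recent and the default of 1000 are
assumptions for illustration, not existing IPython settings.

```python
def load_recent(lines, limit=1000):
    """Return the last `limit` history lines and feed them to readline.

    `limit` must be positive.  If the readline module is not available
    (e.g. on some platforms), the lines are returned without being loaded.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    recent = list(lines)[-limit:]
    try:
        import readline
    except ImportError:
        return recent
    readline.set_history_length(limit)  # cap what readline keeps around
    for line in recent:
        readline.add_history(line)      # makes the line searchable via Ctrl-r
    return recent


# A long stored history: only the newest 1000 lines are made available.
stored = ["x = %d" % i for i in range(1400)]
recent = load_recent(stored)
print(len(recent), recent[0], recent[-1])
```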

Cheers,

f