[IPython-dev] History

Fri Feb 18 14:14:18 EST 2011

On 18 February 2011 15:48, Fernando Perez <fperez.net at gmail.com> wrote:

> Instant saving has one problem: frequent disk usage prevents hard
> drives from spinning down when on battery.  The idea of an auto-save
> thread on a timer with a user-controllable delay has the advantage
> that the user can control their power consumption profile to fit their
> needs.
>

In the default configuration, I think we already have disk access on each
command as it's written to the shadow history. Admittedly that can be
disabled without affecting the main function of the program.

We could write a powersaver configuration, which disables the immediate
writes to the database, but registers an atexit handler to squirt the
session in when IPython is closed. I certainly want to cater for people
trying to eke out every last bit of battery life, but at the same time I
don't want all the users who aren't relying on a battery to miss out on what
I hope is a simpler, more powerful design for history.
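
A minimal sketch of that powersaver idea, to make the trade-off concrete.
The class, table and column names are placeholders invented for this
example, not what the prototype uses; the only real APIs are the standard
library's atexit and sqlite3 modules.

```python
import atexit
import sqlite3


class DeferredHistory:
    """Hypothetical sketch: buffer inputs in memory, write them once at exit."""

    def __init__(self, db_path):
        self.db_path = db_path
        self.pending = []  # (line_number, source) tuples not yet on disk
        # Run flush() when the interpreter shuts down normally.
        atexit.register(self.flush)

    def store_input(self, line_number, source):
        # No disk access here, so the drive can stay spun down.
        self.pending.append((line_number, source))

    def flush(self):
        if not self.pending:
            return
        conn = sqlite3.connect(self.db_path)
        try:
            with conn:  # one transaction for the whole session
                conn.execute(
                    "CREATE TABLE IF NOT EXISTS history "
                    "(line INTEGER, source TEXT)"
                )
                conn.executemany(
                    "INSERT INTO history VALUES (?, ?)", self.pending
                )
        finally:
            conn.close()
        self.pending = []
```

The cost is the obvious one: atexit handlers do not run if the process is
killed or crashes hard, so anything still in the buffer is lost. That is
why this would be an opt-in configuration rather than the default.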

Summarising your other points:
>Store translated history
Indeed - I've already changed my mind on this thanks to comments from other
people. My prototype is storing both raw and translated history.

>Advantages of SQLite?
Robert's given a few. While JSON is a great tool, I don't think it's the
"one obvious way to do it" here: you have to parse the whole file to read
any of it, which is slow for a large file, and a single byte out of place
can cost you your entire history. Pickleshare isn't the obvious choice
either: it is essentially a key-value database, not designed to store a
sequence. SQLite can essentially combine what we're trying to do with both,
and do it "right".
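
To illustrate the point about sequences and partial reads, here is a small
sketch using the standard sqlite3 module. The table layout and the
search_prefix helper are assumptions made up for this example, and the
"translated" string is only illustrative of what input processing might
produce.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a real history would live in a file
conn.execute(
    "CREATE TABLE history ("
    " session INTEGER, line INTEGER,"
    " source TEXT, source_raw TEXT,"
    " PRIMARY KEY (session, line))"
)

# (session, line, translated source, raw source as typed)
entries = [
    (1, 1, "import numpy as np", "import numpy as np"),
    (1, 2, "get_ipython().magic('pylab')", "%pylab"),  # illustrative only
    (2, 1, "import os", "import os"),
]
with conn:  # commits on success, rolls back on error
    conn.executemany("INSERT INTO history VALUES (?, ?, ?, ?)", entries)


def search_prefix(conn, prefix):
    """Return raw inputs that start with prefix, oldest first."""
    rows = conn.execute(
        "SELECT source_raw FROM history "
        "WHERE substr(source_raw, 1, ?) = ? "
        "ORDER BY session, line",
        (len(prefix), prefix),
    )
    return [raw for (raw,) in rows]


matches = search_prefix(conn, "import")
```

Each row is keyed by session and line number, so order is explicit, and a
search only returns the matching rows rather than requiring the caller to
load and parse the entire history first.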

Also, I don't think the way we're reloading the history at present is tidy.
It gets lumped in with the current session, which causes the indexing
problems I already discovered. I've worked round those with session_offset,
but it's really a workaround. I could rethink the saving/reloading with
JSON, but when I tried thinking about it, I just decided JSON wasn't the
best tool for the job.

>Input processing
Thanks for the info :)

>Ordering multiple sessions:
While it's worth some thought, let's not lose too much sleep over the order.
As we've determined, if you want to go more than a few lines back in the
history, you search it somehow, in which case order doesn't matter much. I
never actually knew about Ctrl-R: I just type the first few letters of what
I want, then press up.

There are all sorts of options we could have for named sessions, for
repeating previous sessions, and so on, but I've not explored any of that
yet. As before, profiles have, for now, entirely separate history files - I
don't know if we want to keep that, or integrate them into one to allow
accessing history from other profiles.

>Tracking concurrent sessions:
Sessions can clearly leave some signal of their existence on disk, but if
they crash without clearing it up, other shells will see a zombie shell
still there. I don't think there's a payoff worth the trouble of tracking
'live' shells: we can just share the database with whatever other shells are
around.

>Readline history - 1000 lines
OK, I've made the prototype load 1000 lines by default. We can always up
that if we want.
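
For reference, loading the tail of the history is a single query. This is
a sketch under assumed names (a history table with session, line and
source_raw columns, and a load_tail helper), not the prototype's actual
code.

```python
import sqlite3


def load_tail(conn, limit=1000):
    """Return the most recent `limit` raw inputs, oldest first."""
    rows = conn.execute(
        "SELECT source_raw FROM history "
        "ORDER BY session DESC, line DESC LIMIT ?",
        (limit,),
    ).fetchall()
    # The query yields newest first; readline wants oldest first.
    return [raw for (raw,) in reversed(rows)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history (session INTEGER, line INTEGER, source_raw TEXT)"
)
with conn:
    conn.executemany(
        "INSERT INTO history VALUES (?, ?, ?)",
        [(1, 1, "a = 1"), (1, 2, "b = 2"), (2, 1, "a + b")],
    )

recent = load_tail(conn, limit=2)
```

Raising the default is then just a matter of changing the limit.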

(Robert/Brian) > Notebooks
[Snip] OK, Robert's clarified now that he meant something more like a
logbook than the UI option we've called a notebook.

Robert: storing the output is a bit trickier. We can store the repr of it
easily enough, which might be what you want. We could pickle output items
and store them, but that could quickly make the database file very large. If
you want to save a session with output, I believe you can use "%hist -of
session.txt".

(Brian) > Consistent API, alternative backends

Parts of the history_manager API can easily be abstracted - like
store_inputs and get_history. But there's a more conceptual difference in
what I'm doing, too: I'm getting rid of the idea of saving and loading
history, in favour of a model where it's immediately persisted. We can
expand that to include delayed persistence (as I've suggested above for
extending battery life), but we have to choose whether our API is written in
terms of "save" and "load", or in terms of "push cache to persistent store".

Also, while I'm all in favour of clean, well documented APIs, I think
storing history is the sort of thing that we as the creators should get
right, and users should never have to worry about. If someone wants to write
their own history manager for some reason, that's fine, but I don't think we
need to design around the assumption that people will. Let's focus on
writing a history system, not on an interface for possible history systems.

Whew! This seems to be a thread of epic e-mails. Thanks everyone for your
thoughts, and I hope I've answered any questions people had.

Thomas