persistent deque (continued)

castironpi castironpi at gmail.com
Sun Jul 27 00:59:08 EDT 2008


On Jul 21, 5:20 pm, Raymond Hettinger <pyt... at rcn.com> wrote:
> On Jul 21, 12:08 pm, castironpi <castiro... at gmail.com> wrote:
>
> > Some time ago, I was asking about the feasibility of a persistent
> > deque, a double-ended queue.
>
> > It runs into the typical space allocation problems.  
>
> Try starting with a dict-based implementation of a double-ended queue
> (http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/259179)
> and then replace the dict with a shelf.  Presto, you've got a
> persistent deque.
>
> Raymond

We have the cookbook recipe Raymond brings up, and an automatic
pickler that inhahe brought up in May.

I've initiated the approach to the solution I have in mind, but the
dilemma comes early and isn't pleasant.  I can either (i) recompile
the whole interpreter, replacing malloc with a custom function that
allocates from disk; or (ii) duplicate every class I want to be
persistable, making every change to the allocations by hand.

Disk-resident objects can clearly not contain references to any
objects not on disk (arguably at finalization-stage only), and
references to ones that are need to be wrapped by offset, and 'made
live' at initialization-time, so that references to the mapped file
resume succeeding after termination and restart--- the mapped memory
segment won't be the same from run to run.

Further, if I can't hijack the existing arena-pool model native in
Python, it would take writing a dedicated memory manager.

Assuming the whole memory isn't moved to disk, which is feasible, the
usage would look like this, to ensure that 'disk-resident' objects
don't refer to 'live' ones.

listA= perst( [ perst( 0 ), perst( 1 ), perst( "abc" ) ] )

to create the list [ 0, 1, "abc" ].  Modifications, e.g. append "def",
would look like this:

listA.append( perst( "def" ) )

and would execute the addition directly to mapped memory.  The memory
file would then have five objects: 0, 1, "abc", "def", and this list.

It is not clear that normal assignments would still work, self.b= c,
even for disk-resident c, since on a later run, 'self.b' would refer
to a different place in memory.  It might take either trapping updates
to self.__dict__, or something deeper at the level of the ast and
Assm't statements.  Or an extra level of indirection in accessing it.

>>> self.b
<C instance at index 12 in mem-mapped file 'data.dat'>

The indirection would consist of an offset-finder and a delegation
layer.

>>> self.b.append( perst( set( ) ) )

It would first have to look-up 'self.b', as opposed to accessing it
directly by memory address.

What I want from the group is to know if I've assessed the options for
a model of persistence accurately.  Or if I haven't depicted what I'm
going for clearly.  It seems enormous and if it is this hard, I doubt
I'll produce anything other than a few correct persistent types.



More information about the Python-list mailing list