persistent composites
kindly
kindly at gmail.com
Sun Jun 14 10:54:12 EDT 2009
On Jun 14, 3:27 pm, Aaron Brady <castiro... at gmail.com> wrote:
> Hi, please forgive the multi-posting on this general topic.
>
> Some time ago, I recommended a pursuit of keeping 'persistent
> composite' types on disk, to be read and updated at other times by
> other processes. Databases provide this functionality, with the
> exception that field types in any given table are required to be
> uniform. Python has no such restriction.
>
> I tried out an implementation of composite collections, specifically
> lists, sets, and dicts, using 'sqlite3' as a persistence back-end.
> It's significantly slower, but we might argue that attempting to do it
> by hand classifies as a premature optimization; it is easy to optimize
> debugged code.
>
> The essentials of the implementation are:
> - each 'object' gets its own table.
> = this includes immutable types
> - a reference count table
> = when an object's ref. count reaches zero, its table is dropped
> - a type-map table
> = maps object's table ID to a string of its type
> - a single 'entry point' table, with the table ID of the entry-point
> object
> = the entry point is the only data structure available to new
> connections. (I imagine it will usually be a list or dict.)
>
> I will be sure to kill any interest you might have by now, by
> "revealing" a snippet of code.
>
> The object creation procedure:
>
> def new_table( self, type ):
> ''' 'type' is a string, the name of the class the object is an
> instance of '''
> cur= self.conn.cursor( )
> recs= cur.execute( '''SELECT max( tableid ) FROM refcounts''' )
> rec= cur.fetchone( )
> if rec[ 0 ] is None:
> obid= 0
> else:
> obid= rec[ 0 ]+ 1
> cur.execute( '''INSERT INTO types VALUES( ?, ? )''', ( obid,
> type ) )
> cur.execute( '''INSERT INTO refcounts VALUES( ?, ? )''', ( obid,
> 1 ) )
>
> The increment ref. count procedure:
>
> def incref( self, obid ):
> cur= self.conn.cursor( )
> recs= cur.execute( '''SELECT count FROM refcounts WHERE tableid
> = ?''', ( obid, ) )
> rec= recs.fetchone( )
> newct= rec[ 0 ]+ 1
> cur.execute( '''UPDATE refcounts SET count = ? WHERE tableid = ?''',
> ( newct, obid ) )
>
> The top-level structure contains these two procedures, as well as
> 'setentry', 'getentry', and 'revive' procedures.
>
> Most notably, user-defined types are possible. The dict is merely a
> persistent dict. 'revive' checks the global namespace by name for the
> original type, subject to the same restrictions that we all know and
> love that 'pickle' has.
>
> As usual, deadlocks and cyclic garbage pose the usual problems. The
> approach I used last time was to maintain a graph of acquired locks,
> and merely check for cycles to avert deadlocks, which would go in a
> separate table. For garbage, I can't find a better solution than
> Python already uses.
>
> From the 3.0 docs:
> gc.garbage
>
> A list of objects which the collector found to be unreachable but
> could not be freed (uncollectable objects).
> ...
> Python doesn’t collect such [garbage] cycles automatically because, in
> general, it isn’t possible for Python to guess a safe order in which
> to run the __del__() methods. If you know a safe order, you can force
> the issue by examining the garbage list, and explicitly breaking
> cycles due to your objects within the list.
>
> Before I go and flesh out the entire interfaces for the provided
> types, does anyone have a use for it?
I like it as a concept, have not got any use for it this minute, but I
am sure it will be useful someday.
More information about the Python-list
mailing list