[MATRIX-SIG] Changing shelve...

David J. C. Beach beach@verinet.com
Fri, 03 Oct 1997 18:08:20 -0600


Developers,

The Numeric module includes its own special versions of Pickler and
Unpickler which inherit from those defined in the pickle module, but add
the functionality of Numeric.array pickling.  This is all well and good,
but it's still pretty troublesome if you're working with the shelve
module.  shelve.Shelf likes to directly call pickle.Pickler and
pickle.Unpickler, not those versions found in Numeric.

I'd like to propose a change to shelve which allows the user to say
something like:

import Numeric
import shelve
y = shelve.open("my_shelf_file", pickler=Numeric.Pickler,
                unpickler=Numeric.Unpickler)
y['some_key'] = Numeric.array([[1,2,3],[4,5,6]])

...and so on.  In fact, I've created a modified version of shelve for
exactly this purpose.  The keywords pickler= and unpickler= must be set
to classes which are compatible with those defined in pickle.  Also, the
arguments must be passed by keyword; they are not positional parameters.
I hoped that this would avoid any confusion when calling shelve.open()...
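
For readers on a current Python, the idea can be sketched roughly as
follows.  This is only a minimal illustration under stated assumptions,
not the attached module: the names PluggableShelf and CountingPickler are
hypothetical, the shelf wraps a plain dict rather than a dbm file, and it
uses io.BytesIO and Python 3 syntax.  The bare * in the signature plays
the same role as **kw in my version: pickler= and unpickler= have to be
spelled out.

```python
import io
import pickle


class PluggableShelf:
    """Dict-backed shelf whose Pickler/Unpickler classes can be swapped.

    Hypothetical name; a sketch of the proposal, not the attached module.
    """

    def __init__(self, backing, *, pickler=pickle.Pickler,
                 unpickler=pickle.Unpickler):
        # The bare * forces callers to write pickler=... and unpickler=...
        self.backing = backing
        self.pickler = pickler
        self.unpickler = unpickler

    def __setitem__(self, key, value):
        # Pickle into an in-memory buffer, then store the raw bytes.
        buf = io.BytesIO()
        self.pickler(buf).dump(value)
        self.backing[key] = buf.getvalue()

    def __getitem__(self, key):
        # Unpickle with whichever Unpickler class was supplied.
        return self.unpickler(io.BytesIO(self.backing[key])).load()


class CountingPickler(pickle.Pickler):
    """Stand-in for a specialised Pickler such as the one in Numeric."""
    dumps = 0

    def dump(self, obj):
        CountingPickler.dumps += 1
        super().dump(obj)


shelf = PluggableShelf({}, pickler=CountingPickler)
shelf["some_key"] = [[1, 2, 3], [4, 5, 6]]
restored = shelf["some_key"]
```

Passing the classes positionally, as in PluggableShelf({}, CountingPickler),
is rejected with a TypeError, which is the "no accidents" behaviour I was
after.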

Anyhow, I'm including my modified version of shelve.py at the end of
this message (I hope that doesn't go against the nature of this list!).
Is there any chance of something like this being folded into the 1.5
release?

-----------myshelve.py-------------------
"""Manage shelves of pickled objects.

A "shelf" is a persistent, dictionary-like object.  The difference
with dbm databases is that the values (not the keys!) in a shelf can
be essentially arbitrary Python objects -- anything that the "pickle"
module can handle.  This includes most class instances, recursive data
types, and objects containing lots of shared sub-objects.  The keys
are ordinary strings.

To summarize the interface (key is a string, data is an arbitrary
object):

        import shelve
        d = shelve.open(filename) # open, with (g)dbm filename -- no suffix

        d[key] = data   # store data at key (overwrites old data if
                        # using an existing key)
        data = d[key]   # retrieve data at key (raise KeyError if no
                        # such key)
        del d[key]      # delete data stored at key (raises KeyError
                        # if no such key)
        flag = d.has_key(key)   # true if the key exists
        list = d.keys() # a list of all existing keys (slow!)

        d.close()       # close it

Dependent on the implementation, closing a persistent dictionary may
or may not be necessary to flush changes to disk.

--------------------------------
Modified by Dave Beach <beach@verinet.com> to add a Shelf.values()
method and to allow the use of an arbitrary Pickler and Unpickler, not
necessarily those included in the pickle module.
The added parameters to Shelf.__init__(), BsdDbShelf.__init__(),
DbfilenameShelf.__init__(), and open() are intentionally passed through
**kw so that they can't be "accidentally" used by an unsuspecting
programmer (i.e., the keywords "pickler=" and "unpickler=" must be
spelled out explicitly).
"""

import pickle
import StringIO

# Modified lines of code are marked with a '#' in the right-hand margin.

class Shelf:
        """Base class for shelf implementations.

        This is initialized with a dictionary-like object.
        See the module's __doc__ string for an overview of the interface.
        """

        pickler = [pickle.Pickler]
        unpickler = [pickle.Unpickler]

        def __init__(self, dict, **kw):                               #
                self.dict = dict
                for key in kw.keys():                                 #
                    if key == 'pickler':                              #
                        # None means "keep the class default"
                        if kw[key] is not None:                       #
                            self.pickler = [kw[key]]                  #
                    elif key == 'unpickler':                          #
                        if kw[key] is not None:                       #
                            self.unpickler = [kw[key]]                #
                    else: raise AttributeError, key                   #

        def keys(self):
                return self.dict.keys()

        # this implementation can be a memory pig for a very large Shelf
        # a design similar to xrange would probably work better
        def values(self):                                             #
                values = []                                           #
                for key in self.keys():                               #
                        values.append(self[key])                      #
                return values                                         #

        def __len__(self):
                return len(self.dict)

        def has_key(self, key):
                return self.dict.has_key(key)

        def __getitem__(self, key):
                f = StringIO.StringIO(self.dict[key])
                return self.unpickler[0](f).load()                    #

        def __setitem__(self, key, value):
                f = StringIO.StringIO()
                p = self.pickler[0](f)                                #
                p.dump(value)
                self.dict[key] = f.getvalue()

        def __delitem__(self, key):
                del self.dict[key]

        def close(self):
                if hasattr(self.dict, 'close'):
                        self.dict.close()
                self.dict = None

        def __del__(self):
                self.close()


class BsdDbShelf(Shelf):
        """Shelf implementation using the "BSD" db interface.

        The actual database is opened using one of the "bsddb" module's
        "open" routines (i.e. bsddb.hashopen, bsddb.btopen or
        bsddb.rnopen).

        This class is initialized with the database object
        returned from one of the bsddb open functions.

        See the module's __doc__ string for an overview of the interface.
        """

        def __init__(self, dict, **kw):                               #
                # Forward pickler=/unpickler= unchanged; Shelf.__init__
                # rejects any other keyword.
                apply(Shelf.__init__, (self, dict), kw)               #

        # The cursor methods below use the shelf's own unpickler so that
        # a custom Unpickler applies to them as well.
        def set_location(self, key):
                (key, value) = self.dict.set_location(key)
                f = StringIO.StringIO(value)
                return (key, self.unpickler[0](f).load())             #

        def next(self):
                (key, value) = self.dict.next()
                f = StringIO.StringIO(value)
                return (key, self.unpickler[0](f).load())             #

        def previous(self):
                (key, value) = self.dict.previous()
                f = StringIO.StringIO(value)
                return (key, self.unpickler[0](f).load())             #

        def first(self):
                (key, value) = self.dict.first()
                f = StringIO.StringIO(value)
                return (key, self.unpickler[0](f).load())             #

        def last(self):
                (key, value) = self.dict.last()
                f = StringIO.StringIO(value)
                return (key, self.unpickler[0](f).load())             #


class DbfilenameShelf(Shelf):
        """Shelf implementation using the "anydbm" generic dbm interface.

        This is initialized with the filename for the dbm database.
        See the module's __doc__ string for an overview of the interface.
        """

        def __init__(self, filename, flag='c', **kw):                 #
                import anydbm
                # Forward pickler=/unpickler= unchanged to Shelf.__init__
                apply(Shelf.__init__,                                 #
                      (self, anydbm.open(filename, flag)), kw)        #


def open(filename, flag='c', **kw):                                   #
        """Open a persistent dictionary for reading and writing.

        Argument is the filename for the dbm database.
        See the module's __doc__ string for an overview of the interface.
        """
        # Forward pickler=/unpickler= unchanged to DbfilenameShelf
        return apply(DbfilenameShelf, (filename, flag), kw)           #
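
A note on the comment above Shelf.values(): building the whole list can
be costly for a very large shelf.  One lazy alternative, sketched here in
current Python with a generator (a feature that does not exist in 1.5),
unpickles one value at a time.  The function name iter_values is
hypothetical, and the backing store is assumed to be a plain dict of
pickled bytes rather than a dbm file.

```python
import io
import pickle


def iter_values(backing, unpickler=pickle.Unpickler):
    """Yield unpickled values one at a time instead of building a list."""
    for key in backing:
        yield unpickler(io.BytesIO(backing[key])).load()


# A stand-in for the shelf's underlying database.
store = {"a": pickle.dumps(1), "b": pickle.dumps([2, 3])}

values = iter_values(store)
first = next(values)   # only the first value has been unpickled so far
rest = list(values)    # the remaining values are unpickled on demand
```

Memory use then depends on the largest single value rather than on the
size of the whole shelf.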


