[Python-3000] Immutable bytes type and dbm modules

Tue Aug 7 06:53:32 CEST 2007

> I guess we have to rethink our use of these databases somewhat.

OK. In the interest of progress, I'll look at coming up with
some fixes for the code base right now; since we agree that the
underlying semantics are bytes:bytes, any encoding wrappers on
top of that can be added later.
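
To make "encoding wrappers on top" concrete, here is a minimal sketch of
what such a wrapper might look like. The class name EncodedMapping, the
UTF-8 default, and the use of a plain dict as a stand-in for an opened
dbm object are all assumptions for illustration, not an existing API.

```python
from collections.abc import MutableMapping


class EncodedMapping(MutableMapping):
    """Hypothetical str:str view over a bytes:bytes mapping."""

    def __init__(self, raw, encoding="utf-8"):
        self._raw = raw              # any mapping with bytes keys and values
        self._encoding = encoding    # assumed default; pick what the data uses

    def __getitem__(self, key):
        # Encode the key on the way in, decode the value on the way out.
        return self._raw[key.encode(self._encoding)].decode(self._encoding)

    def __setitem__(self, key, value):
        self._raw[key.encode(self._encoding)] = value.encode(self._encoding)

    def __delitem__(self, key):
        del self._raw[key.encode(self._encoding)]

    def __iter__(self):
        # Keys are stored as bytes; present them as strings.
        return (k.decode(self._encoding) for k in self._raw.keys())

    def __len__(self):
        return len(self._raw)


raw = {}                     # stands in for an opened bytes:bytes dbm object
db = EncodedMapping(raw)
db["spam"] = "eggs"          # the underlying store only ever sees bytes
```

The point of layering it this way is that the underlying store stays
bytes:bytes, so the wrapper can be added, swapped, or left out without
touching the database modules themselves.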

> Perhaps the StringKeys and/or StringValues wrappers can be
> generalized? Or perhaps we could borrow from io.open(), and use a
> combination of the mode and the encoding to determine how to stack
> wrappers.

I thought about this, and couldn't think of a good place to put
them. Also, the bsddb versions provide additional methods
(such as .first() and .last()) which don't belong to the dict
API.

Furthermore, for dumbdbm, it would indeed be better if the dumbdbm
object knew that keys are meant to be strings. It could support
that natively, although not in a way that is binary
backwards-compatible with 2.x. Doing so would be more efficient in
the implementation, as you'd avoid recoding.

> Another approach might be to generalize shelve. It already supports
> pickling values. There could be a few variants for dealing with keys
> that are either strings or arbitrary immutables; the keys used for the
> underlying *dbm file would then be either an encoding (if the keys are
> limited to strings) or a pickle (if they aren't). (The latter would
> require some kind of canonical pickling version, so may not be
> practical; there also may not be enough of a use case to bother.)

My concern is that people need to access existing databases. It's
all fine that the code accessing them breaks, and that they have
to actively port to Py3k. However, telling them that they have to
represent the keys in their dbm disk files in a different manner
might cause a revolt...

Regards,
Martin