[Python-Dev] Add a frozendict builtin type

Victor Stinner victor.stinner at gmail.com
Tue Feb 28 00:34:08 CET 2012


>> The blacklist implementation has a major issue: it is still possible
>> to call write methods of the dict class (e.g. dict.set(my_frozendict,
>> key, value)).
>
> It is also possible to use ctypes and violate even more invariants.
> For most purposes, this falls under "consenting adults".

My primary usage of frozendict would be pysandbox, a security module.
Attackers are not consenting adults :-)

Read-only dict would also help optimization, in the CPython peephole
or the PyPy JIT.

In pysandbox, I'm trying to replace __builtins_ and (maybe also
type.__dict__) by a frozendict. These objects rely on PyDict API and
so expect a type "compatible" with dict. But PyDict_GetItem() and
PyDict_SetItem() may use a test like isinstance(obj, (dict,
frozendict)), especially if the C strucure is "compatible". But
pysandbox should not drive the design of frozendict :-)

>> The whitelist implementation has an issue: frozendict and dict are not
>> "compatible", dict is not a subclass of frozendict (and frozendict is
>> not a subclass of dict).
>
> And because of Liskov substitutability, they shouldn't be; they should
> be sibling children of a basedict that doesn't have the the mutating
> methods, but also doesn't *promise* not to mutate.

As I wrote, I realized that it doesn't matter if dict doesn't inherit
from frozendict.

>>  * frozendict values must be immutable, as dict keys
>
> Why?  That may be useful, but an immutable dict whose values
> might mutate is also useful; by forcing that choice, it starts
> to feel too specialized for a builtin.

If values are mutables, the frozendict cannot be called "immutable".
tuple and frozenset can only contain immutables values.

All implementations of frozendict that I found expect frozendict to be hashable.

>>  * frozendict.__hash__ computes hash(frozenset(self.items())) and
>> caches the result is its private hash attribute
>
> Why?  hash(frozenset(selk.keys())) would still meet the hash contract,
> but it would be approximately twice as fast, and I can think of only
> one case where it wouldn't work just as well.

Yes, it would faster but the hash is usually the hash of the whole
object content. E.g. the hash of a tuple is not the hash of items with
odd index, whereas such hash function would also meet the "hash
contract".

All implementations of frozendict that I found all use items, and not
only values or only keys.

Victor


More information about the Python-Dev mailing list