dict.reserve and other tricks

Klaas mike.klaas at gmail.com
Fri Nov 17 00:55:23 EST 2006


bearophileHUGS at lycos.com wrote:
> I have started doing practice creating C extensions for CPython, so
> here are two ideas I have had, possibly useless.
>
> If you keep adding elements to a CPython dict/set, it periodically
> rebuilds itself. So maybe dict.reserve(n) and a set.reserve(n) methods
> may help, reserving enough (empty) memory for about n *distinct* keys
> the programmer wants to add to the dict/set in a short future. I have
> seen that the the C API of the dicts doesn't allow this, and I don't
> know if this can be implemented modifying the dicts a bit. Do you think
> this may be useful?

It has been proposed before and rejected.  How often is dict creation a
bottleneck in python apps?  I'd guess not often.  Do you know of any
examples

Optimize space use also isn't terribly compelling, as a "tight" dict
can be created from a "loose" dict d using dict(d).

<>
> Most of the times such functions are good enough, but sometimes the
> dicts are big, so to reduce memory used I remove keys in place:
>
> def filterdict(pred, indict):
>   todel = [k for k,v in indict.iteritems() if not pred(k,v)]
>   for key in todel:
>     del indict[key]

<>
> But doing the same thing while iterating on the dict may be faster and
> use even less memory.

Well, you can reduce the memory usage to virtually nothing by using a
generator expression rather than list comprehension.

> This iteration&deletion capability is probably not safe enough to be
> used inside Python programs, but a compiled C module with a function
> that works like that filterdict (and deletes while iterating) may be
> created, and its use is safe from Python programs.

<>

> >The dictionary p should not be mutated during iteration. It is safe (since Python 2.1) to modify the values of the keys as you iterate over the dictionary, but only so long as the set of keys does not change.<
>
> Do you think it may be useful to create to create such C filterdict
> function that deletes while iterating? (To create such function it
> probably has to bypass the CPython dict API).

Such a beast would be fiendish to write, I think.  Remember, arbitrary
python code can be executed by __hash__ and deleting (DECREF) python
objects.

-Mike




More information about the Python-list mailing list