[Python-Dev] defaultdict proposal round three
Ian Bicking
ianb at colorstudy.com
Mon Feb 20 22:13:23 CET 2006
Alex Martelli wrote:
>>I prefer this approach over subclassing. The mental load from an
>>additional method is less than the load from a separate type (even a
>>subclass). Also, avoidance of invariant issues is a big plus. Besides,
>>if this allows setdefault() to be deprecated, it becomes an all-around
>>win.
>
> I'd love to remove setdefault in 3.0 -- but I don't think it can be
> done before that: default_factory won't cover the occasional use
> cases where setdefault is called with different defaults at different
> locations, and, rare as those cases may be, any 2.* should not break
> any existing code that uses that approach.
Would it be deprecated in 2.*, or would deprecation only start in 3.0?
Also, is default_factory=list threadsafe in the same way .setdefault is?
That is, you can safely do this from multiple threads:
    d.setdefault(key, []).append(value)
I believe this is safe with very few caveats -- setdefault itself is
atomic (or else I'm writing some bad code ;). My impression is that
default_factory will not generally be threadsafe in the way setdefault
is. For instance:
    def make_list():
        return []
    d = dict()
    d.default_factory = make_list
    # from multiple threads:
    d.getdef(key).append(value)
This would not be correct (a value can be lost if two threads
concurrently enter make_list for the same key). In the case of
default_factory=list (using the list builtin) is the story different?
Will this work on Jython, IronPython, or PyPy? Will this be a
documented guarantee? Or alternately, are we just creating a new way to
punish people who use threads? And if we push threadsafety up to user
code, are we trading a very small speed issue (creating lists that are
thrown away) for a much larger speed issue (acquiring a lock)?
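The lost-value scenario can be reproduced deterministically without real threads. What follows is only a sketch under stated assumptions: `FactoryDict` and its pure-Python `getdef` are hypothetical stand-ins for the proposed API, not an actual implementation, and the thread switch is simulated by having the factory run the other thread's whole operation on its first call.

```python
class FactoryDict(dict):
    """Hypothetical stand-in for a dict with a default_factory attribute."""

    def __init__(self, default_factory):
        super().__init__()
        self.default_factory = default_factory

    def getdef(self, key):
        # Assumed non-atomic check-then-store; a real implementation may differ.
        if key not in self:
            value = self.default_factory()  # another thread may run here
            self[key] = value               # may overwrite that thread's list
        return self[key]


def simulate_lost_value():
    calls = []

    def make_list():
        calls.append(None)
        if len(calls) == 1:
            # Simulated thread switch: "thread B" completes its whole
            # operation while "thread A" is still inside the factory.
            d.getdef("key").append("from B")
        return []

    d = FactoryDict(make_list)
    d.getdef("key").append("from A")  # "thread A"
    return d["key"]
```

Under these assumptions only the value appended by "thread A" survives, because A's freshly created list replaces the one that B had already stored and appended to.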
I tried to write a test for this threadsafety, actually -- using a
technique other than setdefault that I knew was bad (try: ... except
KeyError:). And (except by using time.sleep(), which is cheating), I
wasn't actually able to trigger the bug. Which is frustrating, because
I know the bug is there. So apparently threadsafety is hard to test in
this case. (If anyone is interested in trying it, I can email what I have.)
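A stress test along these lines might look like the sketch below. It is not the test mentioned above; the helper names (`append_racy`, `append_setdefault`, `stress`) and the thread and key counts are assumptions chosen for illustration. Consistent with the difficulty described, the racy variant may well report no lost values on a given run.

```python
import threading


def append_racy(d, key, value):
    # Pattern known to be unsafe: the lookup and the store are separate steps.
    try:
        items = d[key]
    except KeyError:
        items = d[key] = []
    items.append(value)


def append_setdefault(d, key, value):
    d.setdefault(key, []).append(value)


def stress(append, n_threads=4, n_keys=2000):
    """Run `append` from several threads; return how many values survived."""
    d = {}
    barrier = threading.Barrier(n_threads)

    def worker(thread_id):
        barrier.wait()  # release all threads together to raise contention
        for key in range(n_keys):
            append(d, key, thread_id)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(len(items) for items in d.values())


if __name__ == "__main__":
    expected = 4 * 2000
    print("racy      :", stress(append_racy), "of", expected)
    print("setdefault:", stress(append_setdefault), "of", expected)
```

A count below the expected total for the racy variant would indicate a lost value; a full count proves nothing, since the window for the race is very small.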
Note that multidict -- among other possible concrete collection patterns
(like Bag, OrderedDict, or others) -- can be readily implemented with
threading guarantees.
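As a rough illustration of that last point, here is a minimal sketch of a multidict whose threading guarantee comes from an explicit lock. The class name `LockedMultiDict` and its `add`/`getall` methods are hypothetical, not an existing API, and the lock is exactly the cost asked about earlier.

```python
import threading


class LockedMultiDict:
    """Hypothetical multidict: each key maps to a list of values."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def add(self, key, value):
        # The lock makes "create the list if missing, then append" one step
        # as far as other threads using this object are concerned.
        with self._lock:
            if key not in self._data:
                self._data[key] = []
            self._data[key].append(value)

    def getall(self, key):
        # Return a copy so callers never share the internal list.
        with self._lock:
            return list(self._data.get(key, ()))
```

Because the guarantee lives in the collection rather than in each call site, it can be documented once and does not depend on which operations a particular interpreter happens to make atomic.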
--
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org