[Python-ideas] dict.setdefault_call(), or API variations thereupon

Anders Hovmöller boxed at killingar.net
Fri Nov 2 14:27:04 EDT 2018


Just a little improvement: you don't need the local variable l; you can call append directly on the result:

d.setdefault(foo, []).append(bar)

And correspondingly, for the defaultdict version:
d[foo].append(bar)
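
For anyone who wants to try the two one-liners side by side, here is a minimal sketch. The sample data in `pairs` is made up for illustration; `foo` and `bar` stand in for whatever key and value the loop produces.

```python
from collections import defaultdict

# Hypothetical sample data: (key, value) pairs to group by key.
pairs = [("a", 1), ("b", 2), ("a", 3)]

# Plain dict: the [] argument is built on every call,
# even when the key is already present.
d = {}
for foo, bar in pairs:
    d.setdefault(foo, []).append(bar)

# defaultdict: list() is only called when the key is missing.
dd = defaultdict(list)
for foo, bar in pairs:
    dd[foo].append(bar)
```

Both loops should end up with the same grouping; only the cost of building the unused default differs.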

> On 2 Nov 2018, at 17:52, Chris Barker via Python-ideas <python-ideas at python.org> wrote:
> 
>> On Thu, Nov 1, 2018 at 8:34 PM, Steven D'Aprano <steve at pearwood.info> wrote:
>> The bottom line is, if I understand your proposal, the functionality 
>> already exists. All you need do is subclass dict and give it a 
>> __missing__ method which does what you want.
> 
> or subclass dict and give it a "setdefault_call" method :-)
> 
> But as I think Guido was pointing out, the real difference here is that defaultdict, or any other subclass, specifies the default callable for the entire dict, rather than at time of use. Personally, I'm pretty sure I've only ever used one default for any given dict, but I can imagine there are use cases for having different defaults for the same dict depending on context.
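
To make the "subclass dict" suggestion concrete, here is a rough sketch of what such a subclass could look like. The class name `CallDict` is invented for this example, and `setdefault_call` is the proposed method from this thread, not something that exists on the built-in dict.

```python
class CallDict(dict):
    """Hypothetical dict subclass with the proposed setdefault_call()."""

    def setdefault_call(self, key, factory):
        # Return the existing value if the key is present; otherwise call
        # factory() once, store the result under key, and return it.
        try:
            return self[key]
        except KeyError:
            value = self[key] = factory()
            return value


d = CallDict()
d.setdefault_call("names", list).append("x")  # list() called, key was missing
d.setdefault_call("seen", set).add(1)         # a different default, same dict
existing = d.setdefault_call("names", set)    # key exists, set() is not called
```

Unlike defaultdict, the factory is chosen at each call site, which is the per-use flexibility described above.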
> 
> As for the OP's justification:
> 
> """
> If it's not clear, the purpose is to eliminate the overhead of creating an empty list or similar in situations like this:
> 
> d = {}
> for i in range(1000000):  # some large loop
>     l = d.setdefault(somekey, [])
>     l.append(somevalue)
> 
> # instead...
> 
> for i in range(1000000):
>     l = d.setdefault_call(somekey, list)
>     l.append(somevalue)
> 
> """
> 
> I presume the point is that in the first case, somekey might often be the same, and setdefault requires creating an actual empty list even if the key is already there, whereas the second case will only create the empty list if the key is not there. Doing some timing with defaultdict:
> 
> In [19]: def setdefault():
>     ...:     d = {}
>     ...:     somekey = 5
>     ...:     for i in range(1000000):  # some large loop
>     ...:         l = d.setdefault(somekey, [])
>     ...:         l.append(i)
>     ...:     return d
> 
> In [20]: def default_dict():
>     ...:     d = defaultdict(list)
>     ...:     somekey = 5
>     ...:     for i in range(1000000):  # some large loop
>     ...:         l = d[somekey]
>     ...:         l.append(i)
>     ...:     return d
> 
> In [21]: % timeit setdefault()
> 185 ms ± 1.23 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
> 
> In [22]: % timeit default_dict()
> 128 ms ± 1.65 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
> 
> So yeah, it's a little more performant, and I suppose if you were using a more expensive constructor, it would make a lot more difference. But then, how much is it likely to matter in real use cases? This was 1 million calls for one key, and the run time dropped from 185 ms to 128 ms (setdefault took about 45% longer). Is that common?
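
The timing above can be reproduced outside IPython with the standard timeit module. This is only a sketch: the function names and the smaller loop size `N` are my own choices, and the numbers will vary by machine and Python version.

```python
import timeit
from collections import defaultdict

N = 100_000  # smaller than the 1,000,000 in the post, to keep the run short


def with_setdefault(n=N):
    # Builds a throwaway [] on every iteration, even though key 5 exists.
    d = {}
    for i in range(n):
        d.setdefault(5, []).append(i)
    return d


def with_defaultdict(n=N):
    # list() is only called on the first iteration, when key 5 is missing.
    d = defaultdict(list)
    for i in range(n):
        d[5].append(i)
    return d


def time_both(repeats=3):
    # Best-of-`repeats` wall time in seconds for one full loop of each variant.
    return {
        "setdefault": min(timeit.repeat(with_setdefault, number=1, repeat=repeats)),
        "defaultdict": min(timeit.repeat(with_defaultdict, number=1, repeat=repeats)),
    }


if __name__ == "__main__":
    for name, seconds in time_both().items():
        print(f"{name}: {seconds * 1000:.1f} ms")
```

Swapping `[]` and `list` for a more expensive constructor is one way to check how the gap grows with the cost of the default.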
> 
> So it seems this would give us slightly better performance than .setdefault() for the use cases where you are using more than one default for a given dict.
> 
> BTW:
> 
> +1 for a mention of defaultdict in the dict.setdefault docs -- you can't do everything with defaultdict that you can with setdefault, but it is a very common use case.
> 
> -CHB
> 
> -- 
> 
> Christopher Barker, Ph.D.
> Oceanographer
> 
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
> 
> Chris.Barker at noaa.gov