
Here is an idea to kick around:

Evolve and eventually replace dict.setdefault with a more specialized method that is clearer, cleaner, easier to use, and faster.

d.addlist(k, v) would work like d.setdefault(k, []).append(v)

Raymond Hettinger

>>> d = dict()
>>> for elem in 'apple orange ant otter banana asp onyx boa'.split():
...     k = elem[0]
...     d.addlist(k, elem)
...
>>> d
{'a': ['apple', 'ant', 'asp'], 'b': ['banana', 'boa'], 'o': ['orange', 'otter', 'onyx']}

>>> bookindex = dict()
>>> for pageno, page in enumerate(pages):
...     for word in page:
...         bookindex.addlist(word, pageno)
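For readers who want to try the proposal, here is a minimal sketch of the suggested behaviour as a dict subclass. The class name AddListDict is made up for illustration, and addlist is only the method proposed in this message, not part of the builtin dict.

```python
class AddListDict(dict):
    """Hypothetical subclass; addlist is the proposed method, not a real dict API."""

    def addlist(self, key, value):
        # Same result as self.setdefault(key, []).append(value), but the
        # empty list is only created when the key is actually missing.
        if key not in self:
            self[key] = []
        self[key].append(value)


d = AddListDict()
for elem in 'apple orange ant otter banana asp onyx boa'.split():
    d.addlist(elem[0], elem)
```

The grouping matches the interpreter session in the message: each word ends up in the list keyed by its first letter.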

On Tue, 20 Jan 2004, Raymond Hettinger wrote:
Here is an idea to kick around:
Evolve and eventually replace dict.setdefault with a more specialized method that is clearer, cleaner, easier to use, and faster.
d.addlist(k, v) would work like d.setdefault(k, []).append(v)
-1: I use .setdefault all the time with non-list default arguments. The dict interface is getting complex enough that adding more clutter will make it even harder to teach optimal idiomatic Python to newbies.

-Kevin

--
Kevin Jacobs - Enterprise Systems Architect | What is an Architect? He designs
Consultant to The National Cancer Institute | a house for another to build
Dept. of Cancer Epidemiology and Genetics   | and someone else to inhabit.
V: (301) 954-0726 E: jacobske@mail.nih.gov  | -- William Kahan

On Tue, 2004-01-20 at 09:25, Kevin Jacobs wrote:
-1: I use .setdefault all the time with non-list default arguments.
The dict interface is getting complex enough that adding more clutter will make it even harder to teach optimal idiomatic Python to newbies.
I completely agree with both points. -Barry

Barry Warsaw wrote:
On Tue, 2004-01-20 at 09:25, Kevin Jacobs wrote:
-1: I use .setdefault all the time with non-list default arguments.
The dict interface is getting complex enough that adding more clutter will make it even harder to teach optimal idiomatic Python to newbies.
I completely agree with both points.
As a relative newbie (and not having done much coding in recent months), I have to say it took me a while to figure out what "d.setdefault(k, []).append(v)" actually did (I had previously only encountered dict.getdefault, and the meaning of dict.setdefault was not immediately obvious to me).

If I saw "d.addlist(some_variable, some_other_variable)", I certainly would not automatically interpret it as "append some_other_variable to the value keyed by some_variable, creating that value as the empty list if it is not present". At least the current approach breaks this into two steps, and gave me a chance to figure it out without diving into the docs to find out what the method does.

This does seem to be another example where a 'defaulting dictionary' with a settable factory method would seem to be useful. Then:

    dd = defaultingdict(factory=list)
    ...other code uses dict...
    dd[k].append(v)

(As others have pointed out, this can't be a keyword argument on standard dictionaries without interfering with the current automatic population of the created dictionary with the keyword dictionary)

Regards,
Nick.

--
Nick Coghlan | Brisbane, Australia
Email: ncoghlan@email.com | Mobile: +61 409 573 268
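A rough sketch of such a 'defaulting dictionary' follows. It assumes a current Python 3 interpreter, where dict subclasses may define __missing__, a hook that dict.__getitem__ calls when a key is absent. The name defaultingdict comes from the message above; the constructor signature is an assumption.

```python
class defaultingdict(dict):
    """Sketch of the suggested 'defaulting dictionary' (hypothetical class)."""

    def __init__(self, factory, *args, **kwargs):
        # The factory is a separate first parameter, so the remaining
        # keyword arguments can still populate the dictionary.
        super().__init__(*args, **kwargs)
        self.factory = factory

    def __missing__(self, key):
        # Called by dict.__getitem__ only when the key is not present.
        value = self[key] = self.factory()
        return value


dd = defaultingdict(list, a=['apple'])
dd['a'].append('ant')       # existing key: factory is not called
dd['b'].append('banana')    # missing key: factory creates an empty list
```

One consequence of this design is that a key literally named "factory" can no longer be populated through a keyword argument, which is the kind of interference the parenthetical remark above describes.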

On Jan 20, 2004, at 9:17 AM, Raymond Hettinger wrote:
Here is an idea to kick around:
Evolve and eventually replace dict.setdefault with a more specialized method that is clearer, cleaner, easier to use, and faster.
d.addlist(k, v) would work like d.setdefault(k, []).append(v)
-1

There are other reasons to use setdefault. This one is pretty common though, but I think a more generic solution could be implemented.

Perhaps:

    d.setdefault(k, factory=list).append(v) ?

-bob

On Tue, 2004-01-20 at 09:26, Bob Ippolito wrote:
On Jan 20, 2004, at 9:17 AM, Raymond Hettinger wrote:
Here is an idea to kick around:
Evolve and eventually replace dict.setdefault with a more specialized method that is clearer, cleaner, easier to use, and faster.
d.addlist(k, v) would work like d.setdefault(k, []).append(v)
-1
There are other reasons to use setdefault. This one is pretty common though, but I think a more generic solution could be implemented.
Perhaps:
d.setdefault(k, factory=list).append(v) ?
    d.setdefault(k, []).append(v)

I'm not sure what any of the other suggestions buy you except avoiding a list instantiation, which doesn't seem like enough to warrant the extra complexity.

I use setdefault() quite a bit (big surprise, huh?) and occasionally would like lazy evaluation of the second argument, but it usually doesn't bother me.

-Barry

At 09:26 AM 1/20/04 -0500, Bob Ippolito wrote:
There are other reasons to use setdefault. This one is pretty common though, but I think a more generic solution could be implemented.
Perhaps:
d.setdefault(k, factory=list).append(v) ?
+100. :) An excellent replacement for my recurring use of:

    try:
        return self._somemapping[key]
    except:
        self._somemapping[key] = value = somethingExpensive(key)
        return value

That becomes simply:

    return self._somemapping.setdefault(
        key, factory=lambda: somethingExpensive(key)
    )
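To make the laziness concrete, here is a hedged sketch of a setdefault that accepts a factory keyword. The builtin dict.setdefault takes no such argument; FactoryDict and something_expensive are names invented for this example.

```python
class FactoryDict(dict):
    """Hypothetical subclass; builtin dict.setdefault has no factory argument."""

    def setdefault(self, key, default=None, factory=None):
        if key in self:
            return self[key]
        # The factory is only called when the key is missing.
        value = self[key] = (factory() if factory is not None else default)
        return value


calls = []


def something_expensive(key):
    # Stand-in for an expensive computation; records every invocation.
    calls.append(key)
    return key * 2


cache = FactoryDict()
first = cache.setdefault('ab', factory=lambda: something_expensive('ab'))
second = cache.setdefault('ab', factory=lambda: something_expensive('ab'))
```

The second call finds the key and never invokes the lambda, so the expensive function runs once. With the plain d.setdefault(k, somethingExpensive(k)) spelling the argument would be evaluated on every call.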

At 09:56 AM 1/20/04 -0500, Phillip J. Eby wrote:
    try:
        return self._somemapping[key]
    except:
        self._somemapping[key] = value = somethingExpensive(key)
        return value
Oops, that should've been except KeyError. Which actually points up another reason to want to use a setdefault mechanism, although admittedly not for the builtin dictionary type.

Using exceptions for control flow can mask actual errors occurring within a component being used. Thus, when I create mapping-like objects, I try as much as possible to push "defaulting" into the lower-level mappings, to avoid needing to trap errors and turn them into defaults. Sometimes, however, that default is expensive to create, so I actually use a factory argument, and it's often named 'factory'. Hence, my enthusiasm for the suggestion.

This also means that we add:

    d.addset(k,v)           d.setdefault(k,Set()).add(v)
    d.adddict(k,subk,v)     d.setdefault(k,{})[subk]=v

I think we can't add a new method for each Python datatype.

Raymond Hettinger wrote:
Here is an idea to kick around:
Evolve and eventually replace dict.setdefault with a more specialized method that is clearer, cleaner, easier to use, and faster.
d.addlist(k, v) would work like d.setdefault(k, []).append(v)
Raymond Hettinger
>>> d = dict()
>>> for elem in 'apple orange ant otter banana asp onyx boa'.split():
...     k = elem[0]
...     d.addlist(k, elem)
...
>>> d
{'a': ['apple', 'ant', 'asp'], 'b': ['banana', 'boa'], 'o': ['orange', 'otter', 'onyx']}
I'm sure here you'll be more interested in an addset rather than addlist :).
>>> bookindex = dict()
>>> for pageno, page in enumerate(pages):
...     for word in page:
...         bookindex.addlist(word, pageno)
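The point about needing one method per container type can be seen by writing the three setdefault spellings side by side. This sketch uses the builtin set of current Python rather than the Set class from the sets module mentioned above; the variable names are arbitrary.

```python
lists = {}
lists.setdefault('a', []).append('apple')      # what addlist(k, v) would do

sets = {}
sets.setdefault('a', set()).add('apple')       # what addset(k, v) would do
sets.setdefault('a', set()).add('apple')       # a repeated value is stored once

nested = {}
nested.setdefault('a', {})['x'] = 1            # what adddict(k, subk, v) would do
```

All three already work with the single existing method, which is the argument against adding a specialized method for each type.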

Here is an idea to kick around:
Evolve and eventually replace dict.setdefault with a more specialized method that is clearer, cleaner, easier to use, and faster.
d.addlist(k, v) would work like d.setdefault(k, []).append(v)
Raymond Hettinger
>>> d = dict()
>>> for elem in 'apple orange ant otter banana asp onyx boa'.split():
...     k = elem[0]
...     d.addlist(k, elem)
...
>>> d
{'a': ['apple', 'ant', 'asp'], 'b': ['banana', 'boa'], 'o': ['orange', 'otter', 'onyx']}
bookindex = dict()
(What's wrong with {}?)
for pageno, page in enumerate(pages):
    for word in page:
        bookindex.addlist(word, pageno)
I'm -0 on the idea (where does it stop? when we have reimplemented Perl?), and -1 on the name (it suggests adding a list).

It *is* a somewhat common idiom, but do we really need to turn all idioms into method calls? I think not -- only if the idiom is hard to read in its natural form.

    bookindex = {}
    for pageno, page in enumerate(pages):
        for word in page:
            lst = bookindex.get(word)
            if lst is None:
                bookindex[word] = lst = []
            lst.append(pageno)

works for me. (I think setdefault() was already a minor mistake -- by the time I've realized it applies I have already written working code without it. And when using setdefault() I always worry about the waste of the new empty list passed in each time that's ignored most times.)

If you really want library support for this idiom, I'd propose adding a higher-level abstraction that represents an index:

    bookindex = Index()
    for pageno, page in enumerate(pages):
        for word in page:
            bookindex.add(word, pageno)

What you're really doing here is inverting an index; maybe that idea can be leveraged?

    bookindex = Index()
    bookindex.fill(enumerate(pages), lambda page: iter(page))

This would require something like

    class Index:
        def __init__(self):
            self.map = {}
        def fill(self, items, extract):
            for key, subsequence in items:
                for value in extract(subsequence):
                    self.map.setdefault(value, []).append(key)

Hm, maybe it should take a key function instead:

    bookindex = Index()
    bookindex.fill(enumerate(pages),
                   lambda x: iter(x[1]),
                   lambda x: x[0])

with a definition of

    class Index:
        def __init__(self):
            self.map = {}
        def fill(self, seq, getitems, getkey):
            for item in seq:
                key = getkey(item)
                for value in getitems(item):
                    self.map.setdefault(value, []).append(key)

Hmm... Needs more work... I don't like using extractor functions. But I like addlist() even less.

--Guido van Rossum (home page: http://www.python.org/~guido/)
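As a quick check of the first Index variant, here is a small runnable sketch that combines the add() and fill() methods mentioned above. The sample pages list is invented for illustration, and passing the builtin iter as the extractor is simply a shorter spelling of lambda page: iter(page).

```python
class Index:
    """Sketch of the Index idea; add() and fill() follow the message above."""

    def __init__(self):
        self.map = {}

    def add(self, value, key):
        # Record that `value` occurs under `key` (e.g. a word on a page number).
        self.map.setdefault(value, []).append(key)

    def fill(self, items, extract):
        # `items` yields (key, subsequence) pairs; `extract` yields the values.
        for key, subsequence in items:
            for value in extract(subsequence):
                self.add(value, key)


pages = [['spam', 'eggs'], ['eggs', 'ham'], ['spam']]  # made-up sample data
bookindex = Index()
bookindex.fill(enumerate(pages), iter)
```

After the fill, bookindex.map maps each word to the list of page numbers on which it appears.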

[GvR]
bookindex = dict()
(What's wrong with {}?)
Nothing at all. {} is shorter, faster, and everyone understands it. When teasing out ideas at the interpreter prompt, I tend to use dict() because I find it easier to edit the line and add some initial values using dict(one=1, two=2, s='abc').
works for me. (I think setdefault() was already a minor mistake -- by the time I've realized it applies I have already written working code without it. And when using setdefault() I always worry about the waste of the new empty list passed in each time that's ignored most times.)
If you really want library support for this idiom ...
Not really. It was more of a time machine question -- if we had setdefault() to do over again, what would be done differently:

* Keep setdefault().
* Drop it and make do with get(), try/except, or if k in d.
* Martin's idea for dicts to have an optional factory function for defaults.
* Have a specialized method that just supports dicts of lists.

After all the discussions, the best solution to the defaulting problem appears to be some variant of Martin's general purpose approach:

    d = {}
    d.setfactory(list)
    for k, v in myitems:
        d[k].append(v)      # dict of lists

    d = {}
    d.setfactory(set)
    for v in mydata:
        d[f(v)].add(v)      # partition into equivalence classes

    d = {}
    d.setfactory(int)
    for v in mydata:
        d[v] += 1           # bag

Raymond Hettinger
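The three use cases above can be tried today with collections.defaultdict from the standard library of later Python releases, which takes the factory at construction time rather than through a setfactory() method. The sample data and the parity function standing in for f are assumptions made for this sketch.

```python
from collections import defaultdict

myitems = [('a', 'apple'), ('b', 'banana'), ('a', 'ant')]  # sample data
mydata = [1, 2, 3, 4, 5, 2, 4]                             # sample data

lists = defaultdict(list)
for k, v in myitems:
    lists[k].append(v)        # dict of lists

classes = defaultdict(set)
for v in mydata:
    classes[v % 2].add(v)     # partition by parity (stand-in for f)

bag = defaultdict(int)
for v in mydata:
    bag[v] += 1               # bag: count occurrences of each value
```

Each loop body is a single statement, with no setdefault call and no throwaway default object built on every iteration.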

Raymond Hettinger writes:
After all the discussions, the best solution to the defaulting problem appears to be some variant of Martin's general purpose approach:
    d = {}
    d.setfactory(list)
    for k, v in myitems:
        d[k].append(v)      # dict of lists
Oooh, this could get scary:

    L = []
    L.setfactory(MyClassWithCostlyConstructor)
    L[0]

;-)

-Fred

--
Fred L. Drake, Jr. <fdrake at acm.org>
PythonLabs at Zope Corporation

On Jan 20, 2004, at 1:32 PM, Raymond Hettinger wrote:
[GvR]
bookindex = dict()
(What's wrong with {}?)
Nothing at all. {} is shorter, faster, and everyone understands it.
When teasing out ideas at the interpreter prompt, I tend to use dict() because I find it easier to edit the line and add some initial values using dict(one=1, two=2, s='abc').
works for me. (I think setdefault() was already a minor mistake -- by the time I've realized it applies I have already written working code without it. And when using setdefault() I always worry about the waste of the new empty list passed in each time that's ignored most times.)
If you really want library support for this idiom ...
Not really. It was more of a time machine question -- if we had setdefault() to do over again, what would be done differently:
* Keep setdefault().
* Drop it and make do with get(), try/except, or if k in d.
* Martin's idea for dicts to have an optional factory function for defaults.
* Have a specialized method that just supports dicts of lists.
Here's another idea: how about adding a special method, similar in implementation to __getattr__, but for __getitem__ -- let's say it's called __getdefaultitem__? You could then subclass dict, implement this method, and you'd probably be able to do what you want rather efficiently. You may or may not decide to "cache" the value inside your custom __getdefaultitem__, and if you do, you may or may not want to change the behavior of __contains__ as well.

    # warning: untested code
    class newdict(dict):
        """Implements the proposed protocol"""
        def __getitem__(self, item):
            try:
                return super(newdict, self).__getitem__(item)
            except KeyError:
                return self.__getdefaultitem__(item)

        def __getdefaultitem__(self, item):
            raise KeyError(repr(item))

    class listdict(newdict):
        def __getdefaultitem__(self, item):
            self[item] = rval = []
            return rval

-bob
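The remark about caching can be illustrated with a variant that returns a default without storing it. This is only a sketch: __getdefaultitem__ is the hook proposed in this message and is emulated here in pure Python, and zerodict is a name invented for the example.

```python
class newdict(dict):
    """Emulates the proposed __getdefaultitem__ protocol in pure Python."""

    def __getitem__(self, item):
        try:
            return super().__getitem__(item)
        except KeyError:
            return self.__getdefaultitem__(item)

    def __getdefaultitem__(self, item):
        raise KeyError(repr(item))


class zerodict(newdict):
    def __getdefaultitem__(self, item):
        # Non-caching variant: the default is returned but never stored.
        return 0


counts = zerodict()
value = counts['missing']   # returns the default without inserting the key
counts['seen'] += 1         # reads the default 0, then stores 1 explicitly
```

Because nothing is cached on lookup, 'missing' in counts stays false, so __contains__ needs no change in this variant; a caching subclass such as listdict above makes the key appear as a side effect of reading it.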

On 2004 Jan 20, at 19:32, Raymond Hettinger wrote: ...
    d = {}
    d.setfactory(list)
    for k, v in myitems:
        d[k].append(v)      # dict of lists

    d = {}
    d.setfactory(set)
    for v in mydata:
        d[f(v)].add(v)      # partition into equivalence classes

    d = {}
    d.setfactory(int)
    for v in mydata:
        d[v] += 1           # bag
Yes, except that a .factory property seems preferable to me to a .setfactory setter-method (which would have to come with .getfactory or equivalent if introspection, pickling etc are to work...) except perhaps for the usual "we don't have a built-in curry" issues (so .setfactory might carry arguments after the first [callable factory] one to perform the usual "ad hoc currying" hac^H^H^H idiom).

In fact I'd _love_ this approach, were it not for the fact that in some use cases I'd like the factory to receive the key as its argument. E.g.:

    squares_of_ints = {}
    def swe_factory(k):
        assert isinstance(k, (int, long))
        return k*k
    squares_of_ints.setfactory_receiving_key(swe_factory)

Alex
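A factory that receives the key can be sketched with a dict subclass as well. This assumes a current Python 3 interpreter (where __missing__ is available and int covers what long used to); KeyFactoryDict is a hypothetical name, and the factory is passed to the constructor instead of the setfactory_receiving_key method floated above.

```python
class KeyFactoryDict(dict):
    """Hypothetical dict whose factory is called with the missing key."""

    def __init__(self, key_factory):
        super().__init__()
        self.key_factory = key_factory

    def __missing__(self, key):
        # Compute the value from the key, store it, and return it.
        value = self[key] = self.key_factory(key)
        return value


def swe_factory(k):
    assert isinstance(k, int)   # Python 3 has no separate long type
    return k * k


squares_of_ints = KeyFactoryDict(swe_factory)
nine = squares_of_ints[3]
```

Each square is computed on first access and then served from the dictionary, so the mapping doubles as a small cache.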

Alex Martelli wrote:
On 2004 Jan 20, at 19:32, Raymond Hettinger wrote: ...
    d = {}
    d.setfactory(list)
    for k, v in myitems:
        d[k].append(v)      # dict of lists

    d = {}
    d.setfactory(set)
    for v in mydata:
        d[f(v)].add(v)      # partition into equivalence classes

    d = {}
    d.setfactory(int)
    for v in mydata:
        d[v] += 1           # bag
Yes, except that a .factory property seems preferable to me to a .setfactory setter-method (which would have to come with .getfactory or equivalent if introspection, pickling etc are to work...) except perhaps for the usual "we don't have a built-in curry" issues (so .setfactory might carry arguments after the first [callable factory] one to perform the usual "ad hoc currying" hac^H^H^H idiom). In fact I'd _love_ this approach, were it not for the fact that in some use cases I'd like the factory to receive the key as its argument. E.g.:
    squares_of_ints = {}
    def swe_factory(k):
        assert isinstance(k, (int, long))
        return k*k
    squares_of_ints.setfactory_receiving_key(swe_factory)
Alex
This could be seen as favouring the protocol approach that Bob suggested. In that approach, the 'defaulting' is done by having dict access call __getdefaultitem__ if it exists, and the item being looked up is not found. The signature of __getdefaultitem__ is the same as that for __getitem__. If __getdefaultitem__ isn't found, then the current behaviour of raising KeyError is retained.

I suspect the aim of this approach would be to reduce errors by avoiding rewriting the 'try/except KeyError' block every time we wanted a defaulting dictionary. I imagine it could be made faster than the current "k in d" or "try...except KeyError..." idioms, too.

The more I think about it, the more I'm leaning towards the class-based approach - the version with 'factory' or 'setfactory' seems to lend itself to too many dangerous usages, especially:

    d = {}
    ...do some stuff... [A]
    d.factory = factory_func
    ...do some more stuff... [B]

The meaning of d[k] is significantly different in section A than it was in section B.

Regards,
Nick.

--
Nick Coghlan | Brisbane, Australia
Email: ncoghlan@email.com | Mobile: +61 409 573 268
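The A/B concern can be reproduced with collections.defaultdict from later Python releases, whose default_factory attribute is writable after construction. Treating that attribute as the d.factory from the example is an assumption of this sketch.

```python
from collections import defaultdict

d = defaultdict()            # no factory yet: behaves like a plain dict

try:
    d['k']                   # section A: a missing key raises KeyError
    raised = False
except KeyError:
    raised = True

d.default_factory = list     # plays the role of d.factory = factory_func
value = d['k']               # section B: the same expression now inserts []
```

The expression d['k'] is identical in both sections, but only the second one succeeds, and it also adds the key to the dictionary.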

    d = {}
    ...do some stuff... [A]
    d.factory = factory_func
    ...do some more stuff... [B]
The meaning of d[k] is significantly different in section A than it was in section B.
For most any reasonable program, previous lines of execution determine the context of later lines. Saying that d[k] in section A is different than d[k] in section B is an assumption that all people make when writing useful programs in most languages. We can say the same thing about the meaning of d[1] in sections A and B of the following snippet.

    d = {}
    d[1] = 1  #A
    del d[1]  #B

While the above is not good programming style, it does highlight the fact that the existence of dict.factory (or any equivalent behavior) doesn't remove meaning or expressiveness of the statements using, preceding, or following it.

Certainly dict.factory ends up overlapping with a portion of the functionality of dict.setdefault, but it has the potential for reducing the peppering of dict.setdefault(key, value) in some programs.

As an aside, when I first started using Python, I thought (before using it) that dict.setdefault had the behavior of what dict.factory is suggested as having now.

- Josiah

Josiah Carlson wrote:
The meaning of d[k] is significantly different in section A than it was in section B.
For most any reasonable program, previous lines of execution determine the context of later lines.
True. I guess it was more a matter of dictionaries acquiring a piece of 'magical state' which had fairly profound non-local effects on their behaviour. The dictionary's 'default value' just seems to be something that shouldn't change, whereas the actual data stored in the dictionary changes all the time.

Still, I can't even convince myself that there's actually anything to be concerned about there, so that's the last I'll say about it.

Regards,
Nick.

--
Nick Coghlan | Brisbane, Australia
Email: ncoghlan@email.com | Mobile: +61 409 573 268

participants (12)
- Alex Martelli
- Barry Warsaw
- Bob Ippolito
- Boris Boutillier
- Fred L. Drake, Jr.
- Guido van Rossum
- Josiah Carlson
- Kevin Jacobs
- Nick Coghlan
- Phillip J. Eby
- Raymond Hettinger
- Raymond Hettinger