Re: [Python-ideas] Have dict().update() return its own reference.
Thanks, that's fair, for consistency. One use case for my question was a Stack Overflow question about merging two dicts. If update() returned its own reference, and if we explicitly wanted a copy (instead of an in-place modification), we could have used dict(x).update(y), given that x and y are both dict instances.
Cheers, Xav
On 20 April 2012 22:35, Laurens Van Houtven <_@lvh.cc> wrote:
As a general rule, methods/functions in Python either *mutate* or *return*. (Obviously, mutating methods also return a value; they just return None.)
For example:
  random.shuffle shuffles in place so doesn't return anything
  list.sort sorts in place so doesn't return anything
  sorted creates a new sorted thing, so returns that sorted thing
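A minimal sketch of the mutate-or-return convention described above, using only standard-library calls. The variable names are illustrative and not taken from the thread.

```python
import random

numbers = [3, 1, 2]

# Mutating calls change their argument in place and return None.
sort_result = numbers.sort()
shuffle_result = random.shuffle(numbers)

# sorted() leaves its input untouched and returns a new list.
original = [3, 1, 2]
ordered = sorted(original)

# dict.update follows the same mutating convention.
settings = {"a": 1}
update_result = settings.update({"b": 2})
```

Here `sort_result`, `shuffle_result` and `update_result` are all None, while `ordered` is a new list and `original` is unchanged.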
cheers lvh
On 20 Apr 2012, at 14:32, Xavier Ho wrote:
Hello,
What's the rationale behind the fact that `dict().update()` returns nothing? If it returned the dictionary reference, at least we could chain methods, or assign it to another variable, or pass it into a function, etc.
What's the design decision made behind this?
Cheers, Xav
_______________________________________________ Python-ideas mailing list Python-ideas@python.org http://mail.python.org/mailman/listinfo/python-ideas
On 2012-04-20, at 14:37 , Xavier Ho wrote:
Thanks, that's fair, for consistency.
One use case for my question was a Stack Overflow question about merging two dicts. If update() returned its own reference, and if we explicitly wanted a copy (instead of an in-place modification), we could have used
dict(x).update(y)
given x and y are both dict() instances.
If you start from dict instances, you could always use: merged = dict(x, **y)
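To make the difference concrete, here is a small sketch comparing the chained form from the question with the two alternatives. The sample dicts are made up for illustration, and all keys of `y` are strings.

```python
x = {"a": 1, "b": 2}
y = {"b": 20, "c": 30}

# update() mutates the temporary copy and returns None,
# so the chained expression throws the merged dict away.
chained = dict(x).update(y)

# Copy first, then update the copy: two statements, x stays untouched.
copied = dict(x)
copied.update(y)

# The single-expression form suggested above.
merged = dict(x, **y)
```

Both `copied` and `merged` end up with the same contents, while `chained` is None.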
On 2012-04-20, at 14:48 , Xavier Ho wrote:
On 20 April 2012 22:47, Masklinn <masklinn@masklinn.net> wrote:
If you start from dict instances, you could always use:
merged = dict(x, **y)
I heard that Guido wasn't a fan of this.
Works to merge two dicts in a single expression, if you don't want to define a wrapper function and find a name for it.
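For comparison, the wrapper function mentioned above might look like the sketch below. The name `merged_dicts` is hypothetical (nothing in the thread or the standard library defines it); the point is only that copy-then-update accepts any hashable key.

```python
def merged_dicts(*mappings):
    """Return a new dict; later mappings override earlier ones."""
    result = {}
    for mapping in mappings:
        result.update(mapping)
    return result


x = {1: "one", "b": 2}
y = {"b": 20, (3, 4): "tuple key"}

# Non-string keys are fine here because no keyword unpacking is involved.
merged = merged_dicts(x, y)
```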
Masklinn schrieb am Fri, 20. Apr 2012, um 14:47:34 +0200:
If you start from dict instances, you could always use:
merged = dict(x, **y)
No, not always. Only if all keys of `y` are strings (and probably they should also be valid Python identifiers.) Cheers, Sven
Sven Marnach, 20.04.2012 15:37:
Masklinn schrieb am Fri, 20. Apr 2012, um 14:47:34 +0200:
If you start from dict instances, you could always use:
merged = dict(x, **y)
No, not always. Only if all keys of `y` are strings (and probably they should also be valid Python identifiers.)
Also, it's not immediately clear from the expression what happens for duplicate keys, and the intended behaviour for that case may be different from what the above does. Stefan
On 2012-04-20, at 16:28 , Stefan Behnel wrote:
Sven Marnach, 20.04.2012 15:37:
Masklinn schrieb am Fri, 20. Apr 2012, um 14:47:34 +0200:
If you start from dict instances, you could always use:
merged = dict(x, **y)
No, not always. Only if all keys of `y` are strings (and probably they should also be valid Python identifiers.)
Also, it's not immediately clear from the expression what happens for duplicate keys
Not sure why: as with `dict.update`, `dict` is defined as setting from the first argument, then setting from the keyword arguments (overriding keys originally set, if any). Now of course that might not be obvious to people who don't know how dict works, but I fail to see why another function which they don't know either would be any more "immediately clear". You may counter that a function taking (and merging) a sequence of mappings would "obviously" apply a left fold in merging the mappings, but in that case the dict constructor would "obviously" copy the positional argument and then apply the keywords (which come after the positional). Which is exactly what happens.
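A short sketch of the duplicate-key behaviour being described, with made-up sample data: the positional mapping is copied first, then the keyword arguments are applied on top, which is the same result as copy-then-update.

```python
x = {"a": 1, "b": 2}
y = {"b": 20, "c": 30}

# Positional argument first, keywords second: y's value for "b" wins.
merged = dict(x, **y)

# Same outcome as copying x and then updating the copy with y.
via_update = dict(x)
via_update.update(y)

# Swapping the arguments swaps which side wins for the shared key.
swapped = dict(y, **x)
```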
On Fri, Apr 20, 2012 at 9:37 AM, Sven Marnach <sven@marnach.net> wrote:
If you start from dict instances, you could always use:
merged = dict(x, **y)
No, not always. Only if all keys of `y` are strings (and probably they should also be valid Python identifiers.)
>>> a = {}
>>> b = {1:2}
>>> dict(a, **b)
{1: 2}
Alexander Belopolsky, 20.04.2012 16:35:
On Fri, Apr 20, 2012 at 9:37 AM, Sven Marnach wrote:
If you start from dict instances, you could always use:
merged = dict(x, **y)
No, not always. Only if all keys of `y` are strings (and probably they should also be valid Python identifiers.)
>>> a = {}
>>> b = {1:2}
>>> dict(a, **b)
{1: 2}
That's no guaranteed behaviour, though. It doesn't work in PyPy, for example:
>>> a={}
>>> b={1:2}
>>> dict(a,**b)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
TypeError: keywords must be strings
(and, no, it's not PyPy that's wrong here) Stefan
On Fri, Apr 20, 2012 at 10:49 AM, Stefan Behnel <stefan_ml@behnel.de> wrote:
>>> a = {}
>>> b = {1:2}
>>> dict(a, **b)
{1: 2}
That's no guaranteed behaviour, though. It doesn't work in PyPy, for example.
I seem to recall that CPython had a similar limitation in the past, but it was removed at some point. I will try to dig out the relevant discussion, but I think the consensus was that ** should not attempt to validate the keys.
Alexander Belopolsky schrieb am Fri, 20. Apr 2012, um 11:00:28 -0400:
I seem to recall that CPython had a similar limitation in the past, but it was removed at some point. I will try to dig out the relevant discussion, but I think the consensus was that ** should not attempt to validate the keys.
It's the other way around. Your code used to work in Python 2.x, but it doesn't work in Python 3.x. Cheers, Sven
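A hedged sketch of how code could cope with both behaviours reported in this thread: try the keyword-unpacking form, and fall back to copy-then-update if the interpreter rejects the non-string key with a TypeError (as the PyPy traceback above shows and as Python 3 is said to do).

```python
a = {}
b = {1: 2}

try:
    # Accepted by some interpreters, rejected by others when a key
    # is not a string.
    merged = dict(a, **b)
except TypeError:
    # Copy-then-update accepts any hashable key.
    merged = dict(a)
    merged.update(b)
```

Either branch leaves `merged` holding the combined contents.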
On Fri, Apr 20, 2012 at 8:00 AM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On Fri, Apr 20, 2012 at 10:49 AM, Stefan Behnel <stefan_ml@behnel.de> wrote:
>>> a = {}
>>> b = {1:2}
>>> dict(a, **b)
{1: 2}
That's no guaranteed behaviour, though. It doesn't work in PyPy, for example.
I seem to recall that CPython had a similar limitation in the past, but it was removed at some point. I will try to dig out the relevant discussion, but I think the consensus was that ** should not attempt to validate the keys.
They should be strings. There is no requirement that they are valid identifiers.
-- --Guido van Rossum (python.org/~guido)
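To illustrate the distinction, a small sketch with made-up keys: string keys that are not valid identifiers (one contains spaces, one is a reserved word) still pass through `**` into the dict constructor. This is the behaviour on CPython; treating it as universal across implementations is an assumption.

```python
x = {"a": 1}

# Neither key could be written as a literal keyword argument,
# but both are strings, so unpacking them is accepted.
y = {"not an identifier": 2, "class": 3}

merged = dict(x, **y)
```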
On 2012-04-20, at 20:30 , Victor Varvariuc wrote:
>>> a = {}
>>> b = {1:2}
>>> dict(a, **b)
If b is a huge dict - not a good approach
If they're huge mappings, you probably don't want to go around copying them either way[0] and would instead use more custom mappings, either some sort of joining proxy or something out of Okasaki (a Clojure-style tree-based map with structural sharing, for instance).
[0] I'm pretty sure "being fast to copy when bloody huge" is not at the forefront of Python's dict priorities.
On 20 April 2012 20:49, Masklinn <masklinn@masklinn.net> wrote:
If they're huge mappings, you probably don't want to go around copying them either way[0] and would instead use more custom mappings, either some sort of joining proxy or something out of Okasaki (a clojure-style tree-based map with structural sharing for instance)
Python 3.3 has collections.ChainMap for this sort of case. Paul.
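A minimal sketch of using ChainMap as a non-copying merged view, with made-up sample dicts. Placing `y` first makes its values win on lookup, which matches what copying `x` and updating it with `y` would give.

```python
from collections import ChainMap

x = {"a": 1, "b": 2}
y = {"b": 20, "c": 30}

# No data is copied; lookups search y first, then x.
view = ChainMap(y, x)
looked_up = {key: view[key] for key in ("a", "b", "c")}

# The view reflects later changes to the underlying dicts.
x["a"] = 100
after_change = view["a"]
```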
On 2012-04-20, at 23:41 , Paul Moore wrote:
On 20 April 2012 20:49, Masklinn <masklinn@masklinn.net> wrote:
If they're huge mappings, you probably don't want to go around copying them either way[0] and would instead use more custom mappings, either some sort of joining proxy or something out of Okasaki (a clojure-style tree-based map with structural sharing for instance)
Python 3.3 has collections.ChainMap for this sort of case.
Yeah, it's an example of the "joining proxy" thing. Though I'm not sure I like it being editable, or the lookup order when providing a sequence of maps (I haven't tested it, but it appears the maps sequence is traversed front-to-back; I'd have found the other way around more "obvious", as if each sub-mapping was applied to a base through an update call).
Another potential weirdness of this solution (I don't know how ChainMap behaves there, the documentation is unclear) is that iterating over the map and mapping.items() versus [(key, mapping[key]) for key in mapping] could have very different behaviors/values: the former would return all key:value pairs, but the latter would only return the key:(first value for key) pairs, which may lead to significant repetition any time a key is present in multiple contexts.
Masklinn wrote:
On 2012-04-20, at 23:41 , Paul Moore wrote:
On 20 April 2012 20:49, Masklinn <masklinn@masklinn.net> wrote:
If they're huge mappings, you probably don't want to go around copying them either way[0] and would instead use more custom mappings, either some sort of joining proxy or something out of Okasaki (a clojure-style tree-based map with structural sharing for instance) Python 3.3 has collections.ChainMap for this sort of case.
Yeah, it's an example of the "joining proxy" thing. Though I'm not sure I like it being editable, or the lookup order when providing a sequence of maps (I haven't tested it but it appears the maps sequence is traversed front-to-back, I'd have found the other way around more "obvious", as if each sub-mapping was applied to a base through an update call).
ChainMap is meant to emulate scoped lookups, e.g. builtins + globals + nonlocals + locals. Hence, newer scopes mask older scopes. "Locals" should be fast, hence it is at the front. As for being editable, I'm not sure what you mean here, but surely you don't object to it being mutable?
Another potential weirdness of this solution (I don't know how ChainMap behaves there, the documentation is unclear) is that iterating over the map and mapping.items() versus [(key, mapping[key]) for key in mapping] could have very different behaviors/values: the former would return all key:value pairs, but the latter would only return the key:(first value for key) pairs, which may lead to significant repetition any time a key is present in multiple contexts.
No, iteration over the ChainMap returns unique keys, not duplicates.
>>> from collections import ChainMap
>>> mapping = ChainMap(dict(a=1, b=2, c=3, d=4))
>>> mapping = mapping.new_child()
>>> mapping.update(dict(d=5, e=6, f=7))
>>> mapping = mapping.new_child()
>>> mapping.update(dict(f=8, g=9, h=10))
>>> len(mapping)
8
>>> mapping
ChainMap({'h': 10, 'g': 9, 'f': 8}, {'e': 6, 'd': 5, 'f': 7}, {'a': 1, 'c': 3, 'b': 2, 'd': 4})
>>> list(mapping.keys())
['h', 'a', 'c', 'b', 'e', 'd', 'g', 'f']
>>> list(mapping.values())
[10, 1, 3, 2, 6, 5, 9, 8]
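The session above checks keys and values; the sketch below addresses the items() question directly, using smaller made-up data. Each key appears once, and items() agrees with looking every key up individually.

```python
from collections import ChainMap

mapping = ChainMap({"f": 8, "g": 9}, {"d": 5, "f": 7}, {"a": 1, "d": 4})

# Pairs as reported by items(), sorted for a stable comparison.
via_items = sorted(mapping.items())

# Pairs built by looking up each key yielded by iteration.
via_lookup = sorted((key, mapping[key]) for key in mapping)

unique_key_count = len(mapping)
```

Both lists contain four pairs, with "f" and "d" taking their values from the first map that defines them.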
-- Steven
On 2012-04-21, at 02:24 , Steven D'Aprano wrote:
Masklinn wrote:
On 2012-04-20, at 23:41 , Paul Moore wrote:
On 20 April 2012 20:49, Masklinn <masklinn@masklinn.net> wrote:
If they're huge mappings, you probably don't want to go around copying them either way[0] and would instead use more custom mappings, either some sort of joining proxy or something out of Okasaki (a clojure-style tree-based map with structural sharing for instance) Python 3.3 has collections.ChainMap for this sort of case. Yeah, it's an example of the "joining proxy" thing. Though I'm not sure I like it being editable, or the lookup order when providing a sequence of maps (I haven't tested it but it appears the maps sequence is traversed front-to-back, I'd have found the other way around more "obvious", as if each sub-mapping was applied to a base through an update call).
ChainMap is meant to emulate scoped lookups
Yes, my notes were in the context of the thread, which was considering ChainMap as a proxy for multiple mappings; I understand this was not the primary use case for ChainMap.
, e.g. builtins + globals + nonlocals + locals. Hence, newer scopes mask older scopes. "Locals" should be fast, hence it is at the front.
That's just a question of traversal order for the maps sequence: if the sequence is in the order you specify there, [builtins, globals, nonlocals, locals], then it can be traversed from the back for locals to have the highest priority. The difference in speed should be almost nil.
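As a sketch of the ordering point: given layers listed lowest-priority first (the order an update-style fold would use), reversing them before building the ChainMap yields the same effective contents. The layer data is invented for illustration.

```python
from collections import ChainMap

# Lowest priority first, highest priority last.
layers = [
    {"color": "red", "size": 1},
    {"size": 2},
    {"color": "blue"},
]

# ChainMap searches its first map first, so reverse the sequence.
chain = ChainMap(*reversed(layers))

# The equivalent left fold with update(): later layers override earlier ones.
folded = {}
for layer in layers:
    folded.update(layer)

chain_contents = dict(chain)
```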
As for being editable, I'm not sure what you mean here, but surely you don't object to it being mutable?
I do, though again that's considering the usage of ChainMap as a proxy, not as a scope chain.
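One possible way to get a read-only joined view, offered as a sketch rather than anything proposed in the thread, is to wrap the ChainMap in `types.MappingProxyType`. The example also shows that writes through a plain ChainMap land in its first map only.

```python
from collections import ChainMap
from types import MappingProxyType

x = {"a": 1}
y = {"b": 2}

chain = ChainMap(y, x)
chain["c"] = 3  # stored in y, the first map; x is not modified

# A read-only wrapper around the same chain.
read_only = MappingProxyType(chain)
try:
    read_only["d"] = 4
except TypeError:
    write_rejected = True
else:
    write_rejected = False
```

Lookups through `read_only` still see both underlying dicts, while item assignment is refused.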
Another potential weirdness of this solution (I don't know how ChainMap behaves there, the documentation is unclear) is that iterating over the map and mapping.items() versus [(key, mapping[key]) for key in mapping] could have very different behaviors/values: the former would return all key:value pairs, but the latter would only return the key:(first value for key) pairs, which may lead to significant repetition any time a key is present in multiple contexts.
No, iteration over the ChainMap returns unique keys, not duplicates.
Ah, that's good. Would probably warrant mention in the documentation though.
Masklinn, 21.04.2012 14:53:
On 2012-04-21, at 02:24 , Steven D'Aprano wrote:
Masklinn wrote:
Another potential weirdness of this solution (I don't know how ChainMap behaves there, the documentation is unclear) is that iterating over the map and mapping.items() versus [(key, mapping[key]) for key in mapping] could have very different behaviors/values: the former would return all key:value pairs, but the latter would only return the key:(first value for key) pairs, which may lead to significant repetition any time a key is present in multiple contexts.
No, iteration over the ChainMap returns unique keys, not duplicates.
Ah, that's good. Would probably warrant mention in the documentation though.
What would you want to see there? "This class works as expected even when iterating over it"? Stefan
That's not actually true. **kwargs can contain things that aren't strings :)
cheers lvh
On 20 Apr 2012, at 15:37, Sven Marnach wrote:
Masklinn schrieb am Fri, 20. Apr 2012, um 14:47:34 +0200:
If you start from dict instances, you could always use:
merged = dict(x, **y)
No, not always. Only if all keys of `y` are strings (and probably they should also be valid Python identifiers.)
Cheers, Sven
participants (10)
- Alexander Belopolsky
- Guido van Rossum
- Laurens Van Houtven
- Masklinn
- Paul Moore
- Stefan Behnel
- Steven D'Aprano
- Sven Marnach
- Victor Varvariuc
- Xavier Ho