Why does BoundArguments use an OrderedDict?

Currently, BoundArguments stores the results of signature.bind (and bind_partial) in an OrderedDict. Given that there is no way to retrieve the order in which keyword arguments were passed, it is not surprising that the dict ordering is based on the order of parameters in the signature, not the order in which arguments are passed (although this point could certainly be made clearer in the docs). But this also means that that ordering could easily be retrieved by iterating over the parameters ("[(name, ba.arguments[name]) for name in ba.signature.parameters if name in ba.arguments]"), and signature.bind now has to pay for the cost of using a Python-implemented OrderedDict. Obviously, one solution would be to switch to a C-implemented OrderedDict, but another simpler one would simply to make BoundArguments.arguments a regular dict rather than an OrderedDict (which is a one-line patch, changing the definition of "arguments" in Signature._bind). A simple test on the example below show approximately a two-fold faster "bind". from inspect import signature s = signature(lambda x, y=1, *, z: None) for _ in range(30000): s.bind(1, z=2) s.bind(1, 3, z=2) s.bind(x=1, z=2, y=3) try: s.bind(1) except TypeError: pass try: s.bind() except TypeError: pass try: s.bind(1, y=2) except TypeError: pass Antony

On Tue, Dec 16, 2014 at 7:45 PM, Guido van Rossum <guido@python.org> wrote:
http://bugs.python.org/issue16991 Reviews welcome! :) The patch is undoubtedly stale (after a year and a half). It definitely could use some trimming (basically +2500 lines), but a lot of the bulk comes from the boilerplate involved in writing iterators (and the odict views) at the C/C-API level. -eric

On Tue, Dec 16, 2014 at 11:18 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
The patch is undoubtedly stale (after a year and a half).
FWIW, I'm finally starting to have some spare time so I may be able to refresh the patch. However, if I can't get reviews I imagine it will just get stale again, so I don't know if I can devote the time in that case. If anyone has suggestions on how I can split the patch up to make it more easily reviewed, or on any other ways to help reviewers of the patch, please leave a comment on the issue (#16991). I'd love to get the C-OrderedDict landed! -eric

Maybe this would be an opportunity for you to mentor someone who wants to get into serious C-level Python maintenance? I think that would qualify as a code review, if your student is serious about learning how the C code you've written works. Perhaps an ad on core-mentorship might turn up such a person. On Tue, Dec 16, 2014 at 10:25 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On Wed, Dec 17, 2014 at 11:11 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
I have reservations about the overall approach. Should it really be a builtin
The patch doesn't add it to bltinmodule.c. Perhaps you mean the additions to Python.h, object.c, and the Makefile. In that case then I actually agree. I was probably just anticipating the use of OrderedDict by the interpreter. However, that can wait until needed.
and expose a public C API?
Yeah, it really doesn't need that either (if I remember right). If a C-API becomes desirable then it would be easy to add later.
It sounds like putting it in the _collections module should be sufficient.
Agreed. -eric

On Tue, 16 Dec 2014 23:18:55 -0700 Eric Snow <ericsnowcurrently@gmail.com> wrote:
Other reservations: - why aren't the linked list pointers not included in the hash entries? that should simplify some logic quite a bit - is it useful to keep the shared / split keys mechanism? shared keys are mostly useful for instance dicts - does it have to inherit from dict? that looks like a potential can of worms Regards Antoine.

BTW, thanks for the feedback. :) On Wed, Dec 17, 2014 at 11:30 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
I'll look into that. If I recall correctly this came up in earlier discussion and there was a justification.
- is it useful to keep the shared / split keys mechanism? shared keys are mostly useful for instance dicts
I'm pretty sure this is an artifact of subclassing dict.
- does it have to inherit from dict? that looks like a potential can of worms
One of the constraints here is to make the C implementation of OrderedDict match the API of the Python type exactly and the underlying implementation as closely as reasonable. The Python type subclasses dict so the C implementation does as well. FWIW (and not directed to Antoine specifically), the implementation up for review may not be ideal, but it exists and is complete. :) Barring any substantial concerns during review (and the points Antoine has brought up), I would rather the patch landed than wait indefinitely for a more ideal implementation. That could happen afterward, though I'm fairly confident in the correctness and efficiency of the implementation. -eric

On 18 December 2014 at 07:26, Eric Snow <ericsnowcurrently@gmail.com> wrote:
This kind of design rationale is potentially useful to include inline as a block comment. The kinds of questions reviewers have are often going to be the same kinds of questions future maintainers have, and "why" data regarding core architectural decisions is less likely to go out of date as future maintainers make changes.
Aye, a definite +1 for "better" over "perfect". Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

The discussion has drifted towards improving OrderedDict (something I certainly approve), but the semantic question is still there: why should BoundArguments.arguments be ordered by the parameter order? For example, the recipe just below in the docs, for filling in the remaining default arguments, breaks that ordering, without even mentioning that. Antony 2014-12-17 16:37 GMT-08:00 Nick Coghlan <ncoghlan@gmail.com>:

On Wed, Dec 17, 2014 at 6:15 PM, Antony Lee <antony.lee@berkeley.edu> wrote:
Given that the answer hasn't been answered yet, perhaps nobody remembers the reason any more. But why does it bother you so much? In my experience the inspect module wasn't written for performance but for functionality. If it really hurts your ability to do something that you need to be fast, perhaps you can supply a patch? Make sure to run the tests and update the docs as needed. -- --Guido van Rossum (python.org/~guido)

On 18 December 2014 at 12:57, Guido van Rossum <guido@python.org> wrote:
As far as I'm aware, it's an ordered dictionary because that makes the default repr() predictable when binding arguments for a given function (in the absence of after-the-fact manipulation like the example in the docs that injects the default values as explicitly bound arguments). The inspect.signature() machinery includes quite a few things like that where the serialisation as a human readable string is considered as important then the programmatic representation. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

2014-12-17 23:05 GMT-08:00 Nick Coghlan <ncoghlan@gmail.com>:
Actually the second example in PEP362 uses bind as an argument type-checker, which should arguably be as fast as possible. I'm opening an issue with a patch right now.
That can't really be the case as BoundArguments doesn't define a __repr__... (and OrderedDict's __repr__ is a bit too verbose IMO, but that's another issue and little can be done about it anyways). Now that I delved into the details of BoundArgument's implementation, I'll also note that __eq__ also checks the arguments order, and again additional binding will mess that up: s = signature(lambda x=None, y=None: None) ba0 = s.bind() ba1 = s.bind(x=None) ba2 = s.bind(y=None) <apply recipe to add default arguments to ba0, ba1 and ba2> Should ba0, ba1 and ba2 compare now equal or not? As it is, we'll have ba0 == ba1 != ba2, which doesn't strike me as particularly reasonable. Antony

On Thu, Dec 18, 2014, 04:46 Antony Lee <antony.lee@berkeley.edu> wrote: 2014-12-17 23:05 GMT-08:00 Nick Coghlan <ncoghlan@gmail.com>: On 18 December 2014 at 12:57, Guido van Rossum <guido@python.org> wrote: the perhaps
you can supply a patch? Make sure to run the tests and update the docs as needed.
Actually the second example in PEP362 uses bind as an argument type-checker, which should arguably be as fast as possible. I'm opening an issue with a patch right now. As the author of PEP 362 I can tell you that was an example and not something to optimize for. If you want to type check your code then you should do it offline or only during debugging in which case performance is not an issue. -Brett As far as I'm aware, it's an ordered dictionary because that makes the default repr() predictable when binding arguments for a given function (in the absence of after-the-fact manipulation like the example in the docs that injects the default values as explicitly bound arguments). That can't really be the case as BoundArguments doesn't define a __repr__... (and OrderedDict's __repr__ is a bit too verbose IMO, but that's another issue and little can be done about it anyways). Now that I delved into the details of BoundArgument's implementation, I'll also note that __eq__ also checks the arguments order, and again additional binding will mess that up: s = signature(lambda x=None, y=None: None) ba0 = s.bind() ba1 = s.bind(x=None) ba2 = s.bind(y=None) <apply recipe to add default arguments to ba0, ba1 and ba2> Should ba0, ba1 and ba2 compare now equal or not? As it is, we'll have ba0 == ba1 != ba2, which doesn't strike me as particularly reasonable. Antony The inspect.signature() machinery includes quite a few things like that where the serialisation as a human readable string is considered as important then the programmatic representation. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/

On Thu, Dec 18, 2014, at 02:05, Nick Coghlan wrote:
Would it be reasonable to make a lightweight "predictable dict" class that makes a weaker guarantee, e.g. that the enumeration order will match the insertion order in the case where it is filled from empty with no intervening deletions and not guaranteed in any other cases?

On Thu, Dec 18, 2014, at 11:18, Wes Turner wrote:
Really, I'm thinking more along the lines of a dict that only guarantees enumeration order (no integer indexed access) in the specific cases where people "need" it (ordered kwargs for e.g. the OrderedDict constructor would be another example), while being 'lightweight' enough (in terms of not having a lot of extra machinery dedicated to maintaining the order) to ultimately be used as the real dict implementation (and therefore usable for kwargs, class dictionaries, etc) Your comment's "DefaultOrderedDict" seems to be more about combining DefaultDict (default values) and OrderedDict (ordered all the time, indexed access) rather than anything like having a "default order".

On Thu, Dec 18, 2014 at 10:27 AM, <random832@fastmail.us> wrote:
I understand. More like the function of DEFAULT_PREDICATE_ORDERING here: https://github.com/westurner/healthref/blob/gh-pages/healthref.py#L100 . That would indeed be useful. What would you call it?

On 19 December 2014 at 03:09, <random832@fastmail.us> wrote:
My understanding is that Raymond's alternative dict implementation works exactly like this, and is noted as an alternative for PEP 468: https://www.python.org/dev/peps/pep-0468/ . Discussion starts at: https://mail.python.org/pipermail/python-dev/2012-December/123028.html Tim Delaney

On Thu, Dec 18, 2014 at 9:18 PM, Tim Delaney <timothy.c.delaney@gmail.com> wrote:
I believe that if there's a new predictabledict type, and deleting from it would invalidate the order, then predictabledict should not support deletions. Perhaps it should be all-out immutable to prevent surprises. It would break backwards compatibility if that was used for **kwargs, though.

On Fri, Dec 19, 2014, at 04:07, Petr Viktorin wrote:
This would defeat the broader idea of using an implementation with such properties as the standard dict (which would allow its use for kwargs, class dictionaries, etc). What about adding a flag to indicate whether it is in an ordered state (i.e. either there have been no deletions since it was last empty, or there have been no insertions since the first deletion after it was last empty, or whatever other status applies to the implementation).

On Tue, Dec 16, 2014 at 7:45 PM, Guido van Rossum <guido@python.org> wrote:
http://bugs.python.org/issue16991 Reviews welcome! :) The patch is undoubtedly stale (after a year and a half). It definitely could use some trimming (basically +2500 lines), but a lot of the bulk comes from the boilerplate involved in writing iterators (and the odict views) at the C/C-API level. -eric

On Tue, Dec 16, 2014 at 11:18 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
The patch is undoubtedly stale (after a year and a half).
FWIW, I'm finally starting to have some spare time so I may be able to refresh the patch. However, if I can't get reviews I imagine it will just get stale again, so I don't know if I can devote the time in that case. If anyone has suggestions on how I can split the patch up to make it more easily reviewed, or on any other ways to help reviewers of the patch, please leave a comment on the issue (#16991). I'd love to get the C-OrderedDict landed! -eric

Maybe this would be an opportunity for you to mentor someone who wants to get into serious C-level Python maintenance? I think that would qualify as a code review, if your student is serious about learning how the C code you've written works. Perhaps an ad on core-mentorship might turn up such a person. On Tue, Dec 16, 2014 at 10:25 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On Tue, 16 Dec 2014 23:18:55 -0700 Eric Snow <ericsnowcurrently@gmail.com> wrote:
I have reservations about the overall approach. Should it really be a builtin and expose a public C API? It sounds like putting it in the _collections module should be sufficient. Regards Antoine.

On Wed, Dec 17, 2014 at 11:11 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
I have reservations about the overall approach. Should it really be a builtin
The patch doesn't add it to bltinmodule.c. Perhaps you mean the additions to Python.h, object.c, and the Makefile. In that case then I actually agree. I was probably just anticipating the use of OrderedDict by the interpreter. However, that can wait until needed.
and expose a public C API?
Yeah, it really doesn't need that either (if I remember right). If a C-API becomes desirable then it would be easy to add later.
It sounds like putting it in the _collections module should be sufficient.
Agreed. -eric

On Tue, 16 Dec 2014 23:18:55 -0700 Eric Snow <ericsnowcurrently@gmail.com> wrote:
Other reservations: - why aren't the linked list pointers not included in the hash entries? that should simplify some logic quite a bit - is it useful to keep the shared / split keys mechanism? shared keys are mostly useful for instance dicts - does it have to inherit from dict? that looks like a potential can of worms Regards Antoine.

BTW, thanks for the feedback. :) On Wed, Dec 17, 2014 at 11:30 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
I'll look into that. If I recall correctly this came up in earlier discussion and there was a justification.
- is it useful to keep the shared / split keys mechanism? shared keys are mostly useful for instance dicts
I'm pretty sure this is an artifact of subclassing dict.
- does it have to inherit from dict? that looks like a potential can of worms
One of the constraints here is to make the C implementation of OrderedDict match the API of the Python type exactly and the underlying implementation as closely as reasonable. The Python type subclasses dict so the C implementation does as well. FWIW (and not directed to Antoine specifically), the implementation up for review may not be ideal, but it exists and is complete. :) Barring any substantial concerns during review (and the points Antoine has brought up), I would rather the patch landed than wait indefinitely for a more ideal implementation. That could happen afterward, though I'm fairly confident in the correctness and efficiency of the implementation. -eric

On 18 December 2014 at 07:26, Eric Snow <ericsnowcurrently@gmail.com> wrote:
This kind of design rationale is potentially useful to include inline as a block comment. The kinds of questions reviewers have are often going to be the same kinds of questions future maintainers have, and "why" data regarding core architectural decisions is less likely to go out of date as future maintainers make changes.
Aye, a definite +1 for "better" over "perfect". Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

The discussion has drifted towards improving OrderedDict (something I certainly approve), but the semantic question is still there: why should BoundArguments.arguments be ordered by the parameter order? For example, the recipe just below in the docs, for filling in the remaining default arguments, breaks that ordering, without even mentioning that. Antony 2014-12-17 16:37 GMT-08:00 Nick Coghlan <ncoghlan@gmail.com>:

On Wed, Dec 17, 2014 at 6:15 PM, Antony Lee <antony.lee@berkeley.edu> wrote:
Given that the answer hasn't been answered yet, perhaps nobody remembers the reason any more. But why does it bother you so much? In my experience the inspect module wasn't written for performance but for functionality. If it really hurts your ability to do something that you need to be fast, perhaps you can supply a patch? Make sure to run the tests and update the docs as needed. -- --Guido van Rossum (python.org/~guido)

On 18 December 2014 at 12:57, Guido van Rossum <guido@python.org> wrote:
As far as I'm aware, it's an ordered dictionary because that makes the default repr() predictable when binding arguments for a given function (in the absence of after-the-fact manipulation like the example in the docs that injects the default values as explicitly bound arguments). The inspect.signature() machinery includes quite a few things like that where the serialisation as a human readable string is considered as important then the programmatic representation. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

2014-12-17 23:05 GMT-08:00 Nick Coghlan <ncoghlan@gmail.com>:
Actually the second example in PEP362 uses bind as an argument type-checker, which should arguably be as fast as possible. I'm opening an issue with a patch right now.
That can't really be the case as BoundArguments doesn't define a __repr__... (and OrderedDict's __repr__ is a bit too verbose IMO, but that's another issue and little can be done about it anyways). Now that I delved into the details of BoundArgument's implementation, I'll also note that __eq__ also checks the arguments order, and again additional binding will mess that up: s = signature(lambda x=None, y=None: None) ba0 = s.bind() ba1 = s.bind(x=None) ba2 = s.bind(y=None) <apply recipe to add default arguments to ba0, ba1 and ba2> Should ba0, ba1 and ba2 compare now equal or not? As it is, we'll have ba0 == ba1 != ba2, which doesn't strike me as particularly reasonable. Antony

On Thu, Dec 18, 2014, 04:46 Antony Lee <antony.lee@berkeley.edu> wrote: 2014-12-17 23:05 GMT-08:00 Nick Coghlan <ncoghlan@gmail.com>: On 18 December 2014 at 12:57, Guido van Rossum <guido@python.org> wrote: the perhaps
you can supply a patch? Make sure to run the tests and update the docs as needed.
Actually the second example in PEP362 uses bind as an argument type-checker, which should arguably be as fast as possible. I'm opening an issue with a patch right now. As the author of PEP 362 I can tell you that was an example and not something to optimize for. If you want to type check your code then you should do it offline or only during debugging in which case performance is not an issue. -Brett As far as I'm aware, it's an ordered dictionary because that makes the default repr() predictable when binding arguments for a given function (in the absence of after-the-fact manipulation like the example in the docs that injects the default values as explicitly bound arguments). That can't really be the case as BoundArguments doesn't define a __repr__... (and OrderedDict's __repr__ is a bit too verbose IMO, but that's another issue and little can be done about it anyways). Now that I delved into the details of BoundArgument's implementation, I'll also note that __eq__ also checks the arguments order, and again additional binding will mess that up: s = signature(lambda x=None, y=None: None) ba0 = s.bind() ba1 = s.bind(x=None) ba2 = s.bind(y=None) <apply recipe to add default arguments to ba0, ba1 and ba2> Should ba0, ba1 and ba2 compare now equal or not? As it is, we'll have ba0 == ba1 != ba2, which doesn't strike me as particularly reasonable. Antony The inspect.signature() machinery includes quite a few things like that where the serialisation as a human readable string is considered as important then the programmatic representation. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/

On Thu, Dec 18, 2014, at 02:05, Nick Coghlan wrote:
Would it be reasonable to make a lightweight "predictable dict" class that makes a weaker guarantee, e.g. that the enumeration order will match the insertion order in the case where it is filled from empty with no intervening deletions and not guaranteed in any other cases?

On Thu, Dec 18, 2014, at 11:18, Wes Turner wrote:
Really, I'm thinking more along the lines of a dict that only guarantees enumeration order (no integer indexed access) in the specific cases where people "need" it (ordered kwargs for e.g. the OrderedDict constructor would be another example), while being 'lightweight' enough (in terms of not having a lot of extra machinery dedicated to maintaining the order) to ultimately be used as the real dict implementation (and therefore usable for kwargs, class dictionaries, etc) Your comment's "DefaultOrderedDict" seems to be more about combining DefaultDict (default values) and OrderedDict (ordered all the time, indexed access) rather than anything like having a "default order".

On Thu, Dec 18, 2014 at 10:27 AM, <random832@fastmail.us> wrote:
I understand. More like the function of DEFAULT_PREDICATE_ORDERING here: https://github.com/westurner/healthref/blob/gh-pages/healthref.py#L100 . That would indeed be useful. What would you call it?

On 19 December 2014 at 03:09, <random832@fastmail.us> wrote:
My understanding is that Raymond's alternative dict implementation works exactly like this, and is noted as an alternative for PEP 468: https://www.python.org/dev/peps/pep-0468/ . Discussion starts at: https://mail.python.org/pipermail/python-dev/2012-December/123028.html Tim Delaney

On Thu, Dec 18, 2014 at 9:18 PM, Tim Delaney <timothy.c.delaney@gmail.com> wrote:
I believe that if there's a new predictabledict type, and deleting from it would invalidate the order, then predictabledict should not support deletions. Perhaps it should be all-out immutable to prevent surprises. It would break backwards compatibility if that was used for **kwargs, though.

On Fri, Dec 19, 2014, at 04:07, Petr Viktorin wrote:
This would defeat the broader idea of using an implementation with such properties as the standard dict (which would allow its use for kwargs, class dictionaries, etc). What about adding a flag to indicate whether it is in an ordered state (i.e. either there have been no deletions since it was last empty, or there have been no insertions since the first deletion after it was last empty, or whatever other status applies to the implementation).
participants (11)
-
Alexander Belopolsky
-
Antoine Pitrou
-
Antony Lee
-
Brett Cannon
-
Eric Snow
-
Guido van Rossum
-
Nick Coghlan
-
Petr Viktorin
-
random832@fastmail.us
-
Tim Delaney
-
Wes Turner