Why does BoundArguments use an OrderedDict?
Currently, BoundArguments stores the results of signature.bind (and bind_partial) in an OrderedDict. Given that there is no way to retrieve the order in which keyword arguments were passed, it is not surprising that the dict ordering is based on the order of parameters in the signature, not the order in which arguments are passed (although this point could certainly be made clearer in the docs). But this also means that that ordering could easily be retrieved by iterating over the parameters ("[(name, ba.arguments[name]) for name in ba.signature.parameters if name in ba.arguments]"), and signature.bind now has to pay for the cost of using a Python-implemented OrderedDict. Obviously, one solution would be to switch to a C-implemented OrderedDict, but another simpler one would simply to make BoundArguments.arguments a regular dict rather than an OrderedDict (which is a one-line patch, changing the definition of "arguments" in Signature._bind). A simple test on the example below show approximately a two-fold faster "bind". from inspect import signature s = signature(lambda x, y=1, *, z: None) for _ in range(30000): s.bind(1, z=2) s.bind(1, 3, z=2) s.bind(x=1, z=2, y=3) try: s.bind(1) except TypeError: pass try: s.bind() except TypeError: pass try: s.bind(1, y=2) except TypeError: pass Antony
It really would be helpful to have a C-implemented OrderedDict in stdlib.
On Tue, Dec 16, 2014 at 7:24 PM, Antony Lee
Currently, BoundArguments stores the results of signature.bind (and bind_partial) in an OrderedDict. Given that there is no way to retrieve the order in which keyword arguments were passed, it is not surprising that the dict ordering is based on the order of parameters in the signature, not the order in which arguments are passed (although this point could certainly be made clearer in the docs). But this also means that that ordering could easily be retrieved by iterating over the parameters ("[(name, ba.arguments[name]) for name in ba.signature.parameters if name in ba.arguments]"), and signature.bind now has to pay for the cost of using a Python-implemented OrderedDict.
Obviously, one solution would be to switch to a C-implemented OrderedDict, but another simpler one would simply to make BoundArguments.arguments a regular dict rather than an OrderedDict (which is a one-line patch, changing the definition of "arguments" in Signature._bind). A simple test on the example below show approximately a two-fold faster "bind".
from inspect import signature s = signature(lambda x, y=1, *, z: None) for _ in range(30000): s.bind(1, z=2) s.bind(1, 3, z=2) s.bind(x=1, z=2, y=3) try: s.bind(1) except TypeError: pass try: s.bind() except TypeError: pass try: s.bind(1, y=2) except TypeError: pass
Antony
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
On Tue, Dec 16, 2014 at 7:45 PM, Guido van Rossum
On Tue, Dec 16, 2014 at 6:27 PM, Wes Turner
wrote: It really would be helpful to have a C-implemented OrderedDict in stdlib.
+1
http://bugs.python.org/issue16991 Reviews welcome! :) The patch is undoubtedly stale (after a year and a half). It definitely could use some trimming (basically +2500 lines), but a lot of the bulk comes from the boilerplate involved in writing iterators (and the odict views) at the C/C-API level. -eric
On Tue, Dec 16, 2014 at 11:18 PM, Eric Snow
The patch is undoubtedly stale (after a year and a half).
FWIW, I'm finally starting to have some spare time so I may be able to refresh the patch. However, if I can't get reviews I imagine it will just get stale again, so I don't know if I can devote the time in that case. If anyone has suggestions on how I can split the patch up to make it more easily reviewed, or on any other ways to help reviewers of the patch, please leave a comment on the issue (#16991). I'd love to get the C-OrderedDict landed! -eric
Maybe this would be an opportunity for you to mentor someone who wants to
get into serious C-level Python maintenance? I think that would qualify as
a code review, if your student is serious about learning how the C code
you've written works. Perhaps an ad on core-mentorship might turn up such a
person.
On Tue, Dec 16, 2014 at 10:25 PM, Eric Snow
On Tue, Dec 16, 2014 at 11:18 PM, Eric Snow
wrote: The patch is undoubtedly stale (after a year and a half).
FWIW, I'm finally starting to have some spare time so I may be able to refresh the patch. However, if I can't get reviews I imagine it will just get stale again, so I don't know if I can devote the time in that case. If anyone has suggestions on how I can split the patch up to make it more easily reviewed, or on any other ways to help reviewers of the patch, please leave a comment on the issue (#16991). I'd love to get the C-OrderedDict landed!
-eric
-- --Guido van Rossum (python.org/~guido)
On Wed, Dec 17, 2014 at 9:34 AM, Guido van Rossum
Maybe this would be an opportunity for you to mentor someone who wants to get into serious C-level Python maintenance? I think that would qualify as a code review, if your student is serious about learning how the C code you've written works. Perhaps an ad on core-mentorship might turn up such a person.
Good idea. I'll look into that. -eric
On Tue, 16 Dec 2014 23:18:55 -0700
Eric Snow
On Tue, Dec 16, 2014 at 7:45 PM, Guido van Rossum
wrote: On Tue, Dec 16, 2014 at 6:27 PM, Wes Turner
wrote: It really would be helpful to have a C-implemented OrderedDict in stdlib.
+1
http://bugs.python.org/issue16991
Reviews welcome! :)
The patch is undoubtedly stale (after a year and a half). It definitely could use some trimming (basically +2500 lines), but a lot of the bulk comes from the boilerplate involved in writing iterators (and the odict views) at the C/C-API level.
I have reservations about the overall approach. Should it really be a builtin and expose a public C API? It sounds like putting it in the _collections module should be sufficient. Regards Antoine.
Yes, it should stay in collections.
On Dec 17, 2014 10:12 AM, "Antoine Pitrou"
On Tue, 16 Dec 2014 23:18:55 -0700 Eric Snow
wrote: On Tue, Dec 16, 2014 at 7:45 PM, Guido van Rossum
wrote: On Tue, Dec 16, 2014 at 6:27 PM, Wes Turner
wrote: It really would be helpful to have a C-implemented OrderedDict in
stdlib.
+1
http://bugs.python.org/issue16991
Reviews welcome! :)
The patch is undoubtedly stale (after a year and a half). It definitely could use some trimming (basically +2500 lines), but a lot of the bulk comes from the boilerplate involved in writing iterators (and the odict views) at the C/C-API level.
I have reservations about the overall approach. Should it really be a builtin and expose a public C API? It sounds like putting it in the _collections module should be sufficient.
Regards
Antoine.
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
On Wed, Dec 17, 2014 at 11:11 AM, Antoine Pitrou
I have reservations about the overall approach. Should it really be a builtin
The patch doesn't add it to bltinmodule.c. Perhaps you mean the additions to Python.h, object.c, and the Makefile. In that case then I actually agree. I was probably just anticipating the use of OrderedDict by the interpreter. However, that can wait until needed.
and expose a public C API?
Yeah, it really doesn't need that either (if I remember right). If a C-API becomes desirable then it would be easy to add later.
It sounds like putting it in the _collections module should be sufficient.
Agreed. -eric
On Tue, 16 Dec 2014 23:18:55 -0700
Eric Snow
On Tue, Dec 16, 2014 at 7:45 PM, Guido van Rossum
wrote: On Tue, Dec 16, 2014 at 6:27 PM, Wes Turner
wrote: It really would be helpful to have a C-implemented OrderedDict in stdlib.
+1
http://bugs.python.org/issue16991
Reviews welcome! :)
Other reservations: - why aren't the linked list pointers not included in the hash entries? that should simplify some logic quite a bit - is it useful to keep the shared / split keys mechanism? shared keys are mostly useful for instance dicts - does it have to inherit from dict? that looks like a potential can of worms Regards Antoine.
BTW, thanks for the feedback. :)
On Wed, Dec 17, 2014 at 11:30 AM, Antoine Pitrou
Other reservations:
- why aren't the linked list pointers not included in the hash entries? that should simplify some logic quite a bit
I'll look into that. If I recall correctly this came up in earlier discussion and there was a justification.
- is it useful to keep the shared / split keys mechanism? shared keys are mostly useful for instance dicts
I'm pretty sure this is an artifact of subclassing dict.
- does it have to inherit from dict? that looks like a potential can of worms
One of the constraints here is to make the C implementation of OrderedDict match the API of the Python type exactly and the underlying implementation as closely as reasonable. The Python type subclasses dict so the C implementation does as well. FWIW (and not directed to Antoine specifically), the implementation up for review may not be ideal, but it exists and is complete. :) Barring any substantial concerns during review (and the points Antoine has brought up), I would rather the patch landed than wait indefinitely for a more ideal implementation. That could happen afterward, though I'm fairly confident in the correctness and efficiency of the implementation. -eric
On 18 December 2014 at 07:26, Eric Snow
- does it have to inherit from dict? that looks like a potential can of worms
One of the constraints here is to make the C implementation of OrderedDict match the API of the Python type exactly and the underlying implementation as closely as reasonable. The Python type subclasses dict so the C implementation does as well.
This kind of design rationale is potentially useful to include inline as a block comment. The kinds of questions reviewers have are often going to be the same kinds of questions future maintainers have, and "why" data regarding core architectural decisions is less likely to go out of date as future maintainers make changes.
FWIW (and not directed to Antoine specifically), the implementation up for review may not be ideal, but it exists and is complete. :) Barring any substantial concerns during review (and the points Antoine has brought up), I would rather the patch landed than wait indefinitely for a more ideal implementation. That could happen afterward, though I'm fairly confident in the correctness and efficiency of the implementation.
Aye, a definite +1 for "better" over "perfect". Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
The discussion has drifted towards improving OrderedDict (something I
certainly approve), but the semantic question is still there: why should
BoundArguments.arguments be ordered by the parameter order? For example,
the recipe just below in the docs, for filling in the remaining default
arguments, breaks that ordering, without even mentioning that.
Antony
2014-12-17 16:37 GMT-08:00 Nick Coghlan
On 18 December 2014 at 07:26, Eric Snow
wrote: - does it have to inherit from dict? that looks like a potential can of worms
One of the constraints here is to make the C implementation of OrderedDict match the API of the Python type exactly and the underlying implementation as closely as reasonable. The Python type subclasses dict so the C implementation does as well.
This kind of design rationale is potentially useful to include inline as a block comment. The kinds of questions reviewers have are often going to be the same kinds of questions future maintainers have, and "why" data regarding core architectural decisions is less likely to go out of date as future maintainers make changes.
FWIW (and not directed to Antoine specifically), the implementation up for review may not be ideal, but it exists and is complete. :) Barring any substantial concerns during review (and the points Antoine has brought up), I would rather the patch landed than wait indefinitely for a more ideal implementation. That could happen afterward, though I'm fairly confident in the correctness and efficiency of the implementation.
Aye, a definite +1 for "better" over "perfect".
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
On Wed, Dec 17, 2014 at 6:15 PM, Antony Lee
The discussion has drifted towards improving OrderedDict (something I certainly approve), but the semantic question is still there: why should BoundArguments.arguments be ordered by the parameter order? For example, the recipe just below in the docs, for filling in the remaining default arguments, breaks that ordering, without even mentioning that.
Given that the answer hasn't been answered yet, perhaps nobody remembers the reason any more. But why does it bother you so much? In my experience the inspect module wasn't written for performance but for functionality. If it really hurts your ability to do something that you need to be fast, perhaps you can supply a patch? Make sure to run the tests and update the docs as needed. -- --Guido van Rossum (python.org/~guido)
On 18 December 2014 at 12:57, Guido van Rossum
On Wed, Dec 17, 2014 at 6:15 PM, Antony Lee
wrote: The discussion has drifted towards improving OrderedDict (something I certainly approve), but the semantic question is still there: why should BoundArguments.arguments be ordered by the parameter order? For example, the recipe just below in the docs, for filling in the remaining default arguments, breaks that ordering, without even mentioning that.
Given that the answer hasn't been answered yet, perhaps nobody remembers the reason any more. But why does it bother you so much? In my experience the inspect module wasn't written for performance but for functionality. If it really hurts your ability to do something that you need to be fast, perhaps you can supply a patch? Make sure to run the tests and update the docs as needed.
As far as I'm aware, it's an ordered dictionary because that makes the default repr() predictable when binding arguments for a given function (in the absence of after-the-fact manipulation like the example in the docs that injects the default values as explicitly bound arguments). The inspect.signature() machinery includes quite a few things like that where the serialisation as a human readable string is considered as important then the programmatic representation. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
2014-12-17 23:05 GMT-08:00 Nick Coghlan
On Wed, Dec 17, 2014 at 6:15 PM, Antony Lee
wrote: The discussion has drifted towards improving OrderedDict (something I certainly approve), but the semantic question is still there: why should BoundArguments.arguments be ordered by the parameter order? For
example,
the recipe just below in the docs, for filling in the remaining default arguments, breaks that ordering, without even mentioning that.
Given that the answer hasn't been answered yet, perhaps nobody remembers
reason any more. But why does it bother you so much? In my experience the inspect module wasn't written for performance but for functionality. If it really hurts your ability to do something that you need to be fast,
On 18 December 2014 at 12:57, Guido van Rossum
wrote: the perhaps you can supply a patch? Make sure to run the tests and update the docs as needed.
Actually the second example in PEP362 uses bind as an argument type-checker, which should arguably be as fast as possible. I'm opening an issue with a patch right now.
As far as I'm aware, it's an ordered dictionary because that makes the default repr() predictable when binding arguments for a given function (in the absence of after-the-fact manipulation like the example in the docs that injects the default values as explicitly bound arguments).
That can't really be the case as BoundArguments doesn't define a
__repr__... (and OrderedDict's __repr__ is a bit too verbose IMO, but
that's another issue and little can be done about it anyways). Now that I
delved into the details of BoundArgument's implementation, I'll also note
that __eq__ also checks the arguments order, and again additional binding
will mess that up:
s = signature(lambda x=None, y=None: None)
ba0 = s.bind()
ba1 = s.bind(x=None)
ba2 = s.bind(y=None)
The inspect.signature() machinery includes quite a few things like that where the serialisation as a human readable string is considered as important then the programmatic representation.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Wed, Dec 17, 2014 at 6:15 PM, Antony Lee
wrote: The discussion has drifted towards improving OrderedDict (something I certainly approve), but the semantic question is still there: why should BoundArguments.arguments be ordered by the parameter order? For example, the recipe just below in the docs, for filling in the remaining default arguments, breaks that ordering, without even mentioning that.
Given that the answer hasn't been answered yet, perhaps nobody remembers
reason any more. But why does it bother you so much? In my experience the inspect module wasn't written for performance but for functionality. If it really hurts your ability to do something that you need to be fast,
On Thu, Dec 18, 2014, 04:46 Antony Lee
you can supply a patch? Make sure to run the tests and update the docs as needed.
Actually the second example in PEP362 uses bind as an argument
type-checker, which should arguably be as fast as possible. I'm opening an
issue with a patch right now.
As the author of PEP 362 I can tell you that was an example and not
something to optimize for. If you want to type check your code then you
should do it offline or only during debugging in which case performance is
not an issue.
-Brett
As far as I'm aware, it's an ordered dictionary because that makes the
default repr() predictable when binding arguments for a given function
(in the absence of after-the-fact manipulation like the example in the
docs that injects the default values as explicitly bound arguments).
That can't really be the case as BoundArguments doesn't define a
__repr__... (and OrderedDict's __repr__ is a bit too verbose IMO, but
that's another issue and little can be done about it anyways). Now that I
delved into the details of BoundArgument's implementation, I'll also note
that __eq__ also checks the arguments order, and again additional binding
will mess that up:
s = signature(lambda x=None, y=None: None)
ba0 = s.bind()
ba1 = s.bind(x=None)
ba2 = s.bind(y=None)
On Thu, Dec 18, 2014, at 02:05, Nick Coghlan wrote:
As far as I'm aware, it's an ordered dictionary because that makes the default repr() predictable when binding arguments for a given function (in the absence of after-the-fact manipulation like the example in the docs that injects the default values as explicitly bound arguments).
The inspect.signature() machinery includes quite a few things like that where the serialisation as a human readable string is considered as important then the programmatic representation.
Would it be reasonable to make a lightweight "predictable dict" class that makes a weaker guarantee, e.g. that the enumeration order will match the insertion order in the case where it is filled from empty with no intervening deletions and not guaranteed in any other cases?
Would it be reasonable to make a lightweight "predictable dict" class that makes a weaker guarantee, e.g. that the enumeration order will match the insertion order in the case where it is filled from empty with no intervening deletions and not guaranteed in any other cases?
Something like DefaultOrderedDict (like defaultdict)?
Would this make it easy/faster to also have a DefaultOrderedDict (which can/could also be accomplished with .get(attr, []) and .setdefault(attr, [])?
On Thu, Dec 18, 2014 at 10:09 AM,
On Thu, Dec 18, 2014, at 02:05, Nick Coghlan wrote:
As far as I'm aware, it's an ordered dictionary because that makes the default repr() predictable when binding arguments for a given function (in the absence of after-the-fact manipulation like the example in the docs that injects the default values as explicitly bound arguments).
The inspect.signature() machinery includes quite a few things like that where the serialisation as a human readable string is considered as important then the programmatic representation.
Would it be reasonable to make a lightweight "predictable dict" class that makes a weaker guarantee, e.g. that the enumeration order will match the insertion order in the case where it is filled from empty with no intervening deletions and not guaranteed in any other cases? _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
On Thu, Dec 18, 2014, at 11:18, Wes Turner wrote:
Would it be reasonable to make a lightweight "predictable dict" class that makes a weaker guarantee, e.g. that the enumeration order will match the insertion order in the case where it is filled from empty with no intervening deletions and not guaranteed in any other cases?
Something like DefaultOrderedDict (like defaultdict)?
From http://bugs.python.org/issue16991#msg232825 :
Would this make it easy/faster to also have a DefaultOrderedDict (which can/could also be accomplished with .get(attr, []) and .setdefault(attr, [])?
Really, I'm thinking more along the lines of a dict that only guarantees enumeration order (no integer indexed access) in the specific cases where people "need" it (ordered kwargs for e.g. the OrderedDict constructor would be another example), while being 'lightweight' enough (in terms of not having a lot of extra machinery dedicated to maintaining the order) to ultimately be used as the real dict implementation (and therefore usable for kwargs, class dictionaries, etc) Your comment's "DefaultOrderedDict" seems to be more about combining DefaultDict (default values) and OrderedDict (ordered all the time, indexed access) rather than anything like having a "default order".
On Thu, Dec 18, 2014 at 10:27 AM,
On Thu, Dec 18, 2014, at 11:18, Wes Turner wrote:
Would it be reasonable to make a lightweight "predictable dict" class that makes a weaker guarantee, e.g. that the enumeration order will match the insertion order in the case where it is filled from empty with no intervening deletions and not guaranteed in any other cases?
Something like DefaultOrderedDict (like defaultdict)?
From http://bugs.python.org/issue16991#msg232825 :
Would this make it easy/faster to also have a DefaultOrderedDict (which can/could also be accomplished with .get(attr, []) and .setdefault(attr, [])?
Really, I'm thinking more along the lines of a dict that only guarantees enumeration order (no integer indexed access) in the specific cases where people "need" it (ordered kwargs for e.g. the OrderedDict constructor would be another example), while being 'lightweight' enough (in terms of not having a lot of extra machinery dedicated to maintaining the order) to ultimately be used as the real dict implementation (and therefore usable for kwargs, class dictionaries, etc)
I understand. More like the function of DEFAULT_PREDICATE_ORDERING here: https://github.com/westurner/healthref/blob/gh-pages/healthref.py#L100 . That would indeed be useful. What would you call it?
On 19 December 2014 at 03:09,
On Thu, Dec 18, 2014, at 02:05, Nick Coghlan wrote:
As far as I'm aware, it's an ordered dictionary because that makes the default repr() predictable when binding arguments for a given function (in the absence of after-the-fact manipulation like the example in the docs that injects the default values as explicitly bound arguments).
The inspect.signature() machinery includes quite a few things like that where the serialisation as a human readable string is considered as important then the programmatic representation.
Would it be reasonable to make a lightweight "predictable dict" class that makes a weaker guarantee, e.g. that the enumeration order will match the insertion order in the case where it is filled from empty with no intervening deletions and not guaranteed in any other cases?
My understanding is that Raymond's alternative dict implementation works exactly like this, and is noted as an alternative for PEP 468: https://www.python.org/dev/peps/pep-0468/ . Discussion starts at: https://mail.python.org/pipermail/python-dev/2012-December/123028.html Tim Delaney
On Thu, Dec 18, 2014 at 9:18 PM, Tim Delaney
On 19 December 2014 at 03:09,
wrote: On Thu, Dec 18, 2014, at 02:05, Nick Coghlan wrote:
As far as I'm aware, it's an ordered dictionary because that makes the default repr() predictable when binding arguments for a given function (in the absence of after-the-fact manipulation like the example in the docs that injects the default values as explicitly bound arguments).
The inspect.signature() machinery includes quite a few things like that where the serialisation as a human readable string is considered as important then the programmatic representation.
Would it be reasonable to make a lightweight "predictable dict" class that makes a weaker guarantee, e.g. that the enumeration order will match the insertion order in the case where it is filled from empty with no intervening deletions and not guaranteed in any other cases?
I believe that if there's a new predictabledict type, and deleting from it would invalidate the order, then predictabledict should not support deletions. Perhaps it should be all-out immutable to prevent surprises. It would break backwards compatibility if that was used for **kwargs, though.
My understanding is that Raymond's alternative dict implementation works exactly like this, and is noted as an alternative for PEP 468: https://www.python.org/dev/peps/pep-0468/ .
Discussion starts at: https://mail.python.org/pipermail/python-dev/2012-December/123028.html
On Fri, Dec 19, 2014, at 04:07, Petr Viktorin wrote:
I believe that if there's a new predictabledict type, and deleting from it would invalidate the order, then predictabledict should not support deletions.
This would defeat the broader idea of using an implementation with such properties as the standard dict (which would allow its use for kwargs, class dictionaries, etc). What about adding a flag to indicate whether it is in an ordered state (i.e. either there have been no deletions since it was last empty, or there have been no insertions since the first deletion after it was last empty, or whatever other status applies to the implementation).
participants (11)
-
Alexander Belopolsky
-
Antoine Pitrou
-
Antony Lee
-
Brett Cannon
-
Eric Snow
-
Guido van Rossum
-
Nick Coghlan
-
Petr Viktorin
-
random832@fastmail.us
-
Tim Delaney
-
Wes Turner