Proto-PEP: Preserving the order of **kwargs in a function.
Here's the proper proposal I promised. Hopefully it reflects our discussions and the past ones too. My goal here is to focus in on what I consider to be the most viable approach, while capturing the gist of the various concerns and alternatives. There are still a few small gaps that I will work on as time permits. Feedback is welcome.

FYI, I also hope to have a basic implementation up in time for the language summit at PyCon (Wednesday) so that related discussions there might have something more concrete to support them. :)

-eric

=====================================================================

PEP: XXX
Title: Preserving the order of \*\*kwargs in a function.
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow <ericsnowcurrently@gmail.com>
Discussions-To: python-ideas@python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 5-Apr-2014
Python-Version: 3.5
Post-History:
Resolution:

Abstract
========

The \*\*kwargs syntax in a function definition indicates that the interpreter should collect all keyword arguments that do not correspond to other named parameters. However, Python does not preserve the order in which those collected keyword arguments were passed to the function. In some contexts the order matters. This PEP introduces a mechanism by which the passed order of collected keyword arguments will now be preserved.

Motivation
==========

Python's \*\*kwargs syntax in function definitions provides a powerful means of dynamically handling keyword arguments. In some applications of the syntax (see `Use Cases`_), the semantics applied to the collected keyword arguments requires that order be preserved. Unsurprisingly, this is similar to how OrderedDict is related to dict.

Currently, to preserve the order you have to do so manually and separately from the actual function call. This involves building an ordered mapping, whether an OrderedDict or an iterable of 2-tuples, which is then passed as a single argument to the function. [#arg_unpacking]_

With the capability described in this PEP, that boilerplate would no longer be required.

For comparison, currently::

    kwargs = OrderedDict()
    kwargs['eggs'] = ...
    ...
    def spam(a, kwargs):
        ...

and with this proposal::

    def spam(a, **kwargs):
        ...

Nick Coghlan, speaking of some of the use cases, summed it up well [#nick_obvious]_::

    These *can* all be done today, but *not* by using keyword arguments. In my view, the problem to be addressed is that keyword arguments *look* like they should work for these cases, because they have a definite order in the source code. The only reason they don't work is because the interpreter throws that ordering information away.

    It's a textbook case of a language feature becoming an attractive nuisance in some circumstances: the simple and obvious solution for the above use cases *doesn't actually work* for reasons that aren't obviously clear if you don't have a firm grasp of Python's admittedly complicated argument handling.

This observation is supported by the appearance of this proposal over the years and the numerous times that people have been confused by the constructor for OrderedDict. [#past_threads]_ [#loss_of_order]_ [#compact_dict]_

Use Cases
=========

As Nick noted, the current behavior of \*\*kwargs is unintuitive in cases where one would expect order to matter. Aside from more specific cases outlined below, in general "anything else where you want to control the iteration order *and* set field names and values in a single call will potentially benefit." [#nick_general]_ That matters in the case of factories (e.g. __init__()) for ordered types.

Serialization
-------------

Obviously OrderedDict would benefit (both __init__() and update()) from ordered kwargs.
However, the benefit also extends to serialization APIs [#nick_obvious]_::

    In the context of serialisation, one key lesson we have learned is that arbitrary ordering is a problem when you want to minimise spurious diffs, and sorting isn't a simple solution.

    Tools like doctest don't tolerate spurious diffs at all, but are often amenable to a sorting based answer.

    The cases where it would be highly desirable to be able use keyword arguments to control the order of display of a collection of key value pairs are ones like:

    * printing out key:value pairs in CLI output
    * mapping semantic names to column order in a CSV
    * serialising attributes and elements in particular orders in XML
    * serialising map keys in particular orders in human readable formats like JSON and YAML (particularly when they're going to be placed under source control)

Debugging
---------

In the words of Raymond Hettinger [#raymond_debug]_::

    It makes it easier to debug if the arguments show-up in the order they were created. AFAICT, no purpose is served by scrambling them.

Other Use Cases
---------------

* Mock objects. [#mock]_
* Controlling object presentation.
* Alternate namedtuple() where defaults can be specified.
* Specifying argument priority by order.

Concerns
========

Performance
-----------

As already noted, the idea of ordered keyword arguments has come up on a number of occasions. Each time it has been met with the same response, namely that preserving keyword arg order would have a sufficiently adverse effect on function call performance that it's not worth doing. However, Guido noted the following [#guido_open]_::

    Making **kwds ordered is still open, but requires careful design and implementation to avoid slowing down function calls that don't benefit.

As will be noted below, there are ways to work around this at the expense of increased complication. Ultimately the simplest approach is the one that makes the most sense: pack collected keyword arguments into an OrderedDict.
However, without a C implementation of OrderedDict there isn't much to discuss. That should change in Python 3.5. [#c_ordereddict]_

In some cases the difference in performance between dict and OrderedDict *may* be noticeable, for instance when the collected kwargs has an extended lifetime outside the originating function or when the number of collected kwargs is massive. However, the difference in performance (both CPU and memory) in those cases should not be significant. Furthermore, the performance of the C OrderedDict implementation is essentially identical to dict for the non-mutating API. A concrete representation of the difference in performance will be a part of this proposal before its resolution.

Other Python Implementations
----------------------------

Another important issue to consider is that new features must be cognizant of the multiple Python implementations. At some point each of them would be expected to have implemented ordered kwargs. In this regard there doesn't seem to be an issue with the idea. [#ironpython]_ Each of the major Python implementations will be consulted regarding this proposal before its resolution.

Specification
=============

Starting in version 3.5 Python will preserve the order of keyword arguments as passed to a function. This will apply only to functions for which the definition uses the \*\*kwargs syntax for collecting otherwise unspecified keyword arguments. Only the order of those keyword arguments will be preserved.

Relationship to **-unpacking syntax
-----------------------------------

The ** unpacking syntax in function calls has no special connection with this proposal. Keyword arguments provided by unpacking will be treated in exactly the same way as they are now: ones that match defined parameters are gathered there and the remainder will be collected into the ordered kwargs (just like any other unmatched keyword argument).
Note that unpacking a mapping with undefined order, such as dict, will preserve its iteration order like normal. It's just that the order will remain undefined. The OrderedDict into which the unpacked key-value pairs will then be packed will not be able to provide any alternate ordering. This should not be surprising.

There have been brief discussions of simply passing these mappings through to the function's kwargs without unpacking and repacking them, but that is both outside the scope of this proposal and probably a bad idea regardless. (There is a reason those discussions were brief.)

Relationship to inspect.Signature
---------------------------------

Signature objects should need no changes. The `kwargs` parameter of inspect.BoundArguments (returned by Signature.bind() and Signature.bind_partial()) will change from a dict to an OrderedDict.

C-API
-----

TBD

Syntax
------

No syntax is added or changed by this proposal.

Backward-Compatibility
----------------------

The following will change:

* type(kwargs)
* iteration order of kwargs will now be consistent (except of course in the case described above)
* as already noted, performance will be marginally different

None of these should be an issue. However, each will be carefully considered while this proposal is under discussion.

Reference Implementation
========================

TBD

Implementation Notes
--------------------

TBD

Alternate Approaches
====================

Opt-out Decorator
-----------------

This is identical to the current proposal with the exception that Python would also provide a decorator in functools that would cause collected keyword arguments to be packed into a normal dict instead of an OrderedDict.

Prognosis:

This would only be necessary if performance is determined to be significantly different in some uncommon cases or that there are other backward-compatibility concerns that cannot be resolved otherwise.

Opt-in Decorator
----------------

The status quo would be unchanged.
Instead Python would provide a decorator in functools that would register or mark the decorated function as one that should get ordered keyword arguments. The performance overhead to check the function at call time would be marginal.

Prognosis:

The only real down-side is in the case of function wrapper factories (e.g. functools.partial and many decorators) that aim to perfectly preserve keyword arguments by using kwargs in the wrapper definition and kwargs unpacking in the call to the wrapped function. Each wrapper would have to be updated separately, though having functools.wraps() do this automatically would help.

__kworder__
-----------

The order of keyword arguments would be stored separately in a list at call time. The list would be bound to __kworder__ in the function locals.

Prognosis:

This likewise complicates the wrapper case.

Compact dict with faster iteration
----------------------------------

Raymond Hettinger has introduced the idea of a dict implementation that would result in preserving insertion order on dicts (until the first deletion). This would be a perfect fit for kwargs. [#compact_dict]_

Prognosis:

The idea is still uncertain in both viability and timeframe.

***kwargs
---------

This would add a new form to a function's signature as a mutually exclusive parallel to \*\*kwargs. The new syntax, ***kwargs (note that there are three asterisks), would indicate that kwargs should preserve the order of keyword arguments.

Prognosis:

New syntax is only added to Python under the most *dire* circumstances. With other available solutions, new syntax is not justifiable. Furthermore, like all opt-in solutions, the new syntax would complicate the pass-through case.

annotations
-----------

This is a variation on the decorator approach. Instead of using a decorator to mark the function, you would use a function annotation on \*\*kwargs.
Prognosis:

In addition to the pass-through complication, annotations have been actively discouraged in Python core development. Use of annotations to opt-in to order preservation runs the risk of interfering with other application-level use of annotations.

dict.__order__
--------------

dict objects would have a new attribute, `__order__`, that would default to None and that in the kwargs case the interpreter would use in the same way as described above for __kworder__.

Prognosis:

It would mean zero impact on kwargs performance but the change would be pretty intrusive (Python uses dict a lot). Also, for the wrapper case the interpreter would have to be careful to preserve `__order__`.

KWArgsDict.__order__
--------------------

This is the same as the `dict.__order__` idea, but kwargs would be an instance of a new minimal dict subclass that provides the `__order__` attribute. dict would instead be unchanged.

Prognosis:

Simply switching to OrderedDict is a less complicated and more intuitive change.

Acknowledgements
================

Thanks to Andrew Barnert for helpful feedback and to the participants of all the past email threads.

Footnotes
=========

.. [#arg_unpacking]
   Alternately, you could also replace ** in your function definition with * and then pass in key/value 2-tuples. This has the advantage of not requiring the keys to be valid identifier strings. See https://mail.python.org/pipermail/python-ideas/2014-April/027491.html.

References
==========

.. [#nick_obvious]
   https://mail.python.org/pipermail/python-ideas/2014-April/027512.html

.. [#past_threads]
   https://mail.python.org/pipermail/python-ideas/2009-April/004163.html
   https://mail.python.org/pipermail/python-ideas/2010-October/008445.html
   https://mail.python.org/pipermail/python-ideas/2011-January/009037.html
   https://mail.python.org/pipermail/python-ideas/2013-February/019690.html
   https://mail.python.org/pipermail/python-ideas/2013-May/020727.html
   https://mail.python.org/pipermail/python-ideas/2014-March/027225.html
   http://bugs.python.org/issue16276
   http://bugs.python.org/issue16553
   http://bugs.python.org/issue19026
   http://bugs.python.org/issue5397#msg82972

.. [#loss_of_order]
   https://mail.python.org/pipermail/python-dev/2007-February/071310.html

.. [#compact_dict]
   https://mail.python.org/pipermail/python-dev/2012-December/123028.html
   https://mail.python.org/pipermail/python-dev/2012-December/123105.html
   https://mail.python.org/pipermail/python-dev/2013-May/126327.html
   https://mail.python.org/pipermail/python-dev/2013-May/126328.html

.. [#nick_general]
   https://mail.python.org/pipermail/python-dev/2012-December/123105.html

.. [#raymond_debug]
   https://mail.python.org/pipermail/python-dev/2013-May/126327.html

.. [#mock]
   https://mail.python.org/pipermail/python-ideas/2009-April/004163.html
   https://mail.python.org/pipermail/python-ideas/2009-April/004165.html
   https://mail.python.org/pipermail/python-ideas/2009-April/004175.html

.. [#guido_open]
   https://mail.python.org/pipermail/python-dev/2013-May/126404.html

.. [#c_ordereddict]
   http://bugs.python.org/issue16991

.. [#ironpython]
   https://mail.python.org/pipermail/python-dev/2012-December/123100.html

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:
I like it, but a couple comments: From: Eric Snow <ericsnowcurrently@gmail.com> Sent: Saturday, April 5, 2014 10:43 PM
Starting in version 3.5 Python will preserve the order of keyword arguments as passed to a function. This will apply only to functions for which the definition uses the \*\*kwargs syntax for collecting otherwise unspecified keyword arguments. Only the order of those keyword arguments will be preserved.
Will this be an OrderedDict, or just some mapping that (unless later modified?) iterates in the order of the keywords as passed? The latter might be nice, because it allows implementations to do things like use Raymond Hettinger's special dict, or a wrapper around a native mapping, or something else that isn't appropriate for an OrderedDict implementation but is good enough for this purpose. But I'm not sure how you'd word that in the language reference; the two relevant sentences are already pretty long and complex today (in sections 6.3.4 and 8.6):
If any keyword argument does not correspond to a formal parameter name, a TypeError exception is raised, unless a formal parameter using the syntax **identifier is present; in this case, that formal parameter receives a dictionary containing the excess keyword arguments (using the keywords as keys and the argument values as corresponding values), or a (new) empty dictionary if there were no excess keyword arguments.
If the form “**identifier” is present, it is initialized to a new dictionary receiving any excess keyword arguments, defaulting to a new empty dictionary. Parameters after “*” or “*identifier” are keyword-only parameters and may only be passed used keyword arguments.
Also:
Relationship to **-unpacking syntax
-----------------------------------
The ** unpacking syntax in function calls has no special connection with this proposal. Keyword arguments provided by unpacking will be treated in exactly the same way as they are now: ones that match defined parameters are gathered there and the remainder will be collected into the ordered kwargs (just like any other unmatched keyword argument).
I think you want to explicitly specify that they will be processed after any normal keyword arguments, and in the mapping's iteration order. (Otherwise, partial, perfect-forwarding wrappers, etc. would have obvious problems.) That isn't specified by Python 3.4; the language reference (6.3.4 again, a few paragraphs down) says:
If the syntax **expression appears in the function call, expression must evaluate to a mapping, the contents of which are treated as additional keyword arguments.
CPython 3.4, if you just changed it to use an OrderedDict instead of a dict, would put the expression's contents _before_ the normal keywords. And there are other reasonable ways to implement things that would be perfectly valid under the current language definition that might end up with the mapping's contents in reverse order, or even arbitrary order. If you want to force those implementations to change (which you obviously do), I think the language reference should reflect that. Finally:
Opt-out Decorator
-----------------
This is identical to the current proposal with the exception that Python would also provide a decorator in functools that would cause collected keyword arguments to be packed into a normal dict instead of an OrderedDict.
I think you may need a bit more of an argument against this, because there are really two issues here. First, there's the everyday case: every function that takes **kwargs will get a little slower. The optimized C OrderedDict will hopefully make this not significant enough to worry about; if not, the opt-out decorator may be necessary. Second, there's Guido's case: functions that keep kwargs around for later will be potentially confusing, conceptually wrong, and possibly lead to significantly less speed- or memory-efficient code, especially if they later add a whole bunch of other stuff to the stored kwargs dict. The C OrderedDict isn't going to help here. The opt-out decorator would, but just storing dict(kwargs) instead of kwargs solves it more explicitly, just as simply, and with no more changes to existing code. So, the only reason the opt-out decorator would be necessary for this case is if the cost of that dict(kwargs) is too high. Which seems unlikely for most realistic cases—the performance issue is about adding thousands of items later to an initially-small kwargs dict, not about receiving thousands of keywords, right? On the other hand, I understand Nick's point about not trying to answer arguments that people may never make in your PEP, so maybe it's better to leave this part as-is.
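A minimal sketch of the `dict(kwargs)` alternative described above, for a function that keeps its keyword arguments around and grows them later. The `Registry` class, its `add_many` method, and `demo` are hypothetical names used only for illustration, and the OrderedDict passed through `**` merely stands in for whatever ordered mapping the interpreter would use under the proposal.

```python
from collections import OrderedDict


class Registry:
    """Hypothetical class that keeps its keyword arguments for later use."""

    def __init__(self, **kwargs):
        # Copy once into a plain dict so that later bulk insertions do not
        # depend on the mapping type the interpreter used to collect kwargs.
        self.options = dict(kwargs)

    def add_many(self, pairs):
        # Later growth happens on the plain dict, not on the original kwargs.
        self.options.update(pairs)


def demo():
    # Pass a small ordered mapping in, then add many entries afterwards.
    reg = Registry(**OrderedDict([("b", 2), ("a", 1)]))
    reg.add_many((str(i), i) for i in range(1000))
    return type(reg.options), len(reg.options)
```

The copy costs one pass over the received keywords, which is the cost Andrew argues is unlikely to matter when the kwargs dict starts small and only grows later.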
On 6 April 2014 15:43, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Here's the proper proposal I promised. Hopefully it reflects our discussions and the past ones too. My goal here is to focus in on what I consider to be the most viable approach, while capturing the gist of the various concerns and alternatives. There are still a few small gaps that I will work on as time permits. Feedback is welcome.
Nice! One specific comment below.
Relationship to **-unpacking syntax
-----------------------------------
The ** unpacking syntax in function calls has no special connection with this proposal. Keyword arguments provided by unpacking will be treated in exactly the same way as they are now: ones that match defined parameters are gathered there and the remainder will be collected into the ordered kwargs (just like any other unmatched keyword argument).
There *is* a connection here: this needs to play nice with the ordered kwargs in order to handle the pass-through case correctly. The guarantee that the order of the supplied mapping (minus any entries that map to named parameters) will be preserved needs to be made explicit. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
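The pass-through case Nick describes can be sketched as follows. The names `collect` and `passthrough` are hypothetical and exist only for this illustration. The comparison is only meaningful on an interpreter in which \*\*kwargs preserves the passed order, which is the behavior this proposal asks for.

```python
import functools


def collect(a, **kwargs):
    """Return the names of the collected keyword arguments, in order."""
    return list(kwargs)


def passthrough(func):
    """Hypothetical perfect-forwarding wrapper of the kind discussed here."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # The wrapper collects kwargs and immediately unpacks them again.
        return func(*args, **kwargs)
    return wrapper


wrapped = passthrough(collect)

# 'a' matches a named parameter, so it is not part of the collected kwargs;
# under the proposal the remaining names keep the order in which they
# were passed, both in the direct call and through the wrapper.
direct = collect(z=1, a=0, y=2, x=3)
forwarded = wrapped(z=1, a=0, y=2, x=3)
```

If the guarantee Nick asks for holds, `direct` and `forwarded` are the same list, with `a` removed and the other names in call order.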
On Apr 6, 2014 4:39 AM, "Nick Coghlan" <ncoghlan@gmail.com> wrote:
There *is* a connection here: this needs to play nice with the ordered kwargs in order to handle the pass-through case correctly.
The guarantee that the order of the supplied mapping (minus any entries that map to named parameters) will be preserved needs to be made explicit.
You're right. It's already implied by OrderedDict. However the point *should* be explicit. -eric
On Sun, Apr 6, 2014 at 9:34 AM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
On Apr 6, 2014 4:39 AM, "Nick Coghlan" <ncoghlan@gmail.com> wrote:
There *is* a connection here: this needs to play nice with the ordered kwargs in order to handle the pass-through case correctly.
The guarantee that the order of the supplied mapping (minus any entries that map to named parameters) will be preserved needs to be made explicit.
You're right. It's already implied by OrderedDict. However the point *should* be explicit.
Somehow the specification section in the PEP doesn't explicitly say it, but I presume that the type of kwargs will change from dict to OrderedDict? That seems simple enough, but the proof is in the pudding. How simple was it to implement? How did the speed compare? Until we have those answers we shouldn't accept this PEP. -- --Guido van Rossum (python.org/~guido)
On Apr 6, 2014 11:37 AM, "Guido van Rossum" <guido@python.org> wrote:
Somehow the specification section in the PEP doesn't explicitly say it, but I presume that the type of kwargs will change from dict to OrderedDict? That seems simple enough, but the proof is in the pudding. How simple was it to implement? How did the speed compare? Until we have those answers we shouldn't accept this PEP.
It's just a minor detail. <wink> You are correct, of course. I'll fix the PEP. I'll also post a performance comparison here once I have something in the next few days. -eric
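One rough way to get a first impression of the cost being asked about is a `timeit` comparison like the sketch below. It is not the benchmark from this thread: `collect_dict`, `collect_ordered`, and `compare` are hypothetical names, and repacking into an OrderedDict inside the function is only a stand-in that overstates the cost, since a real implementation would build the ordered mapping directly.

```python
import timeit
from collections import OrderedDict


def collect_dict(**kwargs):
    # Use the collected kwargs exactly as the interpreter delivers them.
    return kwargs


def collect_ordered(**kwargs):
    # Stand-in for the proposal: repack the kwargs into an OrderedDict.
    return OrderedDict(kwargs)


def compare(number=100_000):
    """Return (dict_seconds, ordered_seconds) for `number` calls each."""
    t_dict = timeit.timeit(
        lambda: collect_dict(a=1, b=2, c=3), number=number)
    t_ordered = timeit.timeit(
        lambda: collect_ordered(a=1, b=2, c=3), number=number)
    return t_dict, t_ordered
```

The absolute numbers depend on the machine and interpreter build, so only the ratio between the two timings is worth reading.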
Also, I recommend you go ahead and commit it to the peps repo. Pick the next available number (looks like 468). Be sure that "make pep-0468.html pep-0000.html" doesn't emit any errors. On Sun, Apr 6, 2014 at 12:06 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
On Apr 6, 2014 11:37 AM, "Guido van Rossum" <guido@python.org> wrote:
Somehow the specification section in the PEP doesn't explicitly say it, but I presume that the type of kwargs will change from dict to OrderedDict? That seems simple enough, but the proof is in the pudding. How simple was it to implement? How did the speed compare? Until we have those answers we shouldn't accept this PEP.
It's just a minor detail. <wink> You are correct, of course. I'll fix the PEP. I'll also post a performance comparison here once I have something in the next few days.
-eric
-- --Guido van Rossum (python.org/~guido)
On Sun, Apr 6, 2014 at 5:14 PM, Guido van Rossum <guido@python.org> wrote:
Also, I recommend you go ahead and commit it to the peps repo. Pick the next available number (looks like 468). Be sure that "make pep-0468.html pep-0000.html" doesn't emit any errors.
Updated and committed. PEP 468 it is. Thanks. -eric
Nice PEP; I'm sorry we didn't get a chance to discuss it at the language summit. On Apr 05, 2014, at 11:43 PM, Eric Snow wrote:
Alternate Approaches
====================
Opt-out Decorator
-----------------
This is identical to the current proposal with the exception that Python would also provide a decorator in functools that would cause collected keyword arguments to be packed into a normal dict instead of an OrderedDict.
Prognosis:
This would only be necessary if performance is determined to be significantly different in some uncommon cases or that there are other backward-compatibility concerns that cannot be resolved otherwise.
Opt-in Decorator
----------------
The status quo would be unchanged. Instead Python would provide a decorator in functools that would register or mark the decorated function as one that should get ordered keyword arguments. The performance overhead to check the function at call time would be marginal.
Prognosis:
The only real down-side is in the case of function wrapper factories (e.g. functools.partial and many decorators) that aim to perfectly preserve keyword arguments by using kwargs in the wrapper definition and kwargs unpacking in the call to the wrapped function. Each wrapper would have to be updated separately, though having functools.wraps() do this automatically would help.
Since this is touching a fundamental and common aspect of the language, I think a lot of analysis has to be done for both performance and semantics if the feature is opt-out. People do crazy things and even small performance hits add up. The use cases are valid but rare I submit, so I think most Python code doesn't care. Thus I'd lean toward an opt-in approach. Perhaps a transition period makes sense. Add both opt-in and opt-out decorators, implement opt-in by default for Python 3.5 with a -X option to switch it to opt-out. Cheers, -Barry
On Thu, Apr 10, 2014 at 9:05 AM, Barry Warsaw <barry@python.org> wrote:
Nice PEP; I'm sorry we didn't get a chance to discuss it at the language summit.
Yeah, I didn't even bother mentioning it because I didn't have any performance numbers yet. I hope to have some today, however.
Since this is touching a fundamental and common aspect of the language, I think a lot of analysis has to be done for both performance and semantics if the feature is opt-out.
The proposal currently doesn't even involve opting anything. :) Opt-out is my fallback proposal in case performance changes significantly when "people do crazy things".
People do crazy things and even small performance hits add up. The use cases are valid but rare I submit, so I think most Python code doesn't care. Thus I'd lean toward an opt-in approach.
The problem is that opt-in causes all sorts of complication due to some wrapper factories like functools.partial and many decorators. They take **kwargs, and then unpack them in the wrapped call, so the wrapper definition would have to know how to opt-in when the wrappee has opted in. Andrew Barnert and I discussed this at length in the previous python-ideas thread. My conclusion was that we should avoid opt-in if at all possible.
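The wrapper complication can be illustrated with a purely hypothetical opt-in marker. Nothing below is an API from the PEP: `ordered_kwargs` and the `wants_ordered_kwargs` attribute are invented for this sketch, and a real opt-in would need interpreter support to act on the marker. The sketch only shows which callables would appear opted in to such a check.

```python
import functools


def ordered_kwargs(func):
    """Hypothetical opt-in marker; it only tags the function."""
    func.wants_ordered_kwargs = True
    return func


@ordered_kwargs
def target(**kwargs):
    return list(kwargs)


def plain_wrapper(*args, **kwargs):
    # A wrapper written without any knowledge of the opt-in marker.
    return target(*args, **kwargs)


@functools.wraps(target)
def wraps_wrapper(*args, **kwargs):
    # functools.wraps copies the wrapped function's __dict__ entries,
    # so an attribute-based marker is carried over to this wrapper.
    return target(*args, **kwargs)


def opted_in(func):
    return getattr(func, "wants_ordered_kwargs", False)


status = {
    "target": opted_in(target),
    "plain_wrapper": opted_in(plain_wrapper),
    "wraps_wrapper": opted_in(wraps_wrapper),
    "partial": opted_in(functools.partial(target)),
}
```

With this attribute-based marker, the plain wrapper and the `functools.partial` object do not look opted in even though the function they forward to is, so their own \*\*kwargs would be collected unordered before the wrappee ever sees them.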
Perhaps a transition period makes sense. Add both opt-in and opt-out decorators, implement opt-in by default for Python 3.5 with a -X option to switch it to opt-out.
If we could work out the opt-in complications that could work. However, I just don't think opt-in will be worth the trouble. That said, right now we're mostly just guessing about the performance impact. :) I hope to have a better answer to that question soon. -eric
Sorry to be a spoil-sport, but what next? Preserve order of globals? Preserve order of locals? Preserve order of class members? methinks the whole point of keyword arguments is that they are named. order is a much weaker concept. d; my 2c. On 6 April 2014 07:43, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Here's the proper proposal I promised. Hopefully it reflects our discussions and the past ones too. My goal here is to focus in on what I consider to be the most viable approach, which capturing the gist of the various concerns and alternatives. There are still a few small gaps that I will work on as time permits. Feedback is welcome.
FYI, I also hope to have a basic implementation up in time for the language summit at PyCon (Wednesday) so that related discussions there might have something more concrete to support them. :)
-eric
=====================================================================
PEP: XXX Title: Preserving the order of \*\*kwargs in a function. Version: $Revision$ Last-Modified: $Date$ Author: Eric Snow <ericsnowcurrently@gmail.com> Discussions-To: python-ideas@python.org Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 5-Apr-2014 Python-Version: 3.5 Post-History: Resolution:
Abstract ========
The \*\*kwargs syntax in a function definition indicates that the interpreter should collect all keyword arguments that do not correspond to other named parameters. However, Python does not preserved the order in which those collected keyword arguments were passed to the function. In some contexts the order matters. This PEP introduces a mechanism by which the passed order of collected keyword arguments will now be preserved.
Motivation ==========
Python's \*\*kwargs syntax in function definitions provides a powerful means of dynamically handling keyword arguments. In some applications of the syntax (see _`Use Cases`), the semantics applied to the collected keyword arguments requires that order be preserved. Unsurprisingly, this is similar to how OrderedDict is related to dict.
Currently to preserved the order you have to do so manually and separately from the actual function call. This involves building an ordered mapping, whether an OrderedDict or an iterable of 2-tuples, which is then passed as a single argument to the function. [#arg_unpacking]_
With the capability described in this PEP, that boilerplate would no longer be required.
For comparision, currently::
kwargs = OrderedDict() kwargs['eggs'] = ... ... def spam(a, kwargs): ...
and with this proposal::
def spam(a, **kwargs): ...
Nick Coglan, speaking of some of the uses cases, summed it up well [#nick_obvious]_::
These *can* all be done today, but *not* by using keyword arguments. In my view, the problem to be addressed is that keyword arguments *look* like they should work for these cases, because they have a definite order in the source code. The only reason they don't work is because the interpreter throws that ordering information away.
It's a textbook case of a language feature becoming an attractive nuisance in some circumstances: the simple and obvious solution for the above use cases *doesn't actually work* for reasons that aren't obviously clear if you don't have a firm grasp of Python's admittedly complicated argument handling.
This observation is supported by the appearance of this proposal over the years and the numerous times that people have been confused by the constructor for OrderedDict. [#past_threads]_ [#loss_of_order]_ [#compact_dict]_
Use Cases =========
As Nick noted, the current behavior of \*\*kwargs is unintuitive in cases where one would expect order to matter. Aside from more specific cases outlined below, in general "anything else where you want to control the iteration order *and* set field names and values in a single call will potentially benefit." [#nick_general]_ That matters in the case of factories (e.g. __init__()) for ordered types.
Serialization -------------
Obviously OrderedDict would benefit (both __init__() and update()) from ordered kwargs. However, the benefit also extends to serialization APIs [#nick_obvious]_::
In the context of serialisation, one key lesson we have learned is that arbitrary ordering is a problem when you want to minimise spurious diffs, and sorting isn't a simple solution.
Tools like doctest don't tolerate spurious diffs at all, but are often amenable to a sorting based answer.
The cases where it would be highly desirable to be able use keyword arguments to control the order of display of a collection of key value pairs are ones like:
* printing out key:value pairs in CLI output * mapping semantic names to column order in a CSV * serialising attributes and elements in particular orders in XML * serialising map keys in particular orders in human readable formats like JSON and YAML (particularly when they're going to be placed under source control)
Debugging ---------
In the words of Raymond Hettinger [#raymond_debug]_::
It makes it easier to debug if the arguments show-up in the order they were created. AFAICT, no purpose is served by scrambling them.
Other Use Cases ---------------
* Mock objects. [#mock]_
* Controlling object presentation.
* Alternate namedtuple() where defaults can be specified.
* Specifying argument priority by order.
Concerns ========
Performance -----------
As already noted, the idea of ordered keyword arguments has come up on a number of occasions. Each time it has been met with the same response, namely that preserving keyword arg order would have a sufficiently adverse effect on function call performance that it's not worth doing. However, Guido noted the following [#guido_open]_::
Making **kwds ordered is still open, but requires careful design and implementation to avoid slowing down function calls that don't benefit.
As will be noted below, there are ways to work around this at the expense of increased complication. Ultimately the simplest approach is the one that makes the most sense: pack collected keyword arguments into an OrderedDict. However, without a C implementation of OrderedDict there isn't much to discuss. That should change in Python 3.5. [#c_ordereddict]_
In some cases the difference in performance between dict and OrderedDict *may* be noticeable, for instance when the collected kwargs has an extended lifetime outside the originating function or when the number of collected kwargs is massive. Even in those cases, however, the difference in performance (both CPU and memory) is not expected to be significant. Furthermore, the performance of the C OrderedDict implementation is essentially identical to dict for the non-mutating API. A concrete representation of the difference in performance will be a part of this proposal before its resolution.
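Until those numbers exist, a rough first impression can be had with a sketch like the following. It only times building a mapping from 100 pairs, not a full function call, so it is not the measurement promised above. The names `build` and `measure` are illustrative, and results will vary by machine and interpreter.

```python
import timeit
from collections import OrderedDict

# 100 key/value pairs standing in for a large set of collected kwargs.
pairs = [("key%d" % i, i) for i in range(100)]


def build(mapping_type):
    """Build a mapping of the given type from the same pairs."""
    return mapping_type(pairs)


def measure(mapping_type, number=2000):
    """Return total seconds spent building the mapping `number` times."""
    return timeit.timeit(lambda: build(mapping_type), number=number)


dict_seconds = measure(dict)
ordered_seconds = measure(OrderedDict)
ratio = ordered_seconds / dict_seconds  # > 1 means OrderedDict was slower
```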
Other Python Implementations ----------------------------
Another important issue to consider is that new features must be cognizant of the multiple Python implementations. At some point each of them would be expected to have implemented ordered kwargs. In this regard there doesn't seem to be an issue with the idea. [#ironpython]_ Each of the major Python implementations will be consulted regarding this proposal before its resolution.
Specification =============
Starting in version 3.5 Python will preserve the order of keyword arguments as passed to a function. This will apply only to functions for which the definition uses the \*\*kwargs syntax for collecting otherwise unspecified keyword arguments. Only the order of those keyword arguments will be preserved.
Relationship to **-unpacking syntax -----------------------------------
The ** unpacking syntax in function calls has no special connection with this proposal. Keyword arguments provided by unpacking will be treated in exactly the same way as they are now: ones that match defined parameters are gathered there and the remainder will be collected into the ordered kwargs (just like any other unmatched keyword argument).
Note that unpacking a mapping with undefined order, such as dict, will preserve its iteration order like normal. It's just that the order will remain undefined. The OrderedDict into which the unpacked key-value pairs will then be packed will not be able to provide any alternate ordering. This should not be surprising.
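A small sketch of that behavior, assuming an interpreter that implements this proposal. The function `connect` and its parameters are made up for the example.

```python
from collections import OrderedDict


def connect(host, port=80, **options):
    # host and port are matched to named parameters; every other
    # keyword is collected into options.
    return host, port, list(options)


# An ordered mapping, so the unpacked keys arrive in a defined order.
settings = OrderedDict([("timeout", 5), ("port", 8080), ("retries", 3)])
result = connect("example.org", **settings)
```

Here `port` is consumed by the named parameter, and the remaining keys reach `options` in the order the unpacked mapping yielded them.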
There have been brief discussions of simply passing these mappings through to the function's kwargs without unpacking and repacking them, but that is both outside the scope of this proposal and probably a bad idea regardless. (There is a reason those discussions were brief.)
Relationship to inspect.Signature ---------------------------------
Signature objects should need no changes. The `kwargs` attribute of inspect.BoundArguments (returned by Signature.bind() and Signature.bind_partial()) will change from a dict to an OrderedDict.
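For illustration, this is how that attribute is reached today. The sketch only looks at key order and does not depend on the exact mapping type. It assumes an interpreter that preserves keyword order, and `handler` is a made-up function.

```python
import inspect


def handler(request, **params):
    return request, params


signature = inspect.signature(handler)
bound = signature.bind("GET", page=2, limit=10, sort="name")

# bound.args holds the positional part, bound.kwargs the keyword part.
names = list(bound.kwargs)
```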
C-API -----
TBD
Syntax ------
No syntax is added or changed by this proposal.
Backward-Compatibility ----------------------
The following will change:
* type(kwargs)
* iteration order of kwargs will now be consistent (except of course in the case described above)
* as already noted, performance will be marginally different
None of these should be an issue. However, each will be carefully considered while this proposal is under discussion.
Reference Implementation ========================
TBD
Implementation Notes --------------------
TBD
Alternate Approaches ====================
Opt-out Decorator -----------------
This is identical to the current proposal with the exception that Python would also provide a decorator in functools that would cause collected keyword arguments to be packed into a normal dict instead of an OrderedDict.
Prognosis:
This would only be necessary if performance is determined to be significantly different in some uncommon cases or if there are other backward-compatibility concerns that cannot be resolved otherwise.
Opt-in Decorator ----------------
The status quo would be unchanged. Instead Python would provide a decorator in functools that would register or mark the decorated function as one that should get ordered keyword arguments. The performance overhead to check the function at call time would be marginal.
Prognosis:
The only real down-side is in the case of function wrapper factories (e.g. functools.partial and many decorators) that aim to perfectly preserve keyword arguments by using kwargs in the wrapper definition and kwargs unpacking in the call to the wrapped function. Each wrapper would have to be updated separately, though having functools.wraps() do this automatically would help.
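The wrapper problem can be sketched as follows. This assumes, purely for illustration, that the opt-in decorator marks a function by setting an attribute on it. The decorator name `ordered_kwargs` and the attribute `__ordered_kwargs__` are hypothetical, since the proposal does not specify how marking would work.

```python
import functools


def ordered_kwargs(func):
    # Hypothetical marker: the proposal names neither the decorator
    # nor the mechanism the interpreter would check at call time.
    func.__ordered_kwargs__ = True
    return func


def plain_wrapper(func):
    # A hand-written wrapper that passes kwargs through unchanged.
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def wraps_wrapper(func):
    # The same wrapper, but using functools.wraps(), which copies
    # the wrapped function's __dict__ onto the wrapper.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@ordered_kwargs
def report(**fields):
    return list(fields)


lost = getattr(plain_wrapper(report), "__ordered_kwargs__", False)
kept = getattr(wraps_wrapper(report), "__ordered_kwargs__", False)
```

Under this attribute-based assumption the plain wrapper loses the marker while the functools.wraps() version keeps it, which is why help from functools.wraps() would reduce the burden on wrapper authors.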
__kworder__ -----------
The order of keyword arguments would be stored separately in a list at call time. The list would be bound to __kworder__ in the function locals.
Prognosis:
This likewise complicates the wrapper case.
Compact dict with faster iteration ----------------------------------
Raymond Hettinger has introduced the idea of a dict implementation that would result in preserving insertion order on dicts (until the first deletion). This would be a perfect fit for kwargs. [#compact_dict]_
Prognosis:
The idea is still uncertain in both viability and timeframe.
***kwargs ---------
This would add a new form to a function's signature as a mutually exclusive parallel to \*\*kwargs. The new syntax, ***kwargs (note that there are three asterisks), would indicate that kwargs should preserve the order of keyword arguments.
Prognosis:
New syntax is only added to Python under the most *dire* circumstances. With other available solutions, new syntax is not justifiable. Furthermore, like all opt-in solutions, the new syntax would complicate the pass-through case.
annotations -----------
This is a variation on the decorator approach. Instead of using a decorator to mark the function, you would use a function annotation on \*\*kwargs.
Prognosis:
In addition to the pass-through complication, annotations have been actively discouraged in Python core development. Use of annotations to opt in to order preservation runs the risk of interfering with other application-level use of annotations.
dict.__order__ --------------
dict objects would have a new attribute, `__order__`, that would default to None. In the kwargs case the interpreter would use it in the same way as described above for __kworder__.
Prognosis:
It would mean zero impact on kwargs performance but the change would be pretty intrusive (Python uses dict a lot). Also, for the wrapper case the interpreter would have to be careful to preserve `__order__`.
KWArgsDict.__order__ --------------------
This is the same as the `dict.__order__` idea, but kwargs would be an instance of a new minimal dict subclass that provides the `__order__` attribute. dict would instead be unchanged.
Prognosis:
Simply switching to OrderedDict is a less complicated and more intuitive change.
Acknowledgements ================
Thanks to Andrew Barnert for helpful feedback and to the participants of all the past email threads.
Footnotes =========
.. [#arg_unpacking]
Alternately, you could also replace ** in your function definition with * and then pass in key/value 2-tuples. This has the advantage of not requiring the keys to be valid identifier strings. See https://mail.python.org/pipermail/python-ideas/2014-April/027491.html.
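A minimal sketch of that alternative, which works without any change to the language. The function name `make_record` is hypothetical.

```python
from collections import OrderedDict


def make_record(*pairs):
    # Each positional argument is a (key, value) 2-tuple. Positional
    # arguments always keep their order, and keys need not be
    # valid identifiers.
    return OrderedDict(pairs)


record = make_record(("first name", "Ada"), ("last-name", "Lovelace"), ("id", 7))
keys = list(record)
```

The cost is at the call site: every argument needs the extra tuple syntax instead of `name=value`.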
References ==========
.. [#nick_obvious] https://mail.python.org/pipermail/python-ideas/2014-April/027512.html
.. [#past_threads]
   https://mail.python.org/pipermail/python-ideas/2009-April/004163.html
   https://mail.python.org/pipermail/python-ideas/2010-October/008445.html
   https://mail.python.org/pipermail/python-ideas/2011-January/009037.html
   https://mail.python.org/pipermail/python-ideas/2013-February/019690.html
   https://mail.python.org/pipermail/python-ideas/2013-May/020727.html
   https://mail.python.org/pipermail/python-ideas/2014-March/027225.html
   http://bugs.python.org/issue16276
   http://bugs.python.org/issue16553
   http://bugs.python.org/issue19026
   http://bugs.python.org/issue5397#msg82972
.. [#loss_of_order] https://mail.python.org/pipermail/python-dev/2007-February/071310.html
.. [#compact_dict]
   https://mail.python.org/pipermail/python-dev/2012-December/123028.html
   https://mail.python.org/pipermail/python-dev/2012-December/123105.html
   https://mail.python.org/pipermail/python-dev/2013-May/126327.html
   https://mail.python.org/pipermail/python-dev/2013-May/126328.html
.. [#nick_general] https://mail.python.org/pipermail/python-dev/2012-December/123105.html
.. [#raymond_debug] https://mail.python.org/pipermail/python-dev/2013-May/126327.html
.. [#mock]
   https://mail.python.org/pipermail/python-ideas/2009-April/004163.html
   https://mail.python.org/pipermail/python-ideas/2009-April/004165.html
   https://mail.python.org/pipermail/python-ideas/2009-April/004175.html
.. [#guido_open] https://mail.python.org/pipermail/python-dev/2013-May/126404.html
.. [#c_ordereddict] http://bugs.python.org/issue16991
.. [#ironpython] https://mail.python.org/pipermail/python-dev/2012-December/123100.html
Copyright =========
This document has been placed in the public domain.
..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
On Thu, Apr 10, 2014 at 6:57 PM, Dima Tisnek <dimaqq@gmail.com> wrote:
Sorry to be a spoil-sport, but what next?
Preserve order of globals? Preserve order of locals? Preserve order of class members?
But these are all currently possible... PEP 3115 explicitly mentions preserving order of class members as a motivation for __prepare__.

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
On Apr 10, 2014 1:57 PM, "Dima Tisnek" <dimaqq@gmail.com> wrote:
Sorry to be a spoil-sport, but what next?
Oh this isn't so bad :)
Preserve order of globals?
Already do-able.
Preserve order of locals?
Function execution stands alone in Python in how explicitly we restrict flexibility to customize (for the sake of performance). However, see the next point.
Preserve order of class members?
As already noted, you can do so now. Furthermore, you can use that mechanism to preserve the order of function "locals".
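For reference, a minimal sketch of that mechanism using the PEP 3115 `__prepare__` hook. The metaclass name `OrderedMeta` and the attribute `_member_order` are illustrative choices, not an established API.

```python
from collections import OrderedDict


class OrderedMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body executes with this mapping as its namespace.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        # Record definition order, skipping dunder entries such as
        # __module__ and __qualname__.
        cls._member_order = [
            key for key in namespace if not key.startswith("__")
        ]
        return cls


class Point(metaclass=OrderedMeta):
    x = 1
    y = 2
    label = "origin"


order = Point._member_order
```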
methinks the whole point of keyword arguments is that they are named.
order is a much weaker concept.
I agree it is weaker. I still think it is worth preserving, as explained in the PEP. -eric
Wouldn't another, much less invasive, much more explicit approach be to not squash subclasses of dict when passed into **args? IOW, the way to preserve order would be:
>>> def foo(**kws):
...     print(kws)
...
>>> from collections import OrderedDict as od
>>> foo(**od(a=1, b=2, c=3))
OrderedDict([('a', 1), ('c', 3), ('b', 2)])
>>> # instead of: {'a': 1, 'c': 3, 'b': 2}
? Okay, if you call it with keyword syntax, e.g.
>>> foo(a=3, b=2, c=1)
{'c': 1, 'a': 3, 'b': 2}
but oh well. -Barry
On 10 April 2014 23:04, Barry Warsaw <barry@python.org> wrote:
Wouldn't another, much less invasive, much more explicit approach be to not squash subclasses of dict when passed into **args?
Doesn't this fail to support the only significant use case - keyword argument syntax for the OrderedDict constructor...? Paul
On Thu, Apr 10, 2014 at 4:16 PM, Paul Moore <p.f.moore@gmail.com> wrote:
On 10 April 2014 23:04, Barry Warsaw <barry@python.org> wrote:
Wouldn't another, much less invasive, much more explicit approach be to not squash subclasses of dict when passed into **args?
Doesn't this fail to support the only significant use case - keyword argument syntax for the OrderedDict constructor...?
While perhaps the first one people think of, I wouldn't say it's the only significant one. The PEP outlines several that are worth supporting better. While these aren't every-day annoyances, neither is this a proposal for new syntax. The only real downside is the performance on functions that have **kwargs in their signature, and even that has not been proven or disproven as a concern (I'm having computer problems so I haven't been able to get any concrete data). The last time I checked, most C OrderedDict operations were about the same as dict and the worst one took 4x as long. So just to make this clear, this proposal will only impact performance for calling a small subset of functions and I anticipate that impact will be small. -eric
On Thu, Apr 10, 2014 at 4:04 PM, Barry Warsaw <barry@python.org> wrote:
Wouldn't another, much less invasive, much more explicit approach be to not squash subclasses of dict when passed into **args?
IOW, the way to preserve order would be:
>>> def foo(**kws):
...     print(kws)
...
>>> from collections import OrderedDict as od
>>> foo(**od(a=1, b=2, c=3))
OrderedDict([('a', 1), ('c', 3), ('b', 2)])
>>> # instead of: {'a': 1, 'c': 3, 'b': 2}
?
So preserve the type of the object passed in? Perhaps. However, using keyword args for OrderedDict loses order, so that would put us right back at the status quo anyway!
Okay, if you call it with keyword syntax, e.g.
>>> foo(a=3, b=2, c=1)
{'c': 1, 'a': 3, 'b': 2}
Which is the main use case (rather than that of handling just unpacked args--which incidentally is a more complicated proposal). -eric
From: Barry Warsaw <barry@python.org> Sent: Thursday, April 10, 2014 3:04 PM
Wouldn't another, much less invasive, much more explicit approach be to not squash subclasses of dict when passed into **args?
IOW, the way to preserve order would be:
>>> def foo(**kws):
...     print(kws)
...
>>> from collections import OrderedDict as od
>>> foo(**od(a=1, b=2, c=3))
OrderedDict([('a', 1), ('c', 3), ('b', 2)])
>>> # instead of: {'a': 1, 'c': 3, 'b': 2}
I think your own example shows why this doesn't work. You wanted to pass in a=1, b=2, c=3 by passing it through an OrderedDict… but you ended up passing a=1, c=3, b=2 instead. So you've successfully preserved _an_ order in kws, but not the one you wanted. (Also, a **kws parameter doesn't get a squashed copy of the **d argument in the first place; it gets a new dict, into which things get copied. And the things that get copied aren't the members of d unless both exist, and there are no members of d that match named parameters, and there are no keyword arguments that don't match named parameters. So, I don't think this suggestion is even a coherent solution, much less a successful one. But you could obviously fix that by saying that, e.g., you create a new type(d) instead of a new dict, and then add things to that normally.)
On Apr 10, 2014, at 08:20 PM, Andrew Barnert wrote:
I think your own example shows why this doesn't work.
Not really. It was just a silly example for conciseness. I don't care how you build up the OrderedDict, and there are several order-preserving ways to do that. I'm not sure it's a good idea to change *all of Python* just to make the OD constructor nicer. -Barry
On 11 Apr 2014 07:00, "Barry Warsaw" <barry@python.org> wrote:
On Apr 10, 2014, at 08:20 PM, Andrew Barnert wrote:
I think your own example shows why this doesn't work.
Not really. It was just a silly example for conciseness. I don't care how you build up the OrderedDict, and there are several order-preserving ways to do that. I'm not sure it's a good idea to change *all of Python* just to make the OD constructor nicer.
It's not "all of Python" - it's only functions that accept **kwargs. And when your code is speed critical, doing that's already a bad idea. Cheers, Nick.
-Barry
On Fri, Apr 11, 2014 at 7:49 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Not really. It was just a silly example for conciseness. I don't care how you build up the OrderedDict, and there are several order-preserving ways to do that. I'm not sure it's a good idea to change *all of Python* just to make the OD constructor nicer.
It's not "all of Python" - it's only functions that accept **kwargs. And when your code is speed critical, doing that's already a bad idea.
And it is not just about OrderedDict. There are many other examples where it is helpful to be able to supply arbitrary keyword arguments and have the order preserved. For example, when building a memory view to a C struct, a nice interface would be struct_view(foo=int, bar=float). Another example is constructing tables. One may want to supply column info in keyword arguments named as columns.
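One way such an interface might look, as a rough sketch only. It assumes an interpreter that preserves keyword order. The `struct_view` function, the type-to-format mapping, and the little-endian unaligned layout are all assumptions made for the example, not an existing API.

```python
import struct
from collections import namedtuple

# Assumed mapping from Python types to struct format codes.
_FORMAT_CODES = {int: "i", float: "d"}


def struct_view(**fields):
    """Hypothetical helper: build a struct layout from ordered kwargs.

    The field order in the C struct is taken from the order in which
    the keyword arguments were passed.
    """
    codes = "".join(_FORMAT_CODES[kind] for kind in fields.values())
    layout = struct.Struct("<" + codes)  # little-endian, no padding
    record = namedtuple("Record", list(fields))

    def unpack(buffer):
        return record._make(layout.unpack_from(buffer))

    return layout, unpack


layout, unpack = struct_view(foo=int, bar=float)
raw = struct.pack("<id", 7, 2.5)
view = unpack(raw)
```

If the keyword order were scrambled, `foo` and `bar` could be read from the wrong offsets, which is why this use case depends on ordered kwargs.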
participants (10)

- Alexander Belopolsky
- Andrew Barnert
- Barry Warsaw
- Dima Tisnek
- Eric Snow
- Ethan Furman
- Guido van Rossum
- Nathaniel Smith
- Nick Coghlan
- Paul Moore