Here's the proper proposal I promised. Hopefully it reflects our discussions and the past ones too. My goal here is to focus in on what I consider to be the most viable approach, which capturing the gist of the various concerns and alternatives. There are still a few small gaps that I will work on as time permits. Feedback is welcome. FYI, I also hope to have a basic implementation up in time for the language summit at PyCon (Wednesday) so that related discussions there might have something more concrete to support them. :) -eric ===================================================================== PEP: XXX Title: Preserving the order of \*\*kwargs in a function. Version: $Revision$ Last-Modified: $Date$ Author: Eric Snow <ericsnowcurrently@gmail.com> Discussions-To: python-ideas@python.org Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 5-Apr-2014 Python-Version: 3.5 Post-History: Resolution: Abstract ======== The \*\*kwargs syntax in a function definition indicates that the interpreter should collect all keyword arguments that do not correspond to other named parameters. However, Python does not preserved the order in which those collected keyword arguments were passed to the function. In some contexts the order matters. This PEP introduces a mechanism by which the passed order of collected keyword arguments will now be preserved. Motivation ========== Python's \*\*kwargs syntax in function definitions provides a powerful means of dynamically handling keyword arguments. In some applications of the syntax (see _`Use Cases`), the semantics applied to the collected keyword arguments requires that order be preserved. Unsurprisingly, this is similar to how OrderedDict is related to dict. Currently to preserved the order you have to do so manually and separately from the actual function call. This involves building an ordered mapping, whether an OrderedDict or an iterable of 2-tuples, which is then passed as a single argument to the function. [#arg_unpacking]_ With the capability described in this PEP, that boilerplate would no longer be required. For comparision, currently:: kwargs = OrderedDict() kwargs['eggs'] = ... ... def spam(a, kwargs): ... and with this proposal:: def spam(a, **kwargs): ... Nick Coglan, speaking of some of the uses cases, summed it up well [#nick_obvious]_:: These *can* all be done today, but *not* by using keyword arguments. In my view, the problem to be addressed is that keyword arguments *look* like they should work for these cases, because they have a definite order in the source code. The only reason they don't work is because the interpreter throws that ordering information away. It's a textbook case of a language feature becoming an attractive nuisance in some circumstances: the simple and obvious solution for the above use cases *doesn't actually work* for reasons that aren't obviously clear if you don't have a firm grasp of Python's admittedly complicated argument handling. This observation is supported by the appearance of this proposal over the years and the numerous times that people have been confused by the constructor for OrderedDict. [#past_threads]_ [#loss_of_order]_ [#compact_dict]_ Use Cases ========= As Nick noted, the current behavior of \*\*kwargs is unintuitive in cases where one would expect order to matter. Aside from more specific cases outlined below, in general "anything else where you want to control the iteration order *and* set field names and values in a single call will potentially benefit." [#nick_general]_ That matters in the case of factories (e.g. __init__()) for ordered types. Serialization ------------- Obviously OrderedDict would benefit (both __init__() and update()) from ordered kwargs. However, the benefit also extends to serialization APIs [#nick_obvious]_:: In the context of serialisation, one key lesson we have learned is that arbitrary ordering is a problem when you want to minimise spurious diffs, and sorting isn't a simple solution. Tools like doctest don't tolerate spurious diffs at all, but are often amenable to a sorting based answer. The cases where it would be highly desirable to be able use keyword arguments to control the order of display of a collection of key value pairs are ones like: * printing out key:value pairs in CLI output * mapping semantic names to column order in a CSV * serialising attributes and elements in particular orders in XML * serialising map keys in particular orders in human readable formats like JSON and YAML (particularly when they're going to be placed under source control) Debugging --------- In the words of Raymond Hettinger [#raymond_debug]_:: It makes it easier to debug if the arguments show-up in the order they were created. AFAICT, no purpose is served by scrambling them. Other Use Cases --------------- * Mock objects. [#mock]_ * Controlling object presentation. * Alternate namedtuple() where defaults can be specified. * Specifying argument priority by order. Concerns ======== Performance ----------- As already noted, the idea of ordered keyword arguments has come up on a number of occasions. Each time it has been met with the same response, namely that preserving keyword arg order would have a sufficiently adverse effect on function call performance that it's not worth doing. However, Guido noted the following [#guido_open]_:: Making **kwds ordered is still open, but requires careful design and implementation to avoid slowing down function calls that don't benefit. As will be noted below, there are ways to work around this at the expense of increased complication. Ultimately the simplest approach is the one that makes the most sense: pack collected key word arguments into an OrderedDict. However, without a C implementation of OrderedDict there isn't much to discuss. That should change in Python 3.5. [#c_ordereddict]_ In some cases the difference of performance between dict and OrderedDict *may* be of significance. For instance: when the collected kwargs has an extended lifetime outside the originating function or the number of collected kwargs is massive. However, the difference in performance (both CPU and memory) in those cases should not be significant. Furthermore, the performance of the C OrderedDict implementation is essentially identical with dict for the non-mutating API. A concrete representation of the difference in performance will be a part of this proposal before its resolution. Other Python Implementations ---------------------------- Another important issue to consider is that new features must be cognizant of the multiple Python implementations. At some point each of them would be expected to have implemented ordered kwargs. In this regard there doesn't seem to be an issue with the idea. [#ironpython]_ Each of the major Python implementations will be consulted regarding this proposal before its resolution. Specification ============= Starting in version 3.5 Python will preserve the order of keyword arguments as passed to a function. This will apply only to functions for which the definition uses the \*\*kwargs syntax for collecting otherwise unspecified keyword arguments. Only the order of those keyword arguments will be preserved. Relationship to **-unpacking syntax ----------------------------------- The ** unpacking syntax in function calls has no special connection with this proposal. Keyword arguments provided by unpacking will be treated in exactly the same way as they are now: ones that match defined parameters are gather there and the remainder will be collected into the ordered kwargs (just like any other unmatched keyword argument). Note that unpacking a mapping with undefined order, such as dict, will preserve its iteration order like normal. It's just that the order will remain undefined. The OrderedDict into which the unpacked key-value pairs will then be packed will not be able to provide any alternate ordering. This should not be surprising. There have been brief discussions of simply passing these mappings through to the functions kwargs without unpacking and repacking them, but that is both outside the scope of this proposal and probably a bad idea regardless. (There is a reason those discussions were brief.) Relationship to inspect.Signature --------------------------------- Signature objects should need no changes. The `kwargs` parameter of inspect.BoundArguments (returned by Signature.bind() and Signature.bind_partial()) will change from a dict to an OrderedDict. C-API ----- TBD Syntax ------ No syntax is added or changed by this proposal. Backward-Compatibility ---------------------- The following will change: * type(kwargs) * iteration order of kwargs will now be consistent (except of course in the case described above) * as already noted, performance will be marginally different None of these should be an issue. However, each will be carefully considered while this proposal is under discussion. Reference Implementation ======================== TBD Implementation Notes -------------------- TBD Alternate Approaches ==================== Opt-out Decorator ----------------- This is identical to the current proposal with the exception that Python would also provide a decorator in functools that would cause collected keyword arguments to be packed into a normal dict instead of an OrderedDict. Prognosis: This would only be necessary if performance is determined to be significantly different in some uncommon cases or that there are other backward-compatibility concerns that cannot be resolved otherwise. Opt-in Decorator ---------------- The status quo would be unchanged. Instead Python would provide a decorator in functools that would register or mark the decorated function as one that should get ordered keyword arguments. The performance overhead to check the function at call time would be marginal. Prognosis: The only real down-side is in the case of function wrappers factories (e.g. functools.partial and many decorators) that aim to perfectly preserve keyword arguments by using kwargs in the wrapper definition and kwargs unpacking in the call to the wrapped function. Each wrapper would have to be updated separately, though having functools.wraps() do this automaticallywould help. __kworder__ ----------- The order of keyword arguments would be stored separately in a list at call time. The list would be bound to __kworder__ in the function locals. Prognosis: This likewise complicates the wrapper case. Compact dict with faster iteration ---------------------------------- Raymond Hettinger has introduced the idea of a dict implementation that would result in preserving insertion order on dicts (until the first deletion). This would be a perfect fit for kwargs. [#compact_dict]_ Prognosis: The idea is still uncertain in both viability and timeframe. ***kwargs --------- This would add a new form to a function's signature as a mutually exclusive parallel to \*\*kwargs. The new syntax, ***kwargs (note that there are three asterisks), would indicate that kwargs should preserve the order of keyword arguments. Prognosis: New syntax is only added to Python under the most *dire* circumstances. With other available solutions, new syntax is not justifiable. Furthermore, like all opt-in solutions, the new syntax would complicate the pass-through case. annotations ----------- This is a variation on the decorator approach. Instead of using a decorator to mark the function, you would use a function annotation on \*\*kwargs. Prognosis: In addition to the pass-through complication, annotations have been actively discouraged in Python core development. Use of annotations to opt-in to order preservation runs the risk of interfering with other application-level use of annotations. dict.__order__ -------------- dict objects would have a new attribute, `__order__` that would default to None and that in the kwargs case the interpreter would use in the same way as described above for __kworder__. Prognosis: It would mean zero impact on kwargs performance but the change would be pretty intrusive (Python uses dict a lot). Also, for the wrapper case the interpreter would have to be careful to preserve `__order__`. KWArgsDict.__order__ -------------------- This is the same as the `dict.__order__` idea, but kwargs would be an instance of a new minimal dict subclass that provides the `__order__` attribute. dict would instead be unchanged. Prognosis: Simply switching to OrderedDict is a less complicated and more intuitive change. Acknowledgements ================ Thanks to Andrew Barnert for helpful feedback and to the participants of all the past email threads. Footnotes ========= .. [#arg_unpacking] Alternately, you could also replace ** in your function definition with * and then pass in key/value 2-tuples. This has the advantage of not requiring the keys to be valid identifier strings. See https://mail.python.org/pipermail/python-ideas/2014-April/027491.html. References ========== .. [#nick_obvious] https://mail.python.org/pipermail/python-ideas/2014-April/027512.html .. [#past_threads] https://mail.python.org/pipermail/python-ideas/2009-April/004163.html https://mail.python.org/pipermail/python-ideas/2010-October/008445.html https://mail.python.org/pipermail/python-ideas/2011-January/009037.html https://mail.python.org/pipermail/python-ideas/2013-February/019690.html https://mail.python.org/pipermail/python-ideas/2013-May/020727.html https://mail.python.org/pipermail/python-ideas/2014-March/027225.html http://bugs.python.org/issue16276 http://bugs.python.org/issue16553 http://bugs.python.org/issue19026 http://bugs.python.org/issue5397#msg82972 .. [#loss_of_order] https://mail.python.org/pipermail/python-dev/2007-February/071310.html .. [#compact_dict] https://mail.python.org/pipermail/python-dev/2012-December/123028.html https://mail.python.org/pipermail/python-dev/2012-December/123105.html https://mail.python.org/pipermail/python-dev/2013-May/126327.html https://mail.python.org/pipermail/python-dev/2013-May/126328.html .. [#nick_general] https://mail.python.org/pipermail/python-dev/2012-December/123105.html .. [#raymond_debug] https://mail.python.org/pipermail/python-dev/2013-May/126327.html .. [#mock] https://mail.python.org/pipermail/python-ideas/2009-April/004163.html https://mail.python.org/pipermail/python-ideas/2009-April/004165.html https://mail.python.org/pipermail/python-ideas/2009-April/004175.html .. [guido_open] https://mail.python.org/pipermail/python-dev/2013-May/126404.html .. [#c_ordereddict] http://bugs.python.org/issue16991 .. [#ironpython] https://mail.python.org/pipermail/python-dev/2012-December/123100.html Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: