[Python-ideas] Proto-PEP: Preserving the order of **kwargs in a function.

Eric Snow ericsnowcurrently at gmail.com
Sun Apr 6 07:43:15 CEST 2014


Here's the proper proposal I promised.  Hopefully it reflects our
discussions and the past ones too.  My goal here is to focus in on
what I consider to be the most viable approach, which capturing the
gist of the various concerns and alternatives.  There are still a few
small gaps that I will work on as time permits.  Feedback is welcome.

FYI, I also hope to have a basic implementation up in time for the
language summit at PyCon (Wednesday) so that related discussions there
might have something more concrete to support them. :)

-eric

=====================================================================

PEP: XXX
Title: Preserving the order of \*\*kwargs in a function.
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow <ericsnowcurrently at gmail.com>
Discussions-To: python-ideas at python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 5-Apr-2014
Python-Version: 3.5
Post-History:
Resolution:


Abstract
========

The \*\*kwargs syntax in a function definition indicates that the
interpreter should collect all keyword arguments that do not correspond
to other named parameters.  However,
Python does not preserved the order in which those collected keyword
arguments were passed to the function.  In some contexts the order
matters.  This PEP introduces a mechanism by which the passed order of
collected keyword arguments will now be preserved.


Motivation
==========

Python's \*\*kwargs syntax in function definitions provides a powerful
means of dynamically handling keyword arguments.  In some applications
of the syntax (see _`Use Cases`), the semantics applied to the
collected keyword arguments requires that order be preserved.
Unsurprisingly, this is similar to how OrderedDict is related to dict.

Currently to preserved the order you have to do so manually and
separately from the actual function call.  This involves building an
ordered mapping, whether an OrderedDict or an iterable of 2-tuples,
which is then passed as a single argument to the function.
[#arg_unpacking]_

With the capability described in this PEP, that boilerplate would no
longer be required.

For comparision, currently::

   kwargs = OrderedDict()
   kwargs['eggs'] = ...
   ...
   def spam(a, kwargs):
       ...

and with this proposal::

   def spam(a, **kwargs):
       ...

Nick Coglan, speaking of some of the uses cases, summed it up well
[#nick_obvious]_::

   These *can* all be done today, but *not* by using keyword arguments.
   In my view, the problem to be addressed is that keyword arguments
   *look* like they should work for these cases, because they have a
   definite order in the source code. The only reason they don't work
   is because the interpreter throws that ordering information away.

   It's a textbook case of a language feature becoming an attractive
   nuisance in some circumstances: the simple and obvious solution for
   the above use cases *doesn't actually work* for reasons that aren't
   obviously clear if you don't have a firm grasp of Python's admittedly
   complicated argument handling.

This observation is supported by the appearance of this proposal over
the years and the numerous times that people have been confused by the
constructor for OrderedDict. [#past_threads]_ [#loss_of_order]_
[#compact_dict]_


Use Cases
=========

As Nick noted, the current behavior of \*\*kwargs is unintuitive in
cases where one would expect order to matter.  Aside from more specific
cases outlined below, in general "anything else where you want to
control the iteration order *and* set field names and values in a single
call will potentially benefit." [#nick_general]_  That matters in the
case of factories (e.g. __init__()) for ordered types.

Serialization
-------------

Obviously OrderedDict would benefit (both __init__() and update()) from
ordered kwargs.  However, the benefit also extends to serialization
APIs [#nick_obvious]_::

   In the context of serialisation, one key lesson we have learned is
   that arbitrary ordering is a problem when you want to minimise
   spurious diffs, and sorting isn't a simple solution.

   Tools like doctest don't tolerate spurious diffs at all, but are
   often amenable to a sorting based answer.

   The cases where it would be highly desirable to be able use keyword
   arguments to control the order of display of a collection of key
   value pairs are ones like:

   * printing out key:value pairs in CLI output
   * mapping semantic names to column order in a CSV
   * serialising attributes and elements in particular orders in XML
   * serialising map keys in particular orders in human readable formats
     like JSON and YAML (particularly when they're going to be placed
     under source control)

Debugging
---------

In the words of Raymond Hettinger [#raymond_debug]_::

   It makes it easier to debug if the arguments show-up in the order
   they were created.  AFAICT, no purpose is served by scrambling them.

Other Use Cases
---------------

* Mock objects. [#mock]_
* Controlling object presentation.
* Alternate namedtuple() where defaults can be specified.
* Specifying argument priority by order.


Concerns
========

Performance
-----------

As already noted, the idea of ordered keyword arguments has come up on
a number of occasions.  Each time it has been met with the same
response, namely that preserving keyword arg order would have a
sufficiently adverse effect on function call performance that it's not
worth doing.  However, Guido noted the following [#guido_open]_::

  Making **kwds ordered is still open, but requires careful design and
  implementation to avoid slowing down function calls that don't benefit.

As will be noted below, there are ways to work around this at the
expense of increased complication.  Ultimately the simplest approach is
the one that makes the most sense: pack collected key word arguments
into an OrderedDict.  However, without a C implementation of OrderedDict
there isn't much to discuss.  That should change in Python 3.5.
[#c_ordereddict]_

In some cases the difference of performance between dict and OrderedDict
*may* be of significance.  For instance: when the collected
kwargs has an extended lifetime outside the originating function or the
number of collected kwargs is massive.  However, the difference in
performance (both CPU and memory) in those cases should not be
significant.  Furthermore, the performance of the C OrderedDict
implementation is essentially identical with dict for the non-mutating
API.  A concrete representation of the difference in performance will be
a part of this proposal before its resolution.

Other Python Implementations
----------------------------

Another important issue to consider is that new features must be
cognizant of the multiple Python implementations.  At some point each of
them would be expected to have implemented ordered kwargs.  In this
regard there doesn't seem to be an issue with the idea. [#ironpython]_
Each of the major Python implementations will be consulted regarding
this proposal before its resolution.


Specification
=============

Starting in version 3.5 Python will preserve the order of keyword
arguments as passed to a function.  This will apply only to functions
for which the definition uses the \*\*kwargs syntax for collecting
otherwise unspecified keyword arguments.  Only the order of those
keyword arguments will be preserved.

Relationship to **-unpacking syntax
-----------------------------------

The ** unpacking syntax in function calls has no special connection with
this proposal.  Keyword arguments provided by unpacking will be treated
in exactly the same way as they are now: ones that match defined
parameters are gather there and the remainder will be collected into the
ordered kwargs (just like any other unmatched keyword argument).

Note that unpacking a mapping with undefined order, such as dict, will
preserve its iteration order like normal.  It's just that the order will
remain undefined.  The OrderedDict into which the unpacked key-value
pairs will then be packed will not be able to provide any alternate
ordering.  This should not be surprising.

There have been brief discussions of simply passing these mappings
through to the functions kwargs without unpacking and repacking them,
but that is both outside the scope of this proposal and probably a bad
idea regardless.  (There is a reason those discussions were brief.)

Relationship to inspect.Signature
---------------------------------

Signature objects should need no changes.  The `kwargs` parameter of
inspect.BoundArguments (returned by Signature.bind() and
Signature.bind_partial()) will change from a dict to an OrderedDict.

C-API
-----

TBD

Syntax
------

No syntax is added or changed by this proposal.

Backward-Compatibility
----------------------

The following will change:

* type(kwargs)
* iteration order of kwargs will now be consistent (except of course in
  the case described above)
* as already noted, performance will be marginally different

None of these should be an issue.  However, each will be carefully
considered while this proposal is under discussion.


Reference Implementation
========================

TBD

Implementation Notes
--------------------

TBD


Alternate Approaches
====================

Opt-out Decorator
-----------------

This is identical to the current proposal with the exception that Python
would also provide a decorator in functools that would cause collected
keyword arguments to be packed into a normal dict instead of an
OrderedDict.

Prognosis:

This would only be necessary if performance is determined to be
significantly different in some uncommon cases or that there are other
backward-compatibility concerns that cannot be resolved otherwise.

Opt-in Decorator
----------------

The status quo would be unchanged.  Instead Python would provide a
decorator in functools that would register or mark the decorated
function as one that should get ordered keyword arguments.  The
performance overhead to check the function at call time would be
marginal.

Prognosis:

The only real down-side is in the case of function wrappers factories
(e.g.  functools.partial and many decorators) that aim to perfectly
preserve keyword arguments by using kwargs in the wrapper definition
and kwargs unpacking in the call to the wrapped function.  Each wrapper
would have to be updated separately, though having functools.wraps() do
this automaticallywould help.

__kworder__
-----------

The order of keyword arguments would be stored separately in a list at
call time.  The list would be bound to __kworder__ in the function
locals.

Prognosis:

This likewise complicates the wrapper case.

Compact dict with faster iteration
----------------------------------

Raymond Hettinger has introduced the idea of a dict implementation that
would result in preserving insertion order on dicts (until the first
deletion).  This would be a perfect fit for kwargs. [#compact_dict]_

Prognosis:

The idea is still uncertain in both viability and timeframe.

***kwargs
---------

This would add a new form to a function's signature as a mutually
exclusive parallel to \*\*kwargs.  The new syntax, ***kwargs (note that
there are three asterisks), would indicate that kwargs should preserve
the order of keyword arguments.

Prognosis:

New syntax is only added to Python under the most *dire* circumstances.
With other available solutions, new syntax is not justifiable.
Furthermore, like all opt-in solutions, the new syntax would complicate
the pass-through case.

annotations
-----------

This is a variation on the decorator approach.  Instead of using a
decorator to mark the function, you would use a function annotation on
\*\*kwargs.

Prognosis:

In addition to the pass-through complication, annotations have been
actively discouraged in Python core development.  Use of annotations to
opt-in to order preservation runs the risk of interfering with other
application-level use of annotations.

dict.__order__
--------------

dict objects would have a new attribute, `__order__` that would default
to None and that in the kwargs case the interpreter would use in the
same way as described above for __kworder__.

Prognosis:

It would mean zero impact on kwargs performance but the change would be
pretty intrusive (Python uses dict a lot).  Also, for the wrapper case
the interpreter would have to be careful to preserve `__order__`.

KWArgsDict.__order__
--------------------

This is the same as the `dict.__order__` idea, but kwargs would be an
instance of a new minimal dict subclass that provides the `__order__`
attribute.  dict would instead be unchanged.

Prognosis:

Simply switching to OrderedDict is a less complicated and more intuitive
change.


Acknowledgements
================

Thanks to Andrew Barnert for helpful feedback and to the participants of
all the past email threads.


Footnotes
=========

.. [#arg_unpacking]

   Alternately, you could also replace ** in your function definition
   with * and then pass in key/value 2-tuples.  This has the advantage
   of not requiring the keys to be valid identifier strings. See
   https://mail.python.org/pipermail/python-ideas/2014-April/027491.html.


References
==========

.. [#nick_obvious]
   https://mail.python.org/pipermail/python-ideas/2014-April/027512.html

.. [#past_threads]
   https://mail.python.org/pipermail/python-ideas/2009-April/004163.html
   https://mail.python.org/pipermail/python-ideas/2010-October/008445.html
   https://mail.python.org/pipermail/python-ideas/2011-January/009037.html
   https://mail.python.org/pipermail/python-ideas/2013-February/019690.html
   https://mail.python.org/pipermail/python-ideas/2013-May/020727.html
   https://mail.python.org/pipermail/python-ideas/2014-March/027225.html
   http://bugs.python.org/issue16276
   http://bugs.python.org/issue16553
   http://bugs.python.org/issue19026
   http://bugs.python.org/issue5397#msg82972

.. [#loss_of_order]
   https://mail.python.org/pipermail/python-dev/2007-February/071310.html

.. [#compact_dict]
   https://mail.python.org/pipermail/python-dev/2012-December/123028.html
     https://mail.python.org/pipermail/python-dev/2012-December/123105.html
   https://mail.python.org/pipermail/python-dev/2013-May/126327.html
     https://mail.python.org/pipermail/python-dev/2013-May/126328.html

.. [#nick_general]
   https://mail.python.org/pipermail/python-dev/2012-December/123105.html

.. [#raymond_debug]
   https://mail.python.org/pipermail/python-dev/2013-May/126327.html

.. [#mock]
   https://mail.python.org/pipermail/python-ideas/2009-April/004163.html
     https://mail.python.org/pipermail/python-ideas/2009-April/004165.html
     https://mail.python.org/pipermail/python-ideas/2009-April/004175.html

.. [guido_open]
   https://mail.python.org/pipermail/python-dev/2013-May/126404.html

.. [#c_ordereddict]
   http://bugs.python.org/issue16991

.. [#ironpython]
   https://mail.python.org/pipermail/python-dev/2012-December/123100.html


Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:


More information about the Python-ideas mailing list