[Python-3000] PEP 3113 (Removal of Tuple Parameter Unpacking)
Brett Cannon
brett at python.org
Sat Mar 3 01:56:59 CET 2007
Those of you who attended PyCon 2007 probably know what PEP 3113 is
all about: the removal of the automatic unpacking of sequence
arguments when a tuple parameter is present in a function's signature
(e.g., the ``(b, c)`` in ``def fxn(a, (b, c), d): pass``). Thanks to
everyone who helped provide arguments for as to why they need to go
(and even Ping who said he likes these things =).
The PEP has been committed so those of you who prefer reading PEPs
through a browser should be able to in about an hour.
-------------------------------------------------
Abstract
========
Tuple parameter unpacking is the use of a tuple as a parameter in a
function signature so as to have a sequence argument automatically
unpacked. An example is::
def fxn(a, (b, c), d):
pass
The use of ``(b, c)`` in the signature requires that the second
argument to the function be a sequence of length two (e.g.,
``[42, -13]``). When such a sequence is passed it is unpacked and
has its values assigned to the parameters, just as if the statement
``b, c = [42, -13]`` had been executed in the parameter.
Unfortunately this feature of Python's rich function signature
abilities, while handy in some situations, causes more issues than
they are worth. Thus this PEP proposes their removal from the
language in Python 3.0.
Why They Should Go
==================
Introspection Issues
--------------------
Python has very powerful introspection capabilities. These extend to
function signatures. There are no hidden details as to what a
function's call signature is. In general it is fairly easy to figure
out various details about a function's signature by viewing the
function object and various attributes on it (including the function's
``func_code`` attribute).
But there is great difficulty when it comes to tuple parameters. The
existence of a tuple parameter is denoted by it name being made of a
``.`` and a number in the ``co_varnames`` attribute of the function's
code object. This allows the tuple argument to be bound to a name
that only the bytecode is aware of and cannot be typed in Python
source. But this does not specify the format of the tuple: its
length, whether there are nested tuples, etc.
In order to get all of the details about the tuple from the function
one must analyse the bytecode of the function. This is because the
first bytecode in the function literally translates into the tuple
argument being unpacked. Assuming the tuple parameter is
named ``.1`` and is expected to unpack to variables ``spam`` and
``monty`` (meaning it is the tuple ``(spam, monty)``), the first
bytecode in the function will be for the statement
``spam, monty = .1``. This means that to know all of the details of
the tuple parameter one must look at the initial bytecode of the
function to detect tuple unpacking for parameters formatted as
``\.\d+`` and deduce any and all information about the expected
argument. Bytecode analysis is how the ``inspect.getargspec``
function is able to provide information on tuple parameters. This is
not easy to do and is burdensome on introspection tools as they must
know how Python bytecode works (an otherwise unneeded burden as all
other types of parameters do not require knowledge of Python
bytecode).
The difficulty of analysing bytecode not withstanding, there is
another issue with the dependency on using Python bytecode.
IronPython [#ironpython]_ does not use Python's bytecode. Because it
is based on the .NET framework it instead stores MSIL [#MSIL]_ in
``func_code.co_code`` attribute of the function. This fact prevents
the ``inspect.getargspec`` function from working when run under
IronPython. It is unknown whether other Python implementations are
affected but is reasonable to assume if the implementation is not just
a re-implementation of the Python virtual machine.
No Loss of Abilities If Removed
-------------------------------
As mentioned in `Introspection Issues`_, to handle tuple parameters
the function's bytecode starts with the bytecode required to unpack
the argument into the proper parameter names. This means that their
is no special support required to implement tuple parameters and thus
there is no loss of abilities if they were to be removed, only a
possible convenience (which is addressed in
`Why They Should (Supposedly) Stay`_).
The example function at the beginning of this PEP could easily be
rewritten as::
def fxn(a, b_c, d):
b, c = b_c
pass
and in no way lose functionality.
Exception To The Rule
---------------------
When looking at the various types of parameters that a Python function
can have, one will notice that tuple parameters tend to be an
exception rather than the rule.
Consider PEP 3102 (keyword-only arguments) and PEP 3107 (function
annotations) [#pep-3102]_ [#pep-3107]_. Both PEPs have been accepted and
introduce new functionality within a function's signature. And yet
for both PEPs the new feature cannot be applied to tuple parameters.
This leads to tuple parameters not being on the same footing as
regular parameters that do not attempt to unpack their arguments.
The existence of tuple parameters also places sequence objects
separately from mapping objects in a function signature. There is no
way to pass in a mapping object (e.g., a dict) as a parameter and have
it unpack in the same fashion as a sequence does into a tuple
parameter.
Uninformative Error Messages
----------------------------
Consider the following function::
def fxn((a, b), (c, d)):
pass
If called as ``fxn(1, (2, 3))`` one is given the error message
``TypeError: unpack non-sequence``. This error message in no way
tells you which tuple was not unpacked properly. There is also no
indication that this was a result that occurred because of the
arguments. Other error messages regarding arguments to functions
explicitly state its relation to the signature:
``TypeError: fxn() takes exactly 2 arguments (0 given)``, etc.
Little Usage
------------
While an informal poll of the handful of Python programmers I know
personally and from the PyCon 2007 sprint indicates a huge majority of
people do not know of this feature and the rest just do not use it,
some hard numbers is needed to back up the claim that the feature is
not heavily used.
Iterating over every line in Python's code repository in the ``Lib/``
directory using the regular expression ``^\s*def\s*\w+\s*\(`` to
detect function and method definitions there were 22,252 matches in
the trunk.
Tacking on ``.*,\s*\(`` to find ``def`` statements that contained a
tuple parameter, only 41 matches were found. This means that for
``def`` statements, only 0.18% of them seem to use a tuple parameter.
Why They Should (Supposedly) Stay
=================================
Practical Use
-------------
In certain instances tuple parameters can be useful. A common example
is code that expect a two-item tuple that reperesents a Cartesian
point. While true it is nice to be able to have the unpacking of the
x and y coordinates for you, the argument is that this small amount of
practical usefulness is heavily outweighed by other issues pertaining
to tuple parameters. And as shown in
`No Loss Of Abilities If Removed`_, their use is purely practical and
in no way provide a unique ability that cannot be handled in other
ways very easily.
Self-Documentation For Parameters
---------------------------------
It has been argued that tuple parameters provide a way of
self-documentation for parameters that are expected to be of a certain
sequence format. Using our Cartesian point example from
`Practical Use`_, seeing ``(x, y)`` as a parameter in a function makes
it obvious that a tuple of length two is expected as an argument for
that parameter.
But Python provides several other ways to document what parameters are
for. Documentation strings are meant to provide enough information
needed to explain what arguments are expected. Tuple parameters might
tell you the expected length of a sequence argument, it does not tell
you what that data will be used for. One must also read the docstring
to know what other arguments are expected if not all parameters are
tuple parameters.
Function annotations (which do not work with tuple parameters) can
also supply documentation. Because annotations can be of any form,
what was once a tuple parameter can be a single argument parameter
with an annotation of ``tuple``, ``tuple(2)``, ``Cartesian point``,
``(x, y)``, etc. Annotations provide great flexibility for
documenting what an argument is expected to be for a parameter,
including being a sequence of a certain length.
Transition Plan
===============
To transition Python 2.x code to 3.x where tuple parameters are
removed, two steps are suggested. First, the proper warning is to be
emitted when Python's compiler comes across a tuple parameter in
Python 2.6. This will be treated like any other syntactic change that
is to occur in Python 3.0 compared to Python 2.6.
Second, the 2to3 refactoring tool [#2to3]_ will gain a rule for
translating tuple parameters to being a single parameter this is
unpacked as the first statement in the function. The name of the new
parameter will be a mangling of tuple parameter's names by joining
them with underscores. The new parameter will then be unpacked into
the names originally used in the tuple parameter. This means that the
following function::
def fxn((a, (b, c))):
pass
will be translated into::
def fxn(a_b_c):
(a, (b, c)) = a_b_c
pass
References
==========
.. [#2to3] 2to3 refactoring tool
(http://svn.python.org/view/sandbox/trunk/2to3/)
.. [#ironpython] IronPython
(http://www.codeplex.com/Wiki/View.aspx?ProjectName=IronPython)
.. [#MSIL] Microsoft Intermediate Language
(http://msdn.microsoft.com/library/en-us/cpguide/html/cpconmicrosoftintermediatelanguagemsil.asp?frame=true)
.. [#pep-3102] PEP 3102 (Keyword-Only Arguments)
(http://www.python.org/dev/peps/pep-3102/)
.. [#pep-3107] PEP 3107 (Function Annotations)
(http://www.python.org/dev/peps/pep-3107/)
More information about the Python-3000
mailing list