[Python-3000] PEP: rename it.next() to it.__next__(), add a next() built-in

Ka-Ping Yee python-dev at zesty.ca
Mon Mar 5 11:24:11 CET 2007


At Guido's prompting, I drafted the following PEP.  How does it sound?


-- ?!ng


Title: Renaming iterator.next() to iterator.__next__()
======================================================


Abstract
========

The iterator protocol in Python 2.x consists of two methods:
``__iter__()`` called on an iterable object to yield an iterator, and
``next()`` called on an iterator object to yield the next item in the
sequence.  Using a ``for`` loop to iterate over an iterable object
implicitly calls both of these methods.  This PEP proposes that the
``next`` method be renamed to ``__next__``, consistent with all the
other protocols in Python in which a method is implicitly called as
part of a language-level protocol, and that a built-in function named
``next`` be introduced to invoke ``__next__`` method, consistent with
the manner in which other protocols are explicitly invoked.


Names With Double Underscores
=============================

In Python, double underscores before and after a name are used to
distinguish names that belong to the language itself.  Attributes and
methods that are implicitly used or created by the interpreter employ
this naming convention; some examples are:

*   ``__file__`` - an attribute automatically created by the interpreter

*   ``__dict__`` - an attribute with special meaning to the interpreter

*   ``__init__`` - a method implicitly called by the interpreter

Note that this convention applies to methods such as ``__init__`` that
are explicitly defined by the programmer, as well as attributes such as
``__file__`` that can only be accessed by naming them explicitly, so it
includes names that are used *or* created by the interpreter.

(Not all things that are called "protocols" are made of methods with
double-underscore names.  For example, the ``__contains__`` method has
double underscores because the language construct ``x in y`` implicitly
calls ``__contains__``.  But even though the ``read`` method is part of
the file protocol, it does not have double underscores because there is
no language construct that implicitly invokes ``x.read()``.)

The use of double underscores creates a separate namespace for names
that are part of the Python language definition, so that programmers
are free to create variables, attributes, and methods that start with
letters, without fear of silently colliding with names that have a
language-defined purpose.  (Colliding with reserved keywords is still
a concern, but at least this will immediately yield a syntax error.)

The naming of the ``next`` method on iterators is an exception to
this convention.  Code that nowhere contains an explicit call to a
``next`` method can nonetheless be silently affected by the presence
of such a method.  Therefore, this PEP proposes that iterators should
have a ``__next__`` method instead of a ``next`` method (with no
change in semantics).


Double-Underscore Methods and Built-In Functions
================================================

The Python language defines several protocols that are implemented or
customized by defining methods with double-underscore names.  In each
case, the protocol is provided by an internal method implemented as a
C function in the interpreter.  For objects defined in Python, this
C function supports customization by implicitly invoking a Python method
with a double-underscore name (it often does a little bit of additional
work beyond just calling the Python method.)

Sometimes the protocol is invoked by a syntactic construct:

*   ``x[y]`` --> internal ``tp_getitem`` --> ``x.__getitem__(y)``

*   ``x + y`` --> internal ``nb_add`` --> ``x.__add__(y)``

*   ``-x`` --> internal ``nb_negative`` --> ``x.__neg__()``

Sometimes there is no syntactic construct, but it is still useful to be
able to explicitly invoke the protocol.  For such cases Python offers a
built-in function of the same name but without the double underscores.

*   ``len(x)`` --> internal ``sq_length`` --> ``x.__len__()``

*   ``hash(x)`` --> internal ``tp_hash`` --> ``x.__hash__()``

*   ``iter(x)`` --> internal ``tp_iter`` --> ``x.__iter__()``

Following this pattern, the natural way to handle ``next`` is to add a
``next`` built-in function that behaves in exactly the same fashion.

*   ``next(x)`` --> internal ``tp_iternext`` --> ``x.__next__()``


Previous Proposals
==================

This proposal is not a new idea.  The idea proposed here was supported
by the BDFL on python-dev [1]_ and is even mentioned in the original
iterator PEP, PEP 234::

    (In retrospect, it might have been better to go for __next__()
    and have a new built-in, next(it), which calls it.__next__().
    But alas, it's too late; this has been deployed in Python 2.2
    since December 2001.)


Transition Plan
===============

Two additional transformations will be added to the 2to3 translation
tool [2]_:

*   Method definitions named ``next`` will be renamed to ``__next__``.

*   Explicit calls to the ``next`` method will be replaced with calls
    to the built-in ``next`` function.  For example, ``x.next()`` will
    become ``next(x)``.

If the module being processed already contains a top-level function
definition named ``next``, the second transformation will not be done;
instead, calls to ``x.next()`` will be replaced with ``x.__next__()``
and a warning will be emitted.


References
==========

.. [1] Single- vs. Multi-pass iterability (Guido van Rossum)
   http://mail.python.org/pipermail/python-dev/2002-July/026814.html

.. [2] 2to3 refactoring tool
   http://svn.python.org/view/sandbox/trunk/2to3/


Copyright
=========

This document has been placed in the public domain.


More information about the Python-3000 mailing list