[Python-Dev] Python 3000 PEP: Postfix type declarations

Georg Brandl g.brandl at gmx.net
Sun Apr 1 12:14:26 CEST 2007


PEP: XXX
Title: Postfix type declarations
Version: $Revision: $
Last-Modified: $Date: $
Author: Georg Brandl <g.brandl at gmx.net>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 01-Apr-2007
Python-Version: 3.0


Abstract
========

This PEP proposes the addition of a postfix type declaration syntax to
Python. It also specifies a new ``typedef`` statement which is used to create
new mappings between types and declarators.

Its acceptance will greatly enhance the Python user experience as well as
eliminate one of the warts that deter users of other programming languages from
switching to Python.


Rationale
=========

Python has long suffered from the lack of explicit type declarations.  Being one
of the few aspects in which the language deviates from its Zen, this wart has
sparked many a discussion between Python heretics and members of the PSU (for
a few examples, see [EX1]_, [EX2]_ or [EX3]_), and it also made it a large-scale
enterprise success unlikely.

However, if one wants to put an end to this misery, a decent Pythonic syntax
must be found. In almost all languages that have them, type declarations lack
this quality: they are verbose, often needing *multiple words* for a single
type, or they are hard to comprehend (e.g., a certain language uses completely
unrelated [#]_ adjectives like ``dim`` for type declaration).

Therefore, this PEP combines the move to type declarations with another bold
move that will once again prove that Python is not only future-proof but
future-embracing: the introduction of Unicode characters as an integral
constituent of source code.

Unicode makes it possible to express much more with much less characters, which
is in accordance with the Zen ("Readability counts.") [ZEN]_. Additionally, it
eliminates the need for a separate type declaration statement, and last but not
least, it makes Python measure up to Perl 6, which already uses Unicode for its
operators. [#]_


Specification
=============

When the type declaration mode is in operation, the grammar is changed so that
each ``NAME`` must consist of two parts: a name and a type declarator, which is
exactly one Unicode character.

The declarator uniquely specifies the type of the name, and if it occurs on the
left hand side of an expression, this type is enforced: an ``InquisitionError``
exception is raised if the returned type doesn't match the declared type. [#]_

Also, function call result types have to be specified. If the result of the call
does not have the declared type, an ``InquisitionError`` is raised.  Caution: the
declarator for the result should not be confused with the declarator for the
function object (see the example below).

Type declarators after names that are only read, not assigned to, are not strictly
necessary but enforced anyway (see the Python Zen: "Explicit is better than
implicit.").

The mapping between types and declarators is not static. It can be completely
customized by the programmer, but for convenience there are some predefined
mappings for some built-in types:

=========================  ===================================================
Type                       Declarator
=========================  ===================================================
``object``                 � (REPLACEMENT CHARACTER)
``int``                    ℕ (DOUBLE-STRUCK CAPITAL N)
``float``                  ℮ (ESTIMATED SYMBOL)
``bool``                   ✓ (CHECK MARK)
``complex``                ℂ (DOUBLE-STRUCK CAPITAL C)
``str``                    ✎ (LOWER RIGHT PENCIL)
``unicode``                ✒ (BLACK NIB)
``tuple``                  ⒯ (PARENTHESIZED LATIN SMALL LETTER T)
``list``                   ♨ (HOT SPRINGS)
``dict``                   ⧟ (DOUBLE-ENDED MULTIMAP)
``set``                    ∅ (EMPTY SET) (*Note:* this is also for full sets)
``frozenset``              ☃ (SNOWMAN)
``datetime``               ⌚ (WATCH)
``function``               ƛ (LATIN SMALL LETTER LAMBDA WITH STROKE)
``generator``              ⚛ (ATOM SYMBOL)
``Exception``              ⌁ (ELECTRIC ARROW)
=========================  ===================================================

The declarator for the ``None`` type is a zero-width space.

These characters should be obvious and easy to remember and type for every
programmer.


Unicode replacement units
=========================

Since even in our modern, globalized world there are still some old-fashioned
rebels who can't or don't want to use Unicode in their source code, and since
Python is a forgiving language, a fallback is provided for those:

Instead of the single Unicode character, they can type ``name${UNICODE NAME OF
THE DECLARATOR}$``. For example, these two function definitions are equivalent::

     def fooƛ(xℂ):
         return None

and ::

     def foo${LATIN SMALL LETTER LAMBDA WITH STROKE}$(x${DOUBLE-STRUCK CAPITAL C}$):
         return None${ZERO WIDTH NO-BREAK SPACE}$

This is still easy to read and makes the full power of type-annotated Python
available to ASCII believers.


The ``typedef`` statement
=========================

The mapping between types and declarators can be extended with this new statement.

The syntax is as follows::

     typedef_stmt  ::=  "typedef" expr DECLARATOR

where ``expr`` resolves to a type object. For convenience, the ``typedef`` statement
can also be mixed with the ``class`` statement for new classes, like so::

     typedef class Foo☺(object�):
         pass


Example
=======

This is the standard ``os.path.normpath`` function, converted to type declaration
syntax::

     def normpathƛ(path✎)✎:
         """Normalize path, eliminating double slashes, etc."""
         if path✎ == '':
             return '.'
         initial_slashes✓ = path✎.startswithƛ('/')✓
         # POSIX allows one or two initial slashes, but treats three or more
         # as single slash.
         if (initial_slashes✓ and
             path✎.startswithƛ('//')✓ and not path✎.startswithƛ('///')✓)✓:
             initial_slashesℕ = 2
         comps♨ = path✎.splitƛ('/')♨
         new_comps♨ = []♨
         for comp✎ in comps♨:
             if comp✎ in ('', '.')⒯:
                 continue
             if (comp✎ != '..' or (not initial_slashesℕ and not new_comps♨)✓ or
                  (new_comps♨ and new_comps♨[-1]✎ == '..')✓)✓:
                 new_comps♨.appendƛ(comp✎)
             elif new_comps♨:
                 new_comps♨.popƛ()✎
         comps♨ = new_comps♨
         path✎ = '/'.join(comps♨)✎
         if initial_slashesℕ:
             path✎ = '/'*initial_slashesℕ + path✎
         return path✎ or '.'

As you can clearly see, the type declarations add expressiveness, while at the
same time they make the code look much more professional.


Compatibility issues
====================

To enable type declaration mode, one has to write::

     from __future__ import type_declarations

which enables Unicode parsing of the source [#]_, makes ``typedef`` a keyword
and enforces correct types for all assignments and function calls.


References
==========


.. [EX1] http://mail.python.org/pipermail/python-list/2003-June/210588.html

.. [EX2] http://mail.python.org/pipermail/python-list/2000-May/034685.html

.. [EX3] 
http://groups.google.com/group/comp.lang.python/browse_frm/thread/6ae8c6add913635a/de40d4ffe9bd4304?lnk=gst&q=type+declarations&rnum=6

.. [#] Though, if you know the language in question, it may not be *that* unrelated.

.. [ZEN] http://www.python.org/dev/peps/pep-0020/

.. [#] Well, it would, if there was a Perl 6.

.. [#] Since the name ``TypeError`` is already in use, this name has been chosen
    for obvious reasons.

.. [#] The encoding in which the code is written is read from a standard coding
    cookie. There will also be an autodetection mechanism, invoked by ``from
    __future__ import encoding_hell``.


Acknowledgements
================

Many thanks go to Armin Ronacher, Alexander Schremmer and Marek Kubica who helped
find the most suitable and mnemonic declarator for built-in types.

Thanks also to the Unicode Consortium for including all those useful characters
in the Unicode standard.


Copyright
=========

This document has been placed in the public domain.



..
    Local Variables:
    coding: utf-8
    mode: indented-text
    indent-tabs-mode: nil
    sentence-end-double-space: t
    fill-column: 70
    End:



More information about the Python-Dev mailing list