[Python-Dev] Pre-PEP: Allow Empty Subscript List Without Parentheses

Fri Jun 9 17:53:00 CEST 2006

Hello,

Recently I discovered a small change to the Python grammar that
could help me a lot.

It's simply this: currently, the expression "x[]" is a syntax error. I
suggest that it become valid syntax, equivalent to "x[()]",
just as "x[a, b]" is equivalent to "x[(a, b)]" right now.

I discussed this in python-list, and Fredrik Lundh suggested that I
quickly write a pre-PEP if I want this to go into 2.5. Since I want
this, I wrote a pre-PEP.

It's available in the wiki, at
http://wiki.python.org/moin/EmptySubscriptListPEP and I also copied it
to this message.

I know that we are really close to 2.5b1, but I thought that perhaps
there was still a chance for this suggestion to get in, since:
 * It's a simple change and there's almost nothing to be decided
except whether to accept it or not.
 * It has a simple implementation (It was fairly easy for me to
implement it, and I know almost nothing about the AST).
 * It causes no backwards compatibility issues.

Ok, here's the pre-PEP. Please say what you think about it.

Have a good day,
Noam

PEP: XXX
Title: Allow Empty Subscript List Without Parentheses
Version: $Revision$
Last-Modified: $Date$
Author: Noam Raphael <spam.noam at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 09-Jun-2006
Python-Version: 2.5?
Post-History: 30-Aug-2002

Abstract
========

This PEP proposes allowing an empty subscript list, for example
``x[]``, which is currently a syntax error. In that case, an empty
tuple would be passed as the argument to the ``__getitem__`` and
``__setitem__`` methods. This is consistent with the current behaviour
of passing a tuple of n elements to those methods when a subscript
list of length n is used, provided it includes a comma.

Specification
=============

The Python grammar specifies that the square brackets trailing an
expression must contain a list of "subscripts", separated by commas.
If the list consists of a single subscript without a trailing comma,
a single object (an ellipsis, a slice or any other object) is passed
to the resulting ``__getitem__`` or ``__setitem__`` call. If the list
consists of several subscripts, or of a single subscript with a
trailing comma, a tuple is passed to the resulting ``__getitem__`` or
``__setitem__`` call, with one item for each subscript.
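
The rules above can be observed with a small helper. This is only an
illustrative sketch: ``Recorder`` is a hypothetical class, not part of
the proposal, whose ``__getitem__`` simply returns the key it receives.

```python
class Recorder:
    """Hypothetical helper: returns whatever key it is subscripted with."""

    def __getitem__(self, key):
        return key


x = Recorder()

# A single subscript without a trailing comma is passed as a single object.
assert x[1] == 1
assert x[1:2] == slice(1, 2, None)
assert x[...] is Ellipsis

# A trailing comma, or several subscripts, produce a tuple.
assert x[1,] == (1,)
assert x[1, 2] == (1, 2)
assert x[1:2, ...] == (slice(1, 2, None), Ellipsis)
```

With the same helper, ``x[()]`` returns ``()``, which is the value this
PEP would have ``x[]`` pass as well.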

Here is the formal definition of the grammar:

::

    trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' NAME
    subscriptlist: subscript (',' subscript)* [',']
    subscript: '.' '.' '.' | test | [test] ':' [test] [sliceop]
    sliceop: ':' [test]

This PEP proposes allowing an empty subscript list, with nothing
inside the square brackets. It would result in passing an empty tuple
to the resulting ``__getitem__`` or ``__setitem__`` call.

The change in the grammar is to make "subscriptlist" in the first
quoted line optional:

::

    trailer: '(' [arglist] ')' | '[' [subscriptlist] ']' | '.' NAME

Motivation
==========

This suggestion allows you to refer to zero-dimensional arrays
elegantly. In NumPy, arrays can have different numbers of dimensions.
To refer to a value in a two-dimensional array, you write
``a[i, j]``. To refer to a value in a one-dimensional array,
you write ``a[i]``. You can also have a zero-dimensional array, which
holds a single value (a scalar). To refer to its value, you currently
need to write ``a[()]``, which is unexpected: a user may not even
know that writing ``a[i, j]`` constructs a tuple, and so would not
guess the ``a[()]`` syntax. If the suggestion is accepted, the
user will be able to write ``a[]`` to refer to the value, as
expected. It will even work without changing the NumPy package at all!

In the normal use of NumPy, you usually don't encounter
zero-dimensional arrays. However, the author of this PEP is designing
another library for managing multi-dimensional arrays of data. Its
purpose is similar to that of a spreadsheet - to analyze data and
preserve the relations between a source of a calculation and its
destination. In such an environment you may have many
multi-dimensional arrays - for example, the sales of several products
over several time periods. But you may also have several
zero-dimensional arrays, that is, single values - for example, the
income tax rate. Access to the zero-dimensional arrays should be
consistent with access to the multi-dimensional arrays. Just using
the name of the zero-dimensional array to obtain its value isn't
going to work - the array and the value it contains have to be
distinguished.
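
To make the distinction concrete, here is a minimal sketch of a
zero-dimensional container. ``Scalar`` and ``tax_rate`` are
hypothetical names chosen for illustration; this is not the library
mentioned above, nor NumPy's implementation.

```python
class Scalar:
    """Hypothetical zero-dimensional array holding a single value."""

    def __init__(self, value):
        self._value = value

    def __getitem__(self, key):
        # A zero-dimensional array accepts only the empty index tuple.
        if key != ():
            raise IndexError("too many indices for a zero-dimensional array")
        return self._value

    def __setitem__(self, key, value):
        if key != ():
            raise IndexError("too many indices for a zero-dimensional array")
        self._value = value


tax_rate = Scalar(0.25)
tax_rate[()] = 0.3      # today's spelling; the PEP would allow tax_rate[] = 0.3
current = tax_rate[()]  # the array and the value it holds stay distinct
```

Here ``tax_rate`` is the array object, while ``tax_rate[()]`` is the
value it contains, mirroring ``a[i, j]`` for a two-dimensional array.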

Rationale
=========

Passing an empty tuple to the ``__getitem__`` or ``__setitem__`` call
was chosen because it is consistent with passing a tuple of n elements
when a subscript list of n elements is used. Also, it will make NumPy
and similar packages work as expected for zero-dimensional arrays
without any changes.

Another hint for consistency: Currently, these equivalences hold:

::

    x[i, j, k]  <-->  x[(i, j, k)]
    x[i, j]     <-->  x[(i, j)]
    x[i, ]      <-->  x[(i, )]
    x[i]        <-->  x[(i)]

If this PEP is accepted, another equivalence will hold:

::

    x[]         <-->  x[()]

Backwards Compatibility
=======================

This change is fully backwards compatible, since it only assigns a
meaning to a previously illegal syntax.
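
The claim can be checked on an interpreter that does not include the
proposed grammar change. The sketch below uses the built-in
``compile()``; ``is_valid_expression`` is a hypothetical helper name
introduced only for this example.

```python
def is_valid_expression(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# The explicit empty tuple is accepted today.
explicit_ok = is_valid_expression("x[()]")

# Without the proposed change the empty form is rejected by the parser,
# so no existing, working program can already contain it.
empty_ok = is_valid_expression("x[]")
```

On an unpatched interpreter ``explicit_ok`` is true and ``empty_ok`` is
false; with the patch applied, both would be expected to be true.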

Reference Implementation
========================

Available as SF Patch no. 1503556
(and also at http://python.pastebin.com/768317 ).

It passes the Python test suite, but currently doesn't provide
additional tests or documentation.

Copyright
=========

This document has been placed in the public domain.