[Python-3000] Removing __getslice__ et al.

Guido van Rossum guido at python.org
Sun Apr 23 10:09:58 CEST 2006


On 4/23/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Thomas Wouters wrote:
> >
> > A long-standing part of Py3K is removing 'old style slices' (meaning the
> > __get/set/delslice__ methods and the sq_slice/sq_ass_slice sequence-struct
> > functions, but not slice objects.) I started removing them, only to find
> > out that it's damned inconvenient to make all slicing go through
> > tp_as_mapping->mp_subscript, for two reasons:
> >
> > - Many classes don't behave as mappings at all, so the mp_subscript function
> > (and the whole tp_as_mapping struct) would be added just for the benefit of
> > accepting slices, which is really a sequence thing, not a mapping thing.

Well, we already did that for lists, strings and other built-in
sequences that support extended slicing (remember "hello"[::-1] ?:-).
So I think it's no fundamental problem, just an inconvenience.

> These types already provide tp_as_sequence->sq_item. Why would they want to
> provide tp_as_mapping->mp_subscript as well?

Ah, you've never tried to implement extended slicing. A[x:y:z]
generates bytecode that calls A[slice(x,y,z)], and since a slice object
is not an int, the GETITEM opcode tries to pass that to the
mapping/dict getitem implementation. Same for A[x:y, z:u] -- it
generates code to call A[(slice(x,y), slice(z,u))], i.e. the argument
is a tuple of slices. Etc.; the tuple can also contain an Ellipsis
object if you wrote A[x, ..., z]. (Yes, that's valid syntax!)
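
To make the translation above concrete, here is a minimal sketch. The
Recorder class is a hypothetical helper (not part of any API) whose
__getitem__ simply returns the key it was handed:

```python
class Recorder:
    """Hypothetical helper: hand back whatever key A[...] produces."""

    def __getitem__(self, key):
        return key


A = Recorder()

simple = A[1:2:3]             # one slice object
multi = A[1:2, 3:4]           # a tuple of two slice objects
with_ellipsis = A[1, ..., 3]  # a tuple containing Ellipsis
```

In each case the object only ever sees a single key argument; nothing
tells it whether the caller meant sequence slicing or mapping lookup.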

> > - There's actually a PyMapping_Check that relies on sq_slice: a type is a
> > mapping type when it has tp_as_mapping->mp_subscript but not
> > tp_as_sequence->sq_slice. I'm not sure how to express that if there is no
> > special method for slicing.
>
> How about changing it to check tp_as_sequence->sq_item instead? (that will
> still work for dictionaries - they only define tp_as_sequence because there
> isn't a slot to hook "__contains__" in the tp_as_mapping structure)

The problem with that is that if a Python class defines __getitem__
the C code doesn't know whether it intends to implement the mapping or
the sequence protocol, so it fills in both slots.
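
A small sketch of why the intent cannot be recovered from the class
itself. Lookup is a hypothetical class; the same __getitem__ serves as
a mapping lookup or as sequence indexing depending only on the data it
wraps:

```python
class Lookup:
    """Hypothetical class with a single __getitem__ method."""

    def __init__(self, data):
        self._data = data

    def __getitem__(self, key):
        return self._data[key]


by_name = Lookup({"a": 1})    # behaves like a mapping
by_pos = Lookup(["x", "y"])   # behaves like a sequence

# Iteration falls back to calling __getitem__ with 0, 1, 2, ...
items = list(by_pos)

# The same fallback is attempted on the "mapping", which fails on key 0.
try:
    list(by_name)
except KeyError as exc:
    failed_key = exc.args[0]
```

The interpreter treats both instances identically; only the wrapped
data decides which protocol actually works.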

I believe that PyMapping_Check should be gotten rid of, and also any
other check, whether in C or in Python, that attempts to distinguish
between mappings and sequences. These checks are all vestiges of
Python development before user-defined classes were added to the
language in late 1990...

Ideally (but this is a big refactoring of the C API!) the C API would
change so that there are no redundant slots -- no sequence
concat/repeat vs. numeric add/mul, and no sequence vs. numeric
getitem/setitem/len/anything else.

> On a slightly different note, we should fix the definition of sequence vs
> mapping so that PySequence_Check and PyMapping_Check work for arbitrary Python
> classes.

They can't. See above. They should be gotten rid of. (Or they should
be used simply as checks whether the AsSequence or AsMapping struct
pointer is non-NULL - but that's not very useful and the name would be
misleading.)

> For example, a flag "__sequence__ = True" that type() checked when
> constructing the class. If the flag was missing, both
> tp_as_mapping->mp_subscript and tp_as_sequence->sq_item would be filled in
> with __getitem__ (as they are now). If the flag was present and false, only
> tp_as_mapping would be filled in. If the flag was present and true, only
> tp_as_sequence would be filled in.

I think this is the wrong approach. Python overloads A[x] so that it
could mean either sequence indexing or mapping lookup. That's a
fundamental syntactic ambiguity. We should not attempt to write code
that depends on whether the object implements one or the other.

> > (While on the subject, how much of the C API dealing with slices should be
> > adjusted? There's PySequence_GetSlice() that takes Py_ssize_t's, which is
> > (IMHO) pretty convenient for C code, but it'd have to create a slice object
> > behind the scenes. And maybe creating a slice object manually isn't that
> > much effort, after all.)
>
> I think we should leave abstract.h alone

I think abstract.[ch] and object.[ch] ought to be merged -- there is
no rhyme or reason to whether any particular function is implemented
in one or the other.

Also there's a lot of redundancy -- there are PySequence_XXX,
PyMapping_XXX, PyNumber_XXX and PyObject_XXX functions, often with
overlapping but non-identical APIs. This is a confusing mess that
needs a thorough cleanup.

(What about 3rd party extensions? Well too bad. We've got to break
this stuff some day.)

> - Getting rid of __getslice__ at the
> Python level is a good thing, but simply fixing these API functions to "do the
> right thing" is easier than breaking all the code that currently uses them.

Perhaps we could drop the __getslice__ Python API but add new and
different support for slicing in the C API. One of the major problems
with the current slice API (C and Python) is that it is formulated in
terms of ints, *and* that those ints have already been corrected for
negative indices. I.e. if you call A[-1:-2] and len(A) == 5, the
getslice operation (C or Python) is called with arguments (4, 3).
That's a major pain. Extended slicing doesn't do this.
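
The difference can be sketched at the Python level. Here the standard
slice.indices() method stands in for the correction the old getslice
API applied up front; Recorder is again a hypothetical helper that
returns its key:

```python
class Recorder:
    """Hypothetical helper: hand back whatever key A[...] produces."""

    def __getitem__(self, key):
        return key


# With slice objects, the negative values arrive untouched.
raw = Recorder()[-1:-2]

# Correcting for a length of 5 reproduces the (4, 3) described above.
start, stop, step = raw.indices(5)
corrected = (start, stop)
```

With the slice-object form the callee decides when (and whether) to
apply the correction, instead of receiving already-adjusted ints.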

> Should we add a convenience function PySlice_FromIndices to the slice API that
> accepts Py_ssize_t's rather than PyObjects?

Makes sense, but only if there are more than two places in the current
code base where it would be used.

So, as an overall response to Thomas, if you can get rid of
PyMapping_Check, I think you can steam ahead with this one way or
another. A special sq_newslice slot that gets called when the getitem
arg is a slice is probably not the way to go. Rather, convenience
routines to extract the indices as Py_ssize_t values would make the
most sense. These can safely be clipped to +/- maxint when the real
value is too large, I believe (I think that convenience routine may
already exist after Travis Oliphant's patches).
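
A rough Python-level sketch of the clipping idea, not the C routine
itself. clip_to_ssize is a hypothetical helper, and sys.maxsize is
used here as the stand-in for the largest Py_ssize_t value:

```python
import sys


def clip_to_ssize(n):
    """Hypothetical helper: clamp an int to the Py_ssize_t range."""
    return max(-sys.maxsize - 1, min(sys.maxsize, n))


huge = 10 ** 30

# Far too large for Py_ssize_t, so it is clipped rather than rejected.
clipped = clip_to_ssize(huge)

# slice.indices() similarly clamps out-of-range bounds to the length.
bounds = slice(-huge, huge).indices(5)
```

Clipping is harmless for slicing because any bound beyond the object's
length selects the same elements as the length itself.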

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

