[Python-ideas] List assignment - extended slicing inconsistency

Thu Feb 22 20:51:09 EST 2018

On Thu, Feb 22, 2018 at 2:18 PM, Alexander Heger <python at 2sn.net> wrote:

> According to what little documentation I could find, providing a stride on
> the assignment target for a list is supposed to trigger 'advanced slicing',
> causing element-wise replacement and hence requiring that the source
> iterable has the appropriate number of elements.
>
> >>> a = [0,1,2,3]
> >>> a[::2] = [4,5]
> >>> a
> [4, 1, 5, 3]
> >>> a[::2] = [4,5,6]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: attempt to assign sequence of size 3 to extended slice of size
> 2
>
> This is in contrast to regular slicing (*without* a stride), which allows
> replacing a *range* with another sequence of arbitrary length.
>
> >>> a = [0,1,2,3]
> >>> a[:3] = [4]
> >>> a
> [4, 3]
>
> Issue
> =====
> When, however, a stride of `1` is specified, advanced slicing is not
> triggered.
>
> >>> a = [0,1,2,3]
> >>> a[:3:1] = [4]
> >>> a
> [4, 3]
>
> If advanced slicing had been triggered, there should have been a
> ValueError instead.
>
> Expected behaviour:
>
> >>> a = [0,1,2,3]
> >>> a[:3:1] = [4]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: attempt to assign sequence of size 1 to extended slice of size
> 3
>
> I think that is an inconsistency in the language that should be fixed.
>
> Why do we need this?
> ====================
> One may want this as an extra check so that the list does not change
> size.  Depending on the implementation, it may come with performance
> benefits as well.
>
> One could argue, though, that you still get the same result if you do
> everything correctly:
>
> >>> a = [0,1,2,3]
> >>> a[:3:1] = [4,5,6]
> >>> a
> [4, 5, 6, 3]
>
> But I disagree that there should be no error when it is wrong.
> *Strides that are not None should always trigger advanced slicing.*
>

This makes sense.

(I wonder if the discrepancy is due to some internal interface that loses
the distinction between None and 1 before the decision is made whether to
use advanced slicing or not. But that's a possible explanation, not an
excuse.)
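
A minimal sketch of the behaviour under discussion, using only builtin
lists. The helper name try_assign is hypothetical and exists just to make
the three cases comparable; it is not part of any library.

```python
def try_assign(seq, key, values):
    """Assign values to a copy of seq at key; return the new list,
    or the string "ValueError" if the assignment is rejected."""
    target = list(seq)  # copy, so the input is left untouched
    try:
        target[key] = values
    except ValueError:
        return "ValueError"
    return target


base = [0, 1, 2, 3]

# a[:3] = [4]    no stride: the range is replaced and the list shrinks
print(try_assign(base, slice(None, 3, None), [4]))

# a[:3:1] = [4]  stride of 1: treated like the case above
print(try_assign(base, slice(None, 3, 1), [4]))

# a[::2] = [4]   stride of 2: element-wise, so the sizes must match
print(try_assign(base, slice(None, None, 2), [4]))
```

The first two calls give [4, 3]; only the third is rejected.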

> Other Data Types
> ================
> This change should also be applied to bytearray, etc., though see below.
>

Sure.

> Concerns
> ========
> It may break some code that specifies a stride of 1 and expects regular
> slicing to occur.  I assume these cases are exceptionally rare, and the
> error message should be clear enough to allow fixes.
>

Yeah, backwards compatibility sometimes prevents fixing a design bug. I
don't know if that's the case here, we'll need reports from real-world code.
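
One way to try the stricter rule against existing code, without touching
the interpreter, might be a list subclass. This is only a sketch of the
proposed semantics: the class name StrictStrideList is hypothetical, and
the error text merely imitates the one the builtin list raises.

```python
class StrictStrideList(list):
    """Hypothetical list subclass: any explicit step, including 1,
    forces element-wise assignment with a size check."""

    def __setitem__(self, key, value):
        if isinstance(key, slice) and key.step is not None:
            value = list(value)  # materialize so the length is known
            target_len = len(range(*key.indices(len(self))))
            if len(value) != target_len:
                raise ValueError(
                    f"attempt to assign sequence of size {len(value)} "
                    f"to extended slice of size {target_len}"
                )
        super().__setitem__(key, value)


a = StrictStrideList([0, 1, 2, 3])
a[:3:1] = [4, 5, 6]      # sizes match, accepted
print(a)

try:
    a[:3:1] = [4]        # explicit stride with a size mismatch
except ValueError as exc:
    print("rejected:", exc)

a[:3] = [4]              # no stride: resizing still works
print(a)
```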

> If the implementation relies on `slice.indices(len(seq))[2] == 1` to
> decide whether to use advanced slicing, that would require some
> refactoring.  If it only checks `slice.step in (1, None)`, then this could
> easily be replaced by a check against None.
>
> Will there be issues with syntax consistency with other data types, in
> particular outside the core library?
>

Things outside the stdlib are responsible for their own behavior. Usually
they can move faster and with less worry about breaking backward
compatibility.
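
The point about slice.indices() can be checked from Python. The sketch
below shows that the slice object itself keeps the difference between an
omitted stride and an explicit 1 (the attribute is named step), while
indices() normalizes both to the same triple. The function name
stride_was_given is hypothetical.

```python
implicit = slice(None, 3)      # what a[:3] passes to __setitem__
explicit = slice(None, 3, 1)   # what a[:3:1] passes to __setitem__

# The raw slice objects still differ in their step attribute.
print(implicit.step, explicit.step)

# After normalization against a length of 4 they are identical.
print(implicit.indices(4), explicit.indices(4))


def stride_was_given(s):
    """Return True if the slice was written with an explicit stride."""
    return s.step is not None


print(stride_was_given(implicit), stride_was_given(explicit))
```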

>
> - I always found the dynamic behaviour of lists with respect to
> non-advanced slicing somewhat peculiar in the first place, though,
> undeniably, it can be immensely useful.
>

If you're talking about the ability to resize a list by assigning to a
slice, that's as intended. It predates advanced slicing by a decade or more.
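
For reference, a short illustration of resizing through plain slice
assignment on a builtin list: deleting, inserting, and replacing a range
with a sequence of a different length.

```python
a = [0, 1, 2, 3]

a[1:3] = []           # delete a range: the list shrinks
print(a)

a[1:1] = [7, 8, 9]    # assign to an empty range: pure insertion
print(a)

a[:2] = [5]           # replace two items with one
print(a)
```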

> - Most external data types with fixed memory such as numpy do not have
> this dynamic flexibility, and the behavior of regular slicing on assignment
> is the same as advanced slicing.  The proposed change would increase
> consistency with these other data types.
>

How? Resizing through slice assignment will stay for builtin types -- if
numpy doesn't support that, so be it.

> More surprises
> ==============
> >>> import array
> >>> a = array.array('b', [1,2,3,4,5])
> >>> a[1::2] = a[3:3]
> >>> a
> array('b', [1, 3, 5])
>
> whereas
>
> >>> a = [1,2,3,4,5]
> >>> a[1::2] = a[3:3]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: attempt to assign sequence of size 0 to extended slice of size
> 2
>

OK, so array doesn't use the same rules. That should be fixed too probably
(assuming whatever is valid today remains valid).

> >>> a = bytearray(b'12345')
> >>> a[1::2] = a[3:3]
> >>> a
> bytearray(b'135')
>

Bytearray should also follow the same rules.
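
A side-by-side sketch of the three standard types, assigning an empty
sequence to a strided slice. The helper name assign_empty_to_stride and
the array type code 'b' are choices made for this illustration; the
results are what CPython releases of this era produce and are not
guaranteed by the language reference.

```python
import array


def assign_empty_to_stride(seq):
    """Run seq[1::2] = seq[3:3]; return the mutated sequence,
    or the string "ValueError" if the assignment is rejected."""
    try:
        seq[1::2] = seq[3:3]
    except ValueError:
        return "ValueError"
    return seq


# list: rejected, because the sizes differ
print(assign_empty_to_stride([1, 2, 3, 4, 5]))

# array.array: accepted, the strided elements are deleted
print(assign_empty_to_stride(array.array('b', [1, 2, 3, 4, 5])))

# bytearray: accepted, the strided elements are deleted
print(assign_empty_to_stride(bytearray(b'12345')))
```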

> but numpy
>
> >>> import numpy as np
> >>> a = np.array([1,2,3,4,5])
> >>> a[1::2] = a[3:3]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: could not broadcast input array from shape (0) into shape (2)
>
> and
>
> >>> import numpy as np
> >>> a[1:2] = a[3:3]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: could not broadcast input array from shape (0) into shape (1)
>
> The latter two as expected.  memoryview behaves the same.
>

Let's leave numpy out of this discussion. And memoryview is a special case
because it can't change size (it provides a view into an inflexible
structure).
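
A small sketch of why memoryview is a special case: it writes through to
the underlying buffer but cannot change its length, so a size mismatch is
rejected even for a plain slice without a stride. The variable names are
arbitrary.

```python
buf = bytearray(b"12345")
view = memoryview(buf)

view[1:3] = b"ab"     # same size: written through to buf
print(buf)

try:
    view[1:3] = b"x"  # different size: the view cannot resize buf
    raised = False
except ValueError as exc:
    raised = True
    print("rejected:", exc)

view.release()        # release the export so buf can be resized again
```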

> Issue 2
> =======
> Whereas NumPy is known to behave differently as a data type with fixed
> memory layout, and is not part of the standard library anyway, the
> difference in behaviour between lists and arrays I find disconcerting.
> This should be resolved to a consistent behaviour.
>
> Proposal 2
> ==========
> Arrays and bytearrays should adopt the same advanced slicing
> behaviour I suggest for lists.
>

Sure.

> Concerns 2
> ==========
> This has the potential for a lot more side effects in existing code, but
> as before, in most cases an error message should be triggered.
>

Side effects? No code that currently doesn't raise will break, right?

> Summary
> =======
> I do not find it acceptable as good language design that there is a
> large range of behaviours on slicing in assignment targets for the different
> native (and standard library) data types of seemingly similar kind, and that
> users have to figure out for each data type by testing - or at the very
> least remember, if documented - how it behaves on slicing in assignment
> targets.  There should be a consistent behaviour at the very least, ideally
> even one with a clear user interface as suggested for lists.
>

Fortunately, what *you* find acceptable or good language design is not all
that important. (You can avoid making any mistakes in your own language.
:-) You may by now realize that 100% consistent behavior is hard to obtain.
However we'll gladly consider your feedback.

-- 
--Guido van Rossum (python.org/~guido)