slice.literal notation

I was told in the thread that it might be a good idea to bring this up on python discussions. Here is a link to the proposed patch and some existing comments: http://bugs.python.org/issue24379

I often find that when working with pandas and numpy I want to store slice objects in variables to pass around and re-use; however, the syntax for constructing a slice literal outside of an indexer is very different from the syntax used inside of a subscript. This patch proposes the following change:

    slice.literal

This would be a singleton instance of a class that looks like:

    class sliceliteral(object):
        def __getitem__(self, key):
            return key

The basic idea is to provide an alternative constructor to 'slice' that uses the subscript syntax. This allows people to write more understandable code. Consider the following examples:

    reverse = slice(None, None, -1)
    reverse = slice.literal[::-1]

    all_rows_first_col = slice(None), 0
    all_rows_first_col = slice.literal[:, 0]

    first_row_all_cols_but_last = 0, slice(None, -1)
    first_row_all_cols_but_last = slice.literal[0, :-1]

Again, this is not intended to make the code shorter; instead, it is designed to make it clearer what the slice object you are constructing looks like. Another feature of the new `literal` object is that it is not limited to just the creation of `slice` instances; instead, it is designed to mix slices and other types together. For example:

These examples show that the subscript notation is sometimes much clearer than the non-subscript notation. I believe that while this is trivial, it is very convenient to have on the slice type itself so that it is quickly available. This also prevents everyone from rolling their own version that is accessible in different ways (think Py_RETURN_NONE). Another reason I chose this approach is that it requires no change to the syntax.

There is a second change proposed here, and that is to 'slice.__repr__'. This change makes the repr of a slice object match the new literal syntax to make it easier to read. This change actually affects old behaviour, so I am going to upload it as a separate patch. I understand that the change to repr may be less desirable than the addition of 'slice.literal'.
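A minimal sketch of that mixing, assuming the `sliceliteral` class described above. The module-level name `literal` is a stand-in for illustration only, since `slice.literal` does not exist in any released Python:

```python
class sliceliteral(object):
    """Return whatever key the subscript syntax produces, unchanged."""

    def __getitem__(self, key):
        return key


# Hypothetical stand-in for the proposed `slice.literal` singleton.
literal = sliceliteral()

plain_index = literal[0]        # an int, not a slice
ellipsis = literal[...]         # the Ellipsis singleton
reverse = literal[::-1]         # slice(None, None, -1)
mixed = literal[:, 0]           # a tuple: (slice(None), 0)
two_slices = literal[0:1, 2:3]  # a tuple of two slice objects

print(plain_index, ellipsis, reverse, mixed, two_slices)
```

Only one of these five results is a bare `slice`, which is the point several replies in the thread pick up on.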

On Wed, Jun 10, 2015 at 6:33 PM, Joseph Jevnik <joejev@gmail.com> wrote:
With regard to the first suggestion, this has already been mentioned on the tracker but is important enough to repeat here: this already exists in NumPy as IndexExpression, used via numpy.s_ or numpy.index_exp. For details, see: http://docs.scipy.org/doc/numpy/reference/generated/numpy.s_.html - Tal Einat

On Wed, 10 Jun 2015 12:23:32 -0400 Joseph Jevnik <joejev@gmail.com> wrote:
I am not sure if this makes sense in the ast module only because it does not generate _ast.Slice objects and instead returns the keys.
There's already ast.literal_eval() there, so that was why I thought it could be related. Though literal_eval() *compiles* its input, which slice.literal wouldn't, so the relationship is quite distant... Regards Antoine.

I considered `slice[...]`; however, this would change some existing behaviour. It would mean we need to put a metaclass on slice, and then `type(slice) is type` would no longer be true. Also, with 3.5's typing work, we are overloading the meaning of indexing a type object. Adding slice.literal does not break anything or conflict with any syntax.

On Wed, Jun 10, 2015 at 12:03 PM, <random832@fastmail.us> wrote:
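The metaclass concern can be illustrated with a stand-in class. The names `SubscriptMeta` and `myslice` are hypothetical; the built-in `slice` cannot be subclassed, so this only sketches what subscripting the type itself would have required at the time of the thread:

```python
class SubscriptMeta(type):
    """Hypothetical metaclass that makes the class object itself subscriptable."""

    def __getitem__(cls, key):
        return key


class myslice(metaclass=SubscriptMeta):
    """Stand-in for a `slice` type that supports `slice[...]`."""


# The built-in slice type is an ordinary class today.
builtin_is_plain = type(slice) is type        # True

# Subscripting the stand-in class works ...
reverse = myslice[::-1]                        # slice(None, None, -1)

# ... but only because its type is no longer `type`.
stand_in_is_plain = type(myslice) is type      # False

print(builtin_is_plain, reverse, stand_in_is_plain)
```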

+1 This is an elegant improvement that doesn't affect backward compatibility. Obviously, the difference between the spelling 'sliceliteral[::-1]' and 'slice.literal[::-1]' isn't that big, but having it attached to the slice type itself rather than a user class feels more natural. On Wed, Jun 10, 2015 at 8:33 AM, Joseph Jevnik <joejev@gmail.com> wrote:
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

On Wed, Jun 10, 2015 at 10:05 PM, David Mertz <mertz@gnosis.cx> wrote:
I dislike adding this to the slice class since many use cases don't result in a slice at all. For example:

    [0] -> int
    [...] -> Ellipsis
    [0:1, 2:3] -> 2-tuple of slice object

I like NumPy's name of IndexExpression, perhaps we can stick to that? As for where it would reside, some possibilities are:

* the operator module
* as part of the collections.abc.Sequence abstract base class
* the types module
* builtins

- Tal

On 11 June 2015 at 06:21, Tal Einat <taleinat@gmail.com> wrote:
I'm with Tal here - I like the concept, don't like the spelling because it may return things other than slice objects. While the formal name of the operation denoted by trailing square brackets "[]" is "subscript" (with indexing and slicing being only two of its several use cases), the actual *protocol* involved in implementing that operation is getitem/setitem/delitem, so using the formal name would count as "non-obvious" in my view. Accordingly, I'd suggest putting this in under the name "operator.itemkey" (no underscore because the operator module traditionally omits them).

    zero = operator.itemkey[0]

    ellipsis = operator.itemkey[...]

    reverse = slice(None, None, -1)
    reverse = operator.itemkey[::-1]

    all_rows_first_col = slice(None), 0
    all_rows_first_col = operator.itemkey[:, 0]

    first_row_all_cols_but_last = 0, slice(None, -1)
    first_row_all_cols_but_last = operator.itemkey[0, :-1]

Documentation would say that indexing into this object produces the result of the key transformation step of getitem/setitem/delitem.

Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Jun 10, 2015, at 18:54, Nick Coghlan <ncoghlan@gmail.com> wrote:
That name seems a little odd. Normally by "key", you mean the thing you subscript a mapping with, as opposed to an index, the thing you subscript a sequence with (either specifically an integer, or the broader sense of an integer, a slice, an ellipsis, or a tuple of indices recursively). (Of course you _can_ use this with a mapping key, but then it just returns the same key you passed in, which isn't very useful, except in allowing generic code that doesn't know whether it has a key or an index and wants to pass it on to a mapping or sequence, which obviously isn't the main use here.) "itemindex" avoids the main problem with "itemkey", but it still shares the secondary problem of burying the fact that this is about slices (and tuples of plain indices and slices), not just (or even primarily) plain indices. I agree with you that "subscript" isn't a very good name either. I guess "lookup" is another possibility, and it parallels "LookupError" being the common base class of "IndexError" and "KeyError", but that sounds even less meaningful than "subscript" to me. So, I don't have a good name to offer. One last thing: Would it be worth adding bracket syntax to itemgetter, to make it easier to create slicing functions? (That wouldn't remove the need for this function, or vice versa, but since we're in operator and adding a thing that gets "called" with brackets...)
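As a small illustration of the key-versus-index point, here is a sketch of generic code that passes a subscript through without knowing whether the container is a mapping or a sequence. The helper name `fetch` is hypothetical; the exception hierarchy it relies on is standard:

```python
def fetch(container, subscript, default=None):
    """Look up `subscript` in a mapping or a sequence, whichever it is.

    LookupError is the common base class of IndexError and KeyError,
    so one handler covers both kinds of container.
    """
    try:
        return container[subscript]
    except LookupError:
        return default


from_list = fetch([1, 2, 3], slice(None, None, -1))  # slicing a sequence
from_dict = fetch({"a": 1}, "a")                      # key lookup in a mapping
missing_index = fetch([1, 2, 3], 5)                   # IndexError is caught
missing_key = fetch({"a": 1}, "b")                    # KeyError is caught

print(from_list, from_dict, missing_index, missing_key)
```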

On Thu, Jun 11, 2015 at 11:38 AM, Andrew Barnert <abarnert@yahoo.com> wrote:
I actually think "subscript" is quite good a name. It makes the explicit distinction between subscripts, indexes and slices. As for itemgetter, with X (placeholder for name we choose), you would just do itemgetter(X[::-1]), so I don't see a need to change itemgetter. - Tal
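The `itemgetter(X[::-1])` idea can be sketched as follows. `X` here is a hypothetical placeholder object, as in the message above; `operator.itemgetter` is the existing standard-library function:

```python
from operator import itemgetter


class _Subscript:
    """Placeholder for whatever name the thread settles on."""

    def __getitem__(self, key):
        return key


X = _Subscript()

# itemgetter accepts any subscript, including a slice object.
reverse = itemgetter(X[::-1])
first_two = itemgetter(X[:2])

reversed_list = reverse([1, 2, 3])   # [3, 2, 1]
reversed_text = reverse("abc")       # "cba"
head = first_two("abcdef")           # "ab"

print(reversed_list, reversed_text, head)
```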

On 11 June 2015 at 22:57, Tal Einat <taleinat@gmail.com> wrote:
I actually think "subscript" is quite good a name. It makes the explicit distinction between subscripts, indexes and slices.
Yeah, I've warmed to it myself:

    zero = operator.subscript[0]

    ellipsis = operator.subscript[...]

    reverse = slice(None, None, -1)
    reverse = operator.subscript[::-1]

    all_rows_first_col = slice(None), 0
    all_rows_first_col = operator.subscript[:, 0]

    first_row_all_cols_but_last = 0, slice(None, -1)
    first_row_all_cols_but_last = operator.subscript[0, :-1]

I realised the essential problem with using "item" in the name is that the "item" in the method names refers to the *result*, not to the input. Since the unifying term for the different kinds of input is indeed "subscript" (covering indices, slices, multi-dimensional slices, key lookups, content addressable data structures, etc), it makes sense to just use it rather than inventing something new.

Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 06/10/2015 08:33 AM, Joseph Jevnik wrote:
The basic idea is to provide an alternative constructor to 'slice' that uses the subscript syntax. This allows people to write more understandable code.
+1
-1 Having the old repr makes it possible to see what the equivalent slice() spelling is. -- ~Ethan~
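For reference, the existing repr spells out the constructor call, so it can be read back as the equivalent `slice()` expression. A short check of that behaviour:

```python
s = slice(None, None, -1)

# The current repr is the constructor spelling itself.
text = repr(s)
print(text)  # slice(None, None, -1)

# Evaluating that text rebuilds an equal slice object.
rebuilt = eval(text)
print(rebuilt == s)  # True
```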

Ethan, I am also not 100% on the new repr; I just wanted to propose this change. In the issue, I have separated that change into its own patch to make it easier to apply the slice.literal without the repr update.

On Wed, Jun 10, 2015 at 3:16 PM, Ethan Furman <ethan@stoneleaf.us> wrote:

On 6/10/2015 11:33 AM, Joseph Jevnik wrote:
Alternate constructors are implemented as class methods.

    class slice:
        ...
        @classmethod
        def literal(cls, key):
            if isinstance(key, cls):
                return key
            else:
                raise ValueError('slice literal must be slice')

They are typically named fromxyz or from_xyz. Tal Einat pointed out that not all keys are slices
I think the first two cases should be value errors. The third might be debated, but if allowed, this would not be a slice constructor. -- Terry Jan Reedy
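A sketch of the stricter behaviour, combining the subscript syntax with the validation from the class method above. The class name `StrictSliceLiteral` is hypothetical, and rejecting tuples of slices is an assumption about the debated third case, not something the thread settles:

```python
class StrictSliceLiteral:
    """Accept subscript syntax but return only genuine slice objects."""

    def __getitem__(self, key):
        if isinstance(key, slice):
            return key
        # Ints, Ellipsis and tuples are rejected in this sketch.
        raise ValueError('slice literal must be a slice')


strict = StrictSliceLiteral()

reverse = strict[::-1]  # slice(None, None, -1)

rejected = []
for label, make in [
    ("[0]", lambda: strict[0]),
    ("[...]", lambda: strict[...]),
    ("[0:1, 2:3]", lambda: strict[0:1, 2:3]),
]:
    try:
        make()
    except ValueError:
        rejected.append(label)

print(reverse, rejected)
```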

participants (10)
- Andrew Barnert
- Antoine Pitrou
- David Mertz
- Ethan Furman
- Greg Ewing
- Joseph Jevnik
- Nick Coghlan
- random832@fastmail.us
- Tal Einat
- Terry Reedy