
Viorel Preoteasa asked on python-help today about supporting slices with non-integer indexes, e.g.:

    foo['a':'abc'] = some_sequence

Currently the Python interpreter (in the slice_index function of ceval.c) enforces integer slice indices. I won't pretend to provide motivation for non-integral slice indices. Instead, I've CC'd Viorel and will let him chime in if he feels the need. It does seem to me that if the __setslice__ programmer is willing to do the type checking and provide the semantics of "from X to Y" for arbitrary X and Y, then the interpreter should let non-integer indices pass.

Skip Montanaro | http://www.mojam.com/
skip@mojam.com | http://www.musi-cal.com/
"Languages that change by catering to the tastes of non-users tend not to do so well." - Doug Landauer

On Fri, 11 Feb 2000, Skip Montanaro wrote:
Currently, a person can do the following:

    foo[slice('a','abc')] = some_sequence

In other words, you have to first wrap the thing into a slice object. Then, it calls the __setitem__ method with the slice object, which can extract the values using the .start, .stop, and .step attributes.

Now... altering the syntax and semantic restrictions (to make it easier) is surely possible, but yah: let's hear some motivations from Viorel.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
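A minimal sketch of the mechanism Greg describes, written for current Python 3 rather than the 1.5.2 interpreter under discussion. The `Recorder` class is a hypothetical helper that only remembers what it was handed; it is not part of any library.

```python
class Recorder:
    """Hypothetical helper: remembers the last index/value pair it received."""

    def __setitem__(self, index, value):
        self.last = (index, value)


foo = Recorder()
foo[slice('a', 'abc')] = [1, 2, 3]      # the explicit wrapping described above
index, value = foo.last

print(type(index).__name__)             # the index arrives as a slice object
print(index.start, index.stop, index.step)
```

In Python 3, where __setslice__ no longer exists, writing `foo['a':'abc'] = [1, 2, 3]` reaches __setitem__ with the same kind of slice object, so the class itself decides what non-integer bounds mean.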

Greg> Currently, a person can do the following:
Greg>     foo[slice('a','abc')] = some_sequence

Well, I'll be damned! To wit:

    import types, string, UserDict

    class SliceableDict(UserDict.UserDict):
        def __setitem__(self, index, val):
            if type(index) == types.SliceType:
                # val must be a sequence.  if it's too short, the last
                # value is replicated.  if it's too long, the extra values
                # are ignored.
                # keys between index.start and index.stop are assigned elements
                # of val - index.step is ignored
                start = index.start
                stop = index.stop
                keys = self.data.keys()
                keys.sort()
                j = 0
                vl = len(val)
                for k in keys:
                    if index.start <= k < index.stop:
                        self.data[k] = val[j]
                        j = min(j+1, vl-1)
            else:
                self.data[index] = val

        def init_range(self, keys, val=None):
            for k in keys:
                self.data[k] = val

    d = SliceableDict()
    d.init_range(string.lowercase[0:13], 7)
    d[slice('a', 'g')] = [12]
    print d.data

Now, about that motivation...

Skip Montanaro
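For readers who want to run the idea today, here is a hedged port of the class above to current Python 3. It is a sketch, not Skip's code: `collections.UserDict`, `isinstance(index, slice)`, `sorted()` and `string.ascii_lowercase` are substituted for their 1.5.2-era counterparts, and the colon syntax is used because Python 3 passes the slice object straight to __setitem__.

```python
import string
from collections import UserDict


class SliceableDict(UserDict):
    def __setitem__(self, index, val):
        if isinstance(index, slice):
            # Keys k with start <= k < stop receive successive elements of
            # val.  If val is too short its last element is repeated; extra
            # elements are ignored, and index.step is ignored.
            j = 0
            last = len(val) - 1
            for k in sorted(self.data):
                if index.start <= k < index.stop:
                    self.data[k] = val[j]
                    j = min(j + 1, last)
        else:
            self.data[index] = val

    def init_range(self, keys, val=None):
        for k in keys:
            self.data[k] = val


d = SliceableDict()
d.init_range(string.ascii_lowercase[0:13], 7)   # keys 'a' .. 'm', all set to 7
d['a':'g'] = [12]                                # keys 'a' .. 'f' become 12
print(d.data)
```

As in the original, an empty right-hand sequence raises IndexError as soon as a key falls inside the range; the sketch keeps that behaviour rather than inventing a policy for it.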

Hi Skip, Skip Montanaro wrote:
[nice implementation cut]
Now, about that motivation...
usually I'm not the one to argue against a new feature, but I think this extension to slicing is too much and not consistent enough.

When we write

    x[low:high] = some_sequence

then we imply that there is a sequence on the left hand that can be indexed by the implicit ordered set of integers in the range [low, high), and we allow this assignment to change the sequence's length arbitrarily.

Speaking of mapping objects, you specify a set of values by an expression of their keys, but you have no way to invent new keys; only deletion applies. It appears a bit twisted to do this to a mapping.

A different approach would be to require a mapping object on the right hand. The assignment would have to

1) check that all keys on the right are inside the slice's range
2) delete the entries in that range from the left
3) insert the new keys/values.

Indexing a mapping by a slice should return a mapping again.

Well, I don't like any of these so much. They make dicts look like something ordered, and that rings a bell about too much cheating.

Or we could be consistent and provide a sequence protocol for mappings as well, with all the sort-on-demand consequences that entails. But this is not possible since integers can be keys, and it would be undecidable whether we want sequence indexing or mapping indexing. This would only make sense for typed dictionaries, which allow string keys only, for instance.

I'd say better drop it - chris

--
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Düppelstr. 31                :     *Starship* http://starship.python.net
12163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home
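The mapping-on-the-right alternative Christian outlines (and then rejects) can be sketched as follows in current Python 3. The class name `RangeDict` and the helper `_in_range` are hypothetical, the keys are assumed to be mutually comparable, and raising KeyError for an out-of-range key is an assumption, since the message does not say how a failed check should be reported.

```python
from collections import UserDict


class RangeDict(UserDict):
    """Hypothetical sketch; keys are assumed to be mutually comparable."""

    def _in_range(self, key, index):
        return index.start <= key < index.stop

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Indexing by a slice returns a mapping again.
            return {k: v for k, v in self.data.items()
                    if self._in_range(k, index)}
        return self.data[index]

    def __setitem__(self, index, value):
        if not isinstance(index, slice):
            self.data[index] = value
            return
        # 1) every key on the right must lie inside the slice's range
        for k in value:
            if not self._in_range(k, index):
                raise KeyError(k)
        # 2) delete the entries in that range from the left
        for k in [k for k in self.data if self._in_range(k, index)]:
            del self.data[k]
        # 3) insert the new keys/values
        self.data.update(value)


d = RangeDict({'andrew': 1, 'anna': 2, 'john': 3})
d['a':'b'] = {'alice': 9}       # replaces every key in ['a', 'b')
print(d['a':'b'])
print(sorted(d.data))
```

The sketch also shows the objection in the message: nothing in a plain dict is ordered, so every slice operation has to scan all keys and compare them.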

Dear All,

Thank you very much for your answers. I will try to answer all the problems that arise from my question on non-integer slices. First I will comment on each idea that was raised, and then I will give my example.

1.

peter> Hmmmm.... I was very astonished to read this, since it broke a frozen
peter> model in my ---wannabe a python guru--- brain.
peter> >>> class HyperSeq:
peter> ...     def __setslice__(self, i, j, sequence):
peter> ...         print "Just kidding. index i =", i, "j =", j, "seq =", sequence
peter> >>> t = HyperSeq()
peter> >>> t['a':'abc'] = "does this work?"
peter> Traceback (innermost last):
peter>   File "<stdin>", line 1, in ?
peter> TypeError: slice index must be int
peter> >>>
peter> Now I think, that the model in my brain was not so wrong and that at
peter> least in Python 1.5.2 slicing can't work with non-integer indices.

skip> Indeed. (Obviously I didn't read the manual or perform the concrete
skip> experiment that Peter did.)

Before posting my message to python-help, I read the manual, tried a version of the above example, and read the Python FAQ. I asked the question because I guessed that maybe there is a way to get __setslice__ called even when the indexes are not integers. Anyway, it is also useful to find out that there is not.

I guess that this "frozen model" guides some of the answers that I got. Why keep this restriction when somebody can give a consistent semantics for his program, and when there is a function that should deal with this? (Skip's idea)

2.

greg> Currently, a person can do the following:
greg>     foo[slice('a','abc')] = some_sequence

Yes, he/she can, but it is easier and nicer to have something like

    foo.slice('a', 'abc', some_sequence)

and much nicer to have

    foo['a':'abc'] = some_sequence

when there is an appropriate semantics for it.

3.

skip> Well, I'll be damned! To wit:
skip> class SliceableDict(UserDict.UserDict):
skip> ...

Yes. I also don't have a good example right now of a dictionary-like data structure that can support such an operation. But I have an example of a list-like data structure. I think it has a good semantics for the range between non-integer indexes. See the example at the end.

4.

Christian> When we write
Christian>     x[low:high] = some_sequence
Christian>
Christian> then we imply that there is a sequence on the left hand that
Christian> can be indexed by the implicit ordered set of integers in
Christian> the range [low, high), and we allow this assignment to change
Christian> the sequence's length arbitrarily.

Almost yes. But why by an "implicit ordered set of ..."? Why can this order not also depend on the data that is stored in x? See my example.

Christian> Well, I don't like any of these so much. They make dicts look
Christian> like something ordered, that rings a bell about too much
Christian> cheating.

Yes, maybe your example is not appropriate for such an operation. But <<if the __setslice__ programmer is willing to do the type checking and provide the semantics of "from X to Y" for arbitrary X and Y>> then why could the interpreter not let <<non-integer indices pass>>, as Skip suggested?

My example:

My example is very simple. I want to have an object whose data structure is lines of, for example, characters. So the basic data type is a list of strings, or a list of lists of characters. The indexes are (line, column) pairs (the way the Tk Text widget works). The order between indexes is the lexicographic order, i.e. (x,y) <= (u,v) iff x < u or (x = u and y <= v). The range between (x,y) and (u,v) is not given by all pairs between (x,y) and (u,v). Instead, it depends on the actual size of the data represented. For example:

    class t:
        def __init__(self):
            self.data = ['abcdefgh', '12345', 'xyz']
        def __setslice__(...):
            ...
        def __getslice__(...):
            ...

__getslice__ could be implemented such that if x is an instance of t (x = t()), then after the assignment

    y = x['0.4':'2.1']

y can be ['efgh', '12345', 'x'] or a new instance of t with y.data = ['efgh', '12345', 'x']. It depends on the programmer's wish.

__setslice__ could be implemented such that the assignment

    x['0.4':'2.1'] = y

changes x.data to ['abcdAA', 'BBBBBB', 'CCCCCCC', 'DDDyz'], where y is ['AA', 'BBBBBB', 'CCCCCCC', 'DDD'], or is an instance of t with y.data = ['AA', 'BBBBBB', 'CCCCCCC', 'DDD'].

Sounds consistent?

Moreover:

1. Python allows slices like x[1:100], where x is [1,2,3,4]. This does not imply (as Christian suggested) that x is a sequence that "can be indexed by the implicit ordered set of integers in the range [1, 100)". In fact the number of elements of x[1:100] depends not only on range(1,100) but also on the actual length of x. So why not allow a more general form of this?

2. In the case of dictionaries, it could sometimes be useful to get from a dictionary the elements whose keys lie between two values, supposing that these are comparable with the dictionary keys. For example, if

    x = {'john': 4523864, 'andrew': 3745365, 'roland': 4529413, 'anna': 2342231}

then

    print x['a':'b']

would print only the entries that have keys starting with 'a', i.e.

    {'andrew': 3745365, 'anna': 2342231}

In general x[a:b] will be {key: val | (key in x.keys()) and (a <= key < b)}.

In this case it is possible for x['a':'b'] = y to have no meaning. But when somebody wants to implement something like this, he/she can choose not to define __setslice__. If this function is not defined when "x['a':'b'] = y" occurs in a program, then it will generate an error like "__setslice__ not defined". Anyway, when somebody writes x[i:j], where x is an instance of a class, he/she gets an error if the class of x does not implement __???slice__. So when somebody implements __???slice__ for an object, then he/she has a semantics for x[i:j], even if i, j are not integers.

3. I would be happy if I were able to write programs that contain things like:

    class collection:
        ...

    x = collection()
    y = x[property]

where y would become the collection of all elements from x that satisfy the property, i.e., using a mathematical notation, y would be the collection

    {(index, element) in x | property(index, element)}

In this example {e in X | p(e)} means the collection of all elements e that belong to X such that p(e) is true.

With this notation slices are particular cases. For example x[i:j] would be x[SLICE(i,j)] where

    def SLICE(i,j):
        def _SLICE(k,x, i=i, j=j):
            return i <= k < j
        return _SLICE

SLICE is not the slice object.

If we want to get all odd numbers of a list we will write x[ODD] where

    def ODD(i,a):
        return a[i] % 2 == 1

Viorel
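The line/column example from this message can be sketched in current Python 3, where slice objects with string bounds reach __getitem__ and __setitem__ directly. This is an illustration under stated assumptions, not Viorel's implementation: the names `TextLines` and `parse` are hypothetical, indexes are 'line.column' strings counted from zero as in the message, only slice indexes are handled, and the assigned list of lines is assumed to be non-empty.

```python
def parse(index):
    """Turn a 'line.column' string such as '2.1' into a (line, column) pair."""
    line, col = index.split('.')
    return int(line), int(col)


class TextLines:
    """Hypothetical stand-in for the class t sketched in the message."""

    def __init__(self, data):
        self.data = list(data)

    def __getitem__(self, index):
        (l0, c0), (l1, c1) = parse(index.start), parse(index.stop)
        if l0 == l1:
            return [self.data[l0][c0:c1]]
        # Tail of the first line, the whole lines in between, head of the last.
        return ([self.data[l0][c0:]]
                + self.data[l0 + 1:l1]
                + [self.data[l1][:c1]])

    def __setitem__(self, index, lines):
        (l0, c0), (l1, c1) = parse(index.start), parse(index.stop)
        new = list(lines)
        new[0] = self.data[l0][:c0] + new[0]      # keep text before the start
        new[-1] = new[-1] + self.data[l1][c1:]    # keep text after the stop
        self.data[l0:l1 + 1] = new


x = TextLines(['abcdefgh', '12345', 'xyz'])
y = x['0.4':'2.1']
print(y)
x['0.4':'2.1'] = ['AA', 'BBBBBB', 'CCCCCCC', 'DDD']
print(x.data)
```

The number of characters selected depends on the stored lines rather than on the two bounds alone, which is the point the message makes about ranges between non-integer indexes.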

[Skip asks about noninteger slices]
foo['a':'abc'] = some_sequence
[Greg writes]
Now... altering the syntax and semantic restrictions (to make it easier) is surely possible, but yah: let's hear some motivations from Viorel.
What about getting full compatibility with jpython <wink> as motivation:

    JPython 1.1 on java1.3.0rc1 (JIT: null)
    Copyright (C) 1997-1999 Corporation for National Research Initiatives
regards, finn

On Sat, 12 Feb 2000, Finn Bock wrote:
Typically, CPython is the reference platform. This would indicate that JPython has a bug. Second, section 5.3.3 of the Language Reference states that the upper and lower bound of a slice must be integers. Again, this would indicate that JPython has a bug. :-) -- Greg Stein, http://www.lyra.org/

On Sun, 13 Feb 2000 00:34:24 -0800 (PST), you wrote:
Typically, CPython is the reference platform.
I think you are confusing a large and rich legacy with being the reference <0.7 wink>.
This would indicate that JPython has a bug.
At worst, I would call it an overgeneralization.
Is it not only "simple slicing" which calls for integer expressions? So IMHO, it seems that JPython does follow the text of 5.3.3.

regards,
finn

"GS" == Greg Stein <gstein@lyra.org> writes:
GS> Typically, CPython is the reference platform. This would
GS> indicate that JPython has a bug.

GS> Second, section 5.3.3 of the Language Reference states that
GS> the upper and lower bound of a slice must be integers. Again,
GS> this would indicate that JPython has a bug.

I think instead you have a case where JimH was (not-so?)subtly trying to push Guido in a certain direction. :)

-Barry

On Mon, 14 Feb 2000, Barry A. Warsaw wrote:
Yah... :-) Guido always reserves the right to change the language definition. At the moment, though, it is integers only (if you want to be portable across Python implementations). As MarkH would say: I'm not fussed about it. Cheers, -g -- Greg Stein, http://www.lyra.org/

participants (6)
- Barry A. Warsaw
- bckfnn@worldonline.dk
- Christian Tismer
- Greg Stein
- Skip Montanaro
- Viorel Preoteasa