On Sat, Oct 15, 2011 at 12:04 AM, Nick Coghlan email@example.com wrote:
On Sat, Oct 15, 2011 at 5:52 AM, Alexander Belopolsky firstname.lastname@example.org wrote:
On Fri, Oct 14, 2011 at 1:23 PM, Guido van Rossum email@example.com wrote: ..
- add read-only attributes .start, .step, .stop
- add slicing such that it normalizes .stop to .start + the right
multiple of .step
- add __eq__ and __hash__ which compare by .start, .step, .stop
I did not see a clear statement of a use-case for any of these features. I could imagine convenience of __eq__ for those used to range() returning a list, but comparing by .start, .step, .stop would destroy this convenience. If you need an object with .start, .step, .stop, we already have the slice object. NumPy has some functionality to create a regular sequence from a slice object. I don't see why someone would need a __hash__. If you want to key some values by ranges, just use 3-tuples instead.
The key point here is that you can *already* invoke '==' and 'hash()' on 3.x ranges - they just have useless identity based semantics. The proposal is merely to make the semantics less pointless for something you can already do.
It's also a potential step in the ongoing evolution of ranges towards being more like an optimised tuple of integers (but see my final comment to Guido below).
The question is how to define the equivalence classes. There are 3 possible sets of equivalence classes available. In order of increasing size, they are:
- Identity based (status quo): each range object is equal only to itself
- Definition based: range objects are equal if their start, stop and
step values are equal 3. Behaviour based: range objects are equal if they produce the same sequence of values when iterated over
Definitions 2 and 3 produce identical equivalence classes for all non-empty sequences with a step value of 1 (or -1). They only diverge when the sequence is empty or the magnitude of the step value exceeds
Under definition 3, all empty ranges form an equivalence class, so "range(1, 1) == range(2, 2)", just like "(0, 1, 2)[1:1] == (0, 1, 2)[2:2]". Under definition 2, the start/stop/step values matter.
Under definition 3, all ranges that produces the same output (e.g. just their start value) form an equivalence class, so "range(1, 2, 2) == range(1, 0, -2)" just like "(0, 1, 2)[1:2:2] == (0, 1, 2)[1:0:-2]". As with empty ranges, under definition 2, the start/stop/step values matter.
I'll note that under definition 3 (but with start/stop/step exposed), it is easy and intuitive to implement definition 2 semantics: "lhs.start, lhs,stop, lhs.step == rhs.start, rhs.stop, rhs.step"
By contrast, under definition 2, implementing definition 3 requires the same contortions as it does now: "len(lhs) == len(rhs) and lhs[0:1] == rhs[0:1] and lhs[-1:] == rhs[-1:]"
Guido, I know you wanted to kill this discussion by declaring that definition 2 was the way to go, but I *like* the fact that we've been moving towards a "memory efficient tuple of regularly spaced integers" interaction model for 3.x range objects, and comparison semantics based on exact start/stop/step values would be a definitive break from that model.
Ok, you've convinced me on range() equality. If I want to compare the start/stop/step triple I can just extract those values and compare those.
I remember in the past thinking about unifying slice() and range() and I couldn't do it. I still can't. I think they should remain separate.