[Python-ideas] Implement comparison operators for range objects

Sat Oct 15 09:04:54 CEST 2011

On Sat, Oct 15, 2011 at 5:52 AM, Alexander Belopolsky
<alexander.belopolsky at gmail.com> wrote:
> On Fri, Oct 14, 2011 at 1:23 PM, Guido van Rossum <guido at python.org> wrote:
> ..
>> - add read-only attributes .start, .step, .stop
>> - add slicing such that it normalizes .stop to .start + the right
>> multiple of .step
>> - add __eq__ and __hash__ which compare by .start, .step, .stop
>
> -1
>
> I did not see a clear statement of a use-case for any of these
> features.  I could imagine convenience of __eq__ for those used to
> range() returning a list, but comparing by .start, .step, .stop would
> destroy this convenience.  If you need an object with .start, .step,
> .stop, we already have the slice object.  NumPy has some functionality
> to create a regular sequence from a slice object.  I don't see why
> someone would need a __hash__.  If you want to key some values by
> ranges, just use 3-tuples instead.

The key point here is that you can *already* invoke '==' and 'hash()'
on 3.x ranges - they just have useless identity based semantics. The
proposal is merely to make the semantics less pointless for something
you can already do.

It's also a potential step in the ongoing evolution of ranges towards
being more like an optimised tuple of integers (but see my final
comment to Guido below).

The question is how to define the equivalence classes. There are 3
possible sets of equivalence classes available. In order of increasing
size, they are:

1. Identity based (status quo): each range object is equal only to itself
2. Definition based: range objects are equal if their start, stop and
step values are equal
3. Behaviour based: range objects are equal if they produce the same
sequence of values when iterated over

Definitions 2 and 3 produce identical equivalence classes for all
non-empty sequences with a step value of 1 (or -1). They only diverge
when the sequence is empty or the magnitude of the step value exceeds
1.

Under definition 3, all empty ranges form an equivalence class, so
"range(1, 1) == range(2, 2)", just like "(0, 1, 2)[1:1] == (0, 1,
2)[2:2]". Under definition 2, the start/stop/step values matter.

Under definition 3, all ranges that produces the same output (e.g.
just their start value) form an equivalence class, so "range(1, 2, 2)
== range(1, 0, -2)" just like "(0, 1, 2)[1:2:2] == (0, 1, 2)[1:0:-2]".
As with empty ranges, under definition 2, the start/stop/step values
matter.

I'll note that under definition 3 (but with start/stop/step exposed),
it is easy and intuitive to implement definition 2 semantics:
"lhs.start, lhs,stop, lhs.step == rhs.start, rhs.stop, rhs.step"

By contrast, under definition 2, implementing definition 3 requires
the same contortions as it does now: "len(lhs) == len(rhs) and
lhs[0:1] == rhs[0:1] and lhs[-1:] == rhs[-1:]"

Guido, I know you wanted to kill this discussion by declaring that
definition 2 was the way to go, but I *like* the fact that we've been
moving towards a "memory efficient tuple of regularly spaced integers"
interaction model for 3.x range objects, and comparison semantics
based on exact start/stop/step values would be a definitive break from
that model.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia