[Python-ideas] Should range() == range(0)?

Mon May 7 04:20:35 CEST 2012

On 5/6/2012 8:46 PM, Steven D'Aprano wrote:
> Terry Reedy wrote:
>> It is a general principle that if a built-in class C has a unique (up
>> to equality) null object, then C() returns that null object.
>>
>> >>> for f in (bool, int, float, complex, tuple, list, dict, set,
>> ...           frozenset, str, bytes, bytearray):
>> ...     print(bool(f()))
>>
>> # 12 lines of False
>
> I don't think that's so much a general principle that should be aspired
> to as a general observation that many objects have an obvious "nothing"
> (empty) value that intuitively matches the zero-argument case, e.g. set,
> dict, list and so forth.

The general principle, including consistency, *has* been invoked in 
discussions about making the code example above true. It is not just an 
accident. To me, an empty range is nearly as obvious as any other empty 
collection.

> The cases of int, float, complex etc. are a little more dubious; I'm not
> convinced there's a general philosophical reason why int() should be
> allowed at all. E.g. int("") fails, int([]) fails, etc. so there's no
> general principle that the int of "emptiness" is expected to return 0.
>
> The fact that float() has to choose between two zero objects, complex()
> between four, and Fraction and Decimal between an infinity of zero
> objects,

Fraction normalizes all 0 fractions to 0/1, so there is no choice ;-)
 >>> from fractions import Fraction
 >>> Fraction(0, 2)
Fraction(0, 1)
 >>> Fraction()
Fraction(0, 1)

I believe there was consideration given to similarly normalizing ranges 
so that equal ranges (in 3.3, see below) would have the same start, 
stop, and step attributes. But I believe Guido said that recording the 
input might help debugging. Or there might have been some point about 
consistency with slice objects.
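
A minimal sketch of that difference, assuming Python 3.3 or later (where
range equality compares the sequences rather than the arguments): equal
ranges can still report different start, stop, and step values, whereas
equal Fractions are normalized to a single representation.

```python
from fractions import Fraction

a = range(0)
b = range(1, 1)

# Both ranges are empty, so they compare equal on Python 3.3+ ...
print(a == b)
# ... but each one still records the arguments it was built from.
print((a.start, a.stop, a.step))
print((b.start, b.stop, b.step))

# Fraction normalizes instead, so equal zero values look identical.
print(Fraction(0, 2) == Fraction(0, 5))
print(repr(Fraction(0, 2)), repr(Fraction(0, 5)))
```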

If list objects, for instance, had a .source_type attribute (for 
debugging), there would be multiple different but equal empty lists. 
Both [] and list() would then, most sensibly, use list as the default 
.source_type.

> highlights that the choice of a "default" is at least in part
> an arbitrary choice. If Python has any general principle here, it is
> that we should be reluctant to make arbitrary choices in the face of
> ambiguity.
>
> For the avoidance of doubt, I'm not arguing for changing the behaviour
> of int. The current behaviour is fine. But I don't think we should treat
> it as a general principle that other objects should necessarily follow.

The consistent list above *is* a result of treating the principle as one 
that 'other' classes should follow.

> Are you using Python 2 here? If so, you should be looking at xrange, not
> range. In Python 3, range objects are equal if their start, stop and
> step attributes are equal, not if their output values are equal:
>
> py> range(0) == range(1,1)
> False
> py> range(1, 6, 2) == range(1, 7, 2)
> False

Python 3.3.0a3 (default, May  1 2012, 16:46:00) [MSC v.1500 64 bit 
(AMD64)] on win32
>>> range(0) == range(1,1)
True
>>> range(1, 6, 2) == range(1, 7, 2)
True

I remember a discussion about this, which Guido was part of, concluding
that since ranges are sequences, and are not their source inputs, ==
should reflect what they are and not how they came to be. If ranges A
and B are equal, then len(A) == len(B), A[i] == B[i] for every valid i,
and iter(A) and iter(B) produce the same sequence -- and vice versa.
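
That equivalence can be spot-checked with a rough sketch like the one
below. It assumes Python 3.3 or later, and same_sequence is a
hypothetical helper written for this illustration, not anything in the
standard library.

```python
def same_sequence(a, b):
    """Hypothetical helper: compare two sequences by length and items."""
    return len(a) == len(b) and all(a[i] == b[i] for i in range(len(a)))


pairs = [
    (range(0), range(1, 1)),           # both empty
    (range(1, 6, 2), range(1, 7, 2)),  # both yield 1, 3, 5
    (range(0, 3), range(0, 4)),        # different lengths
]

for a, b in pairs:
    # On Python 3.3+, == should agree with the item-by-item comparison
    # and with comparing the fully expanded lists.
    print(a, b, a == b, same_sequence(a, b), list(a) == list(b))
```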

>> range(0) == range(0, 0, 1) would be the obvious choice for range().
>
> I'm not entirely sure that is quite so obvious. range() defaults to a
> start of 0 and a step of 1, so it's natural to reason that range() =>
> range(0, end, 1). But surely we should treat end to be a required
> argument? If end is not required, that suggests the possibility of
> calling range with (say) a start value only, using the default end and
> step values.
>
> I think there is great value in keeping range simple, and the simplest
> thing is to keep end as a required argument and refuse the temptation to
> guess if it is not given.
>
> I do think this is a line-call though. If I were designing range from
> scratch, I too would be sorely tempted to have range() => range(0).
>
>> Another advantage of doing this, beside consistency, is that it would
>> emphasize that range() produces a re-iterable sequence, not just an
>> iterator.

Sorry, that is mis-worded to the point of being erroneous. I meant to
say 'non-iterator re-iterable sequence' *instead of* an iterator, just
like a list or tuple or deque.

> I don't follow your reasoning there. Whether range(*args) succeeds or
> fails for some arbitrary value of args has no bearing on whether it is
> re-iterable.

Whether range is a non-iterator iterable sequence or an iterator has
everything to do with whether it is re-iterable.
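
As a small sketch of the distinction (assuming Python 3.3 or later, for
collections.abc), a range can be walked repeatedly, while the iterator
obtained from it is used up by the first pass:

```python
from collections.abc import Iterator, Sequence

r = range(3)
it = iter(r)

# A range is a sequence, not an iterator, so every pass starts afresh.
print(list(r), list(r))
# An iterator is consumed by the first pass; the second finds nothing.
print(list(it), list(it))

# range is a Sequence and is not an Iterator.
print(isinstance(r, Sequence), isinstance(r, Iterator))
print(isinstance(it, Iterator))

# iter() on an iterator returns the same object; on a range it returns
# a fresh iterator each time.
print(iter(it) is it, iter(r) is r)
```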

> Consider zip().

That surprises me. Zip is a one-time iterator, like map, dependent on
underlying iterables. I wonder whether it is really intentional, or an
accident of the definition or of some implementation, that zip() returns
an exhausted iterator instead of raising. In any case, bool(zip())
returns True, not False, so it has nothing to do with the return-null
principle.
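
A short sketch of that behaviour, assuming Python 3 (the variable names
are only for illustration):

```python
z = zip()

# zip objects define neither __bool__ nor __len__, so they are truthy
# even when there is nothing left to yield.
truthy = bool(z)
print(truthy)

# The iterator is nevertheless already exhausted.
items = list(z)
print(items)

# Compare with a class that follows the principle: list() is falsy.
print(bool(list()))
```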

>> 6. filter() does not work.
>>
>> While filter is a class, its instances, again, are dependent on
>> another object, not just at creation but during its lifetime.
>> Moreover, bool(empty-iterable) is not False. Ditto for map() and, for
>> instance, open(), even though in the latter case the primary object is
>> external.
>
> Likewise reversed() and iter().

Both fail, as I expected.
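
A quick way to check such cases, as a sketch only: zero_arg_result is a
hypothetical helper name, and the exact error messages vary between
Python versions.

```python
def zero_arg_result(factory):
    """Hypothetical helper: call factory() and report what happens."""
    try:
        return repr(factory())
    except TypeError as exc:
        return "TypeError: {}".format(exc)


# reversed, iter, filter, map and range all need at least one argument;
# list and tuple are included for contrast.
for factory in (reversed, iter, filter, map, range, list, tuple):
    print(factory.__name__, "->", zero_arg_result(factory))
```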

> sorted() is an interesting case, because although it returns a list
> rather than a (hypothetical) SortedSequence object, it could choose to
> return [] when called with no arguments. I think it is right to not do so.

It is a function, not a class. I would not suggest that all functions of
one arg should have a default input and therefore a default output. This
is certainly not a Python design principle.

> zip() on the other hand is a counter-example, and it is informative to
> think about why zip() succeeds while range() fails. zip takes an
> arbitrary number of arguments, where no particular argument is required
> or treated differently from the others. Also there is a unique
> interpretation of zip() with no arguments: an empty zip object (or list
> in the case of Python 2).
>
> Nevertheless, I consider it somewhat surprising that zip() succeeds, and
> don't think that it is a good match for range.

They are not in the same sub-categories of iterables.

-- 
Terry Jan Reedy