
It is a general principle that if a built-in class C has a unique (up to equality) null object, then C() returns that null object.
# 12 lines of False
Some imported classes such as fractions.Fraction and collections.deque can be added to the list. I add 'up to equality' because in the case of floats, 0.0 and -0.0 are distinct but equal, and float() returns the obvious 0.0.
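A sketch of the kind of check behind "12 lines of False". The exact list of twelve classes is an assumption here, as are the names BUILTIN_CLASSES, results, and extra:

```python
from collections import deque
from fractions import Fraction

# Assumed list of twelve built-in classes; the original list is not shown.
BUILTIN_CLASSES = (bool, int, float, complex, tuple, list, dict, set,
                   frozenset, str, bytes, bytearray)

# Call each class with no arguments and take the truth value of the result.
results = [bool(cls()) for cls in BUILTIN_CLASSES]
for cls, truth in zip(BUILTIN_CLASSES, results):
    print(cls.__name__, truth)

# Two imported classes that follow the same pattern.
extra = [bool(Fraction()), bool(deque())]

# float() gives the positive zero, although 0.0 == -0.0.
float_default_repr = repr(float())
```

Every entry in results and extra should be False if the principle holds for these classes.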
The notable exception to the rule is range: calling range() with no arguments raises TypeError.
It is true that there are multiple distinct null range objects (because the defining start,stop,step args are kept as attributes) but they are all equal.
>>> range(1,1) == range(0)
True
range(0) == range(0, 0, 1) would be the obvious choice for range(). Another advantage of doing this, besides consistency, is that it would emphasize that range() produces a re-iterable sequence, not just an iterator.

Possible objections and responses:

1. This would slightly complicate the already messy code and doc for range(). Pass, for now.

2. There is little need as there is already the alternative. This is just as true or even more true for the other classes. While int() is slightly easier to type than int(0), 0 is even easier.

3. There is little or no use case. The justification I have seen for all the other classes behaving as they do is expressions like type(x)(), which gets the null object corresponding to x. This requires a parameterless call rather than a literal (or display) or call with typed arg. A proper objection of this sort would have to argue that range() is less useful than all 12+ cases that we have now.

4. memoryview() does not work. Even though memoryview(bytes()) and memoryview(bytearray()) are both False and equal, other empty memoryviews would not all be equal. Besides which, a memoryview is dependent on another object, and there is no reason to create any particular object for it to be dependent on.

5. The dict view methods, such as dict.keys, do not work. These also return views that are dependent on a primary object, and the views also are null if the primary object is. Here there is a unique null primary object, so it would at least be possible to create an empty dict whose only reference is held by a read-only view. On the other hand, '.keys' is a function, not a class.

6. filter() does not work. While filter is a class, its instances, again, are dependent on another object, not just at creation but during its lifetime. Moreover, bool(empty-iterable) is not False. Ditto for map() and, for instance, open(), even though in the latter case the primary object is external.

-- Terry Jan Reedy
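For objection 3, a minimal sketch of the type(x)() idiom and of where range currently falls outside it. The helper null_like and the sample values are hypothetical, chosen only for illustration:

```python
def null_like(x):
    """Return the no-argument instance of x's class (hypothetical helper)."""
    return type(x)()

# One sample object per class; each maps to that class's null object.
samples = [5, 2.5, "abc", (1, 2), [1], {1: 2}, {1}, b"x"]
nulls = [null_like(x) for x in samples]

# The same idiom applied to a range object fails today.
try:
    null_like(range(3))
    range_supported = True
except TypeError:
    range_supported = False

print(nulls, range_supported)
```

With the proposed change, null_like(range(3)) would presumably return an empty range rather than raising.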

On 5/6/2012 6:24 PM, Georg Brandl wrote:
Not knowing your definition of 'data-like', it is hard to respond. A range is an immutable, indexable, reiterable sequence of regularly spaced ints with a definite length. It compactly represents a finite but possibly long arithmetic sequence. While mostly used for iteration, it is not limited to iteration. It implements the sequence protocol. It is not an iterator. It is not dependent on an underlying iterable. It is properly documented with the other sequence types. It is most like a bytes object in being an immutable sequence of ints. In that regard, it is different in not restricting the ints to [0,255] while restricting the differences to being equal. (Dict views, especially .keys() are also, to me, somewhat data-like and not limited to iteration. But, unlike ranges, they are dependencies.) -- Terry Jan Reedy
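A short illustration of those properties, using an arbitrary example range (the variable names are only for this sketch):

```python
r = range(10, 40, 10)            # represents 10, 20, 30

length = len(r)                  # definite length
item = r[1]                      # indexable
contains = 20 in r               # membership test without iterating by hand

first_pass = list(r)
second_pass = list(r)            # re-iterable: a second pass gives the same items

is_iterator = hasattr(r, "__next__")         # a range is not an iterator
iter_is_iterator = hasattr(iter(r), "__next__")  # but iter(r) returns one

print(length, item, contains, first_pass, second_pass, is_iterator)
```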

On Sun, May 6, 2012 at 5:24 PM, Terry Reedy <tjreedy@udel.edu> wrote:
How? The empty sequence is the exact case where reiterable objects and iterators have identical iteration behavior. (Both immediately stop every time you try.)
By this, do you mean don't write new documentation? That just defers the problem to later.
Most of the other types are useful as parameters to something such as collections.defaultdict. In the case of range, why not use tuple for this? Although, I actually like this idea, because it feels more consistent. I imagine that isn't a good reason to like things though. -- Devin

On 5/6/2012 7:52 PM, Devin Jeanpierre wrote:
My apology for mis-writing that. A range is a non-iterator, re-iterable sequence rather than an iterator.
That is also true of empty tuples, lists, sets, and dicts. An iterator can only be used to iterate - once. Non-iterator iterables (usually) have other behaviors.
No, it means I was deferring discussing this possible objection unless someone raises it as a show-stopper, or it becomes the last issue. The current messiness is that the signature in the doc "range([start], stop[, step])" is non-standard in that it does not follow the rule that optional parameters and arguments follow required ones. It would perhaps be more accurate, but also possibly more confusing, to give it as "range(start_stop, [stop], [step])", where start_stop is interpreted as start if stop is given and stop if stop is not (otherwise) given. Either version would just need an outer '[]' added: "range([[start], stop, [step]])" and a note "If no arguments are given, return range(0)." For a Python version, adding "= 0" to start_stop in the header should be sufficient. But I do not know how the C version works.
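A rough sketch of what such a Python version might look like. The function name range_sketch and the use of None as the "stop not given" sentinel are assumptions for illustration, not how the C implementation works:

```python
def range_sketch(start_stop=0, stop=None, step=1):
    """Hypothetical Python-level wrapper with start_stop defaulting to 0."""
    if stop is None:
        # stop not given: start_stop is the stop, and start defaults to 0.
        return range(0, start_stop, step)
    # stop given: start_stop is the start.
    return range(start_stop, stop, step)

print(list(range_sketch()))          # no arguments: same items as range(0)
print(list(range_sketch(4)))
print(list(range_sketch(1, 4)))
print(list(range_sketch(1, 10, 3)))
```

Only the header changes relative to the one-to-three argument case; the body that decides between start and stop stays the same.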
Most of the other types are useful as parameters to something such as collections.defaultdict.
I admit range() would be seemingly useless there.
Although, I actually like this idea, because it feels more consistent. I imagine that isn't a good reason to like things though.
I believe, though, it was a reason for the consistency of everything other than range. -- Terry Jan Reedy

Terry Reedy wrote:
I don't think that's so much a general principle that should be aspired to as a general observation that many objects have an obvious "nothing" (empty) value that intuitively matches the zero-argument case, e.g. set, dict, list and so forth. The cases of int, float, complex etc. are a little more dubious; I'm not convinced there's a general philosophical reason why int() should be allowed at all. E.g. int("") fails, int([]) fails, etc. so there's no general principle that the int of "emptiness" is expected to return 0. The fact that float() has to choose between two zero objects, complex() between four, and Fraction and Decimal between an infinity of zero objects, highlights that the choice of a "default" is at least in part an arbitrary choice. If Python has any general principle here, it is that we should be reluctant to make arbitrary choices in the face of ambiguity. For the avoidance of doubt, I'm not arguing for changing the behaviour of int. The current behaviour is fine. But I don't think we should treat it as a general principle that other objects should necessarily follow.
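The points about the two float zeros and about int of "emptiness" can be checked with a small sketch (the helper int_fails is a hypothetical name):

```python
import math

pos_zero, neg_zero = 0.0, -0.0
zeros_equal = pos_zero == neg_zero              # equal as values
sign_pos = math.copysign(1.0, pos_zero)         # sign of 0.0
sign_neg = math.copysign(1.0, neg_zero)         # sign of -0.0
default_sign = math.copysign(1.0, float())      # which zero float() picks

def int_fails(value):
    """Return True if int(value) raises (hypothetical helper)."""
    try:
        int(value)
    except (TypeError, ValueError):
        return True
    return False

empty_string_fails = int_fails("")
empty_list_fails = int_fails([])
no_argument_value = int()

print(zeros_equal, sign_pos, sign_neg, default_sign)
print(empty_string_fails, empty_list_fails, no_argument_value)
```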
Are you using Python 2 here? If so, you should be looking at xrange, not range. In Python 3, range objects are equal if their start, stop and step attributes are equal, not if their output values are equal:
py> range(0) == range(1,1)
False
py> range(1, 6, 2) == range(1, 7, 2)
False
range(0) == range(0, 0, 1) would be the obvious choice for range().
I'm not entirely sure that is quite so obvious. range() defaults to a start of 0 and a step of 1, so it's natural to reason that range() => range(0, end, 1). But surely we should treat end to be a required argument? If end is not required, that suggests the possibility of calling range with (say) a start value only, using the default end and step values. I think there is great value in keeping range simple, and the simplest thing is to keep end as a required argument and refuse the temptation to guess if it is not given. I do think this is a line-call though. If I were designing range from scratch, I too would be sorely tempted to have range() => range(0).
I don't follow your reasoning there. Whether range(*args) succeeds or fails for some arbitrary value of args has no bearing on whether it is re-iterable. Consider zip().
Likewise reversed() and iter(). sorted() is an interesting case, because although it returns a list rather than a (hypothetical) SortedSequence object, it could choose to return [] when called with no arguments. I think it is right to not do so. zip() on the other hand is a counter-example, and it is informative to think about why zip() succeeds while range() fails. zip takes an arbitrary number of arguments, where no particular argument is required or treated differently from the others. Also there is a unique interpretation of zip() with no arguments: an empty zip object (or list in the case of Python 2). Nevertheless, I consider it somewhat surprising that zip() succeeds, and don't think that it is a good match for range. Given the general principle "the status quo wins", I'm going to vote -0 on the suggested change. -- Steven

On 5/6/2012 8:46 PM, Steven D'Aprano wrote:
The general principle, including consistency, *has* been invoked in discussions about making the code example above true. It is not just an accident. To me, an empty range is nearly as obvious as any other empty collection.
Fraction normalizes all 0 fractions to 0/1, so there is no choice ;-)
I believe there was consideration given to similarly normalizing ranges so that equal ranges (in 3.3, see below) would have the same start, stop, and step attributes. But I believe Guido said that recording the input might help debugging. Or there might have been some point about consistency with slice objects. If list objects, for instance, had a .source_type attribute (for debugging), there would be multiple different but equal empty lists. Both [] and list() would then, most sensibly, use list as the default .source_type.
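To make the contrast concrete, a small sketch: zero Fractions collapse to one normalized form, while empty ranges keep the arguments they were built from. The particular values are arbitrary examples:

```python
from fractions import Fraction

# Different ways to spell a zero Fraction all normalize to 0/1.
zeros = [Fraction(0, 1), Fraction(0, 5), Fraction(0, -7), Fraction()]
normalized = {(z.numerator, z.denominator) for z in zeros}

# Empty ranges are not normalized: start, stop, and step are kept as given.
empty_ranges = [range(0), range(1, 1), range(5, 2)]
attrs = [(r.start, r.stop, r.step) for r in empty_ranges]
lengths = [len(r) for r in empty_ranges]

print(normalized)
print(attrs, lengths)
```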
The consistent list above *is* a result of treating the principle as one that 'other' classes should follow.
Python 3.3.0a3 (default, May 1 2012, 16:46:00) [MSC v.1500 64 bit (AMD64)] on win32
I remember there being a discussion about this, which Guido was part of, that since ranges are sequences, not their source inputs, == should reflect what they are, and not how they came to be. If ranges A and B are equal, len(A) == len(B), A[i] == B[i], and iter(A) and iter(B) produce the same sequence -- and vice versa.
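A sketch of that equality rule, assuming Python 3.3 or later (on 3.2 the comparisons fall back to identity and the results differ):

```python
a = range(1, 6, 2)     # 1, 3, 5
b = range(1, 7, 2)     # 1, 3, 5 as well, built from a different stop

ranges_equal = a == b                       # compared as sequences
same_length = len(a) == len(b)
same_items = all(a[i] == b[i] for i in range(len(a)))
same_iteration = list(iter(a)) == list(iter(b))
stops_differ = a.stop != b.stop             # the source inputs still differ

empties_equal = range(0) == range(1, 1)     # all empty ranges compare equal

print(ranges_equal, same_length, same_items, same_iteration, stops_differ)
print(empties_equal)
```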
Sorry, that is mis-worded to the point of being erroneous. I meant to say 'non-iterator re-iterable sequence' *instead of* an iterator. Just like a list or tuple or deque ... .
Whether range is a non-iterator iterable sequence or an iterator has everything to do with whether it is reiterable.
Consider zip().
That surprises me. Zip is a one-time iterator, like map, dependent on underlying iterables. I wonder whether it is really intentional, or an accident of the definition or some implementation, that zip() returns an exhausted iterator instead of raising. In any case, bool(zip()) returns True, not False, so it has nothing to do with the return null principle.
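A quick sketch of the zip() behavior described here, next to reversed() and iter(); the helper raises_type_error is a hypothetical name:

```python
z = zip()                  # succeeds with no arguments
zip_truth = bool(z)        # iterators define neither __bool__ nor __len__
zip_items = list(z)        # already exhausted, so nothing comes out

def raises_type_error(func):
    """Return True if func() with no arguments raises TypeError."""
    try:
        func()
    except TypeError:
        return True
    return False

reversed_fails = raises_type_error(reversed)
iter_fails = raises_type_error(iter)
range_fails = raises_type_error(range)

print(zip_truth, zip_items, reversed_fails, iter_fails, range_fails)
```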
both fail, as I expected.
It is a function, not a class. I would not suggest that all functions of one arg should have a default input and therefore a default output. This is certainly not a Python design principle.
They are not in the same sub-categories of iterables. -- Terry Jan Reedy

On 5/6/2012 10:20 PM, Terry Reedy wrote:
On 5/6/2012 8:46 PM, Steven D'Aprano wrote:
I found the change notice in the library manual. "Changed in version 3.3: Define ‘==’ and ‘!=’ to compare range objects based on the sequence of values they define (instead of comparing based on object identity)." That implies, for instance, "range(1,6,2) != range(1,6,2)" in 3.2, which is rather useless. Python slowly improves in many little ways. -- Terry Jan Reedy

Terry Reedy wrote:
That might make sense if there were a well-defined algebra of range objects, but there isn't. For example, concatenating the sequences represented by two ranges with different step sizes results in a sequence that can't be represented by a single range object. Also I can't remember seeing a plethora of use cases for comparing range objects. -- Greg

On Mon, May 7, 2012 at 3:16 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Also I can't remember seeing a plethora of use cases for comparing range objects.
Most of the changes to range() in 3.3 are about making them live up to their claim to implement the Sequence ABC. The approach taken to achieve this is to follow the philosophy that a Python 3.3 range object should behave as much as possible like a memory efficient representation for a tuple of regularly spaced integers (but ignoring the concatenation and repetition operations that tuples support but aren't part of the Sequence ABC). Having range() return an empty range in the same way that tuple() returns an empty tuple would be a natural extension of that philosophy. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
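A sketch of what "behaving like a memory efficient tuple of regularly spaced integers" looks like in practice, assuming Python 3.3 or later; the example range is arbitrary:

```python
from collections.abc import Sequence

r = range(0, 20, 5)                     # 0, 5, 10, 15
t = tuple(r)                            # the tuple it stands in for

is_sequence = isinstance(r, Sequence)   # range is registered with the ABC
same_index = r.index(10) == t.index(10)
same_count = r.count(5) == t.count(5)
same_slice = list(r[1:3]) == list(t[1:3])
same_reversed = list(reversed(r)) == list(reversed(t))

# Concatenation is a tuple operation that ranges do not support.
try:
    r + r
    concat_supported = True
except TypeError:
    concat_supported = False

empty_tuple = tuple()                   # the zero-argument case for tuple

print(is_sequence, same_index, same_count, same_slice, same_reversed)
print(concat_supported, empty_tuple)
```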

On 05/07/2012 09:06 AM, Nick Coghlan wrote:
For what gain? At the moment, I cannot think of any arguments in favor of the change, which is the point where arguments against it aren't even needed to keep the status quo. Ah yes: and I would rather have the bug

    for i in range():   # <- "n" (or equivalent) missing

give me an explicit exception than silently "skipping" the loop. After all, the primary use case for range() is loops, and we should not make that use worse for the benefit of hypothetical other use cases. Georg
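A minimal sketch of that bug as it behaves today; the function run_buggy_loop and the outcome strings are hypothetical, made up for this illustration:

```python
def run_buggy_loop():
    """Hypothetical loop in which the argument to range() was forgotten."""
    iterations = 0
    for i in range():      # "n" (or equivalent) missing
        iterations += 1
    return iterations

try:
    run_buggy_loop()
    outcome = "loop silently skipped"
except TypeError:
    outcome = "TypeError raised"

print(outcome)
```

If range() returned an empty range, the same function would presumably return 0 without any error.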

On Mon, May 7, 2012 at 6:50 AM, Georg Brandl <g.brandl@gmx.net> wrote:
Lack of the default constructor is a pain for generic programming in Python. It is not uncommon to require an arbitrary instance of the given type and calling the type without arguments is a convenient way to get one. I never missed working range() mostly because I don't recall ever using range as an actual type rather than the Python way to spell the C for loop. I do, however, often miss default constructors for datetime objects, so I understand why some people may desire range().
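A sketch of the generic-programming situation, with a hypothetical helper has_default_constructor that simply tries the zero-argument call:

```python
import datetime

def has_default_constructor(cls):
    """Return True if cls() succeeds with no arguments (hypothetical helper)."""
    try:
        cls()
    except TypeError:
        return False
    return True

checks = {
    "int": has_default_constructor(int),
    "list": has_default_constructor(list),
    "timedelta": has_default_constructor(datetime.timedelta),
    "datetime": has_default_constructor(datetime.datetime),
    "date": has_default_constructor(datetime.date),
    "range": has_default_constructor(range),
}

for name, ok in checks.items():
    print(name, ok)
```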

Steven D'Aprano wrote:
A philosophical reason would be that list() and int() both return false values. Pragmatically, it makes them useful as arguments to defaultdict. The fact that there is sometimes more than one representation of zero isn't much of a problem, since they all give the same result when you add a nonzero value to them. The defaultdict argument doesn't apply to range() in Python 3, or xrange() in Python 2, since you can't apply += to them. It also doesn't apply much to range() in Python 2, since list would work just as well as a defaultdict argument as a range that accepted no arguments. -- Greg
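The defaultdict point can be sketched as follows; the sample data is arbitrary:

```python
from collections import defaultdict

# int() supplies 0, which works as a starting value for += 1.
counts = defaultdict(int)
for word in "a b a c a".split():
    counts[word] += 1

# list() supplies [], which works as a starting value for append.
groups = defaultdict(list)
for n in range(6):
    groups[n % 2].append(n)

# A range offers no in-place addition, so an empty range would not help here.
r = range(0)
try:
    r += range(3)
    range_accumulates = True
except TypeError:
    range_accumulates = False

print(dict(counts), dict(groups), range_accumulates)
```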

On Mon, 07 May 2012 13:04:17 -0400 Terry Reedy <tjreedy@udel.edu> wrote:
The fact that there's absolutely no use case to call range() without an argument is enough to dismiss the idea, IMO. Just because something can be done doesn't mean it should be done. Regards Antoine.

participants (8)
- Alexander Belopolsky
- Antoine Pitrou
- Devin Jeanpierre
- Georg Brandl
- Greg Ewing
- Nick Coghlan
- Steven D'Aprano
- Terry Reedy