Equality between some of the indexed collections

I'd like to get your opinion on modifying some of the indexed collections, such as tuples, lists, and arrays, so that they compare equal when they hold the same items at the same indexes. Currently, comparing a list of items to an array of the same items for equality (==) returns False. I think it would make sense to return True in that context, since both collections are indexed the same way and we are comparing item values. So what do you think about applying this behavior to collections that can be indexed the same way, such as tuples, lists, and arrays?
Example (current):
    import array
    tuple_ = (1.1, 2.2, 3.3)
    list_ = [1.1, 2.2, 3.3]
    array_ = array.array('f', [1.1, 2.2, 3.3])
    # all of the following print False
    print(tuple_ == list_)
    print(tuple_ == array_)
    print(array_ == list_)
Example (proposed): all of the prints above show True, as the collections are populated with the same data at the same indexes.
A side note: an extra point to discuss. Because of how arrays are implemented, array_.tolist() would actually return [1.100000023841858, 2.200000047683716, 3.299999952316284], which is not exactly what we passed as arguments. This is normal, but I'm thinking about leaving it to the array implementation to encapsulate that detail and perform exact equality based on the passed arguments.

You can get the desired behavior by casting a list to a tuple, or a tuple to a list, in the equality statement. That way those that rely on the existing implementation don't have to change their code.
    my_tup = (1, 2, 3)
    my_list = [1, 2, 3]
    print(list(my_tup) == my_list)
On Sat, May 2, 2020, 9:04 AM Ahmed Amr <ahmedamron@gmail.com> wrote:

I see there are ways to compare them item-wise, I'm suggesting to bake that functionality inside the core implementation of such indexed structures. Also those solutions are direct with tuples and lists, but it wouldn't be as direct with arrays-lists/tuples comparisons for example. On Sat, 2 May 2020, 6:58 pm Antoine Rozo, <antoine.rozo@gmail.com> wrote:

Put this comparison in a function! The current behavior is what I wish '==' to do, and what millions of lines of Python code assume. A tuple is not a list is not an array. I don't want an equality comparison to lie to me. You can write a few lines to implement 'has_same_items(a, b)' that will behave the way you want. On Sat, May 2, 2020, 2:36 PM Ahmed Amr <ahmedamron@gmail.com> wrote:
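A minimal sketch of such a helper, assuming "same items" means equal elements in the same order and the same length. The name has_same_items comes from the message above; the body is only an illustration, not the poster's code.

```python
from array import array
from itertools import zip_longest

# Sentinel used to detect that one iterable ran out before the other.
_MISSING = object()


def has_same_items(a, b):
    """Return True if a and b yield equal items in the same order."""
    return all(
        x is not _MISSING and y is not _MISSING and x == y
        for x, y in zip_longest(a, b, fillvalue=_MISSING)
    )


print(has_same_items((1, 2, 3), [1, 2, 3]))              # tuple vs list
print(has_same_items([1, 2, 3], array('i', [1, 2, 3])))  # list vs array
print(has_same_items([1, 2], [1, 2, 3]))                 # different lengths
```

Because the comparison lives in a named function, `==` keeps its current meaning and the intent is visible at the call site.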

On Sat, May 2, 2020 at 8:38 PM Ahmed Amr <ahmedamron@gmail.com> wrote:
I'm sure there are times when I would also like this, and others too. But it would be a disastrous break in backwards compatibility, which is why it has 0% chance of happening.
Also those solutions are direct with tuples and lists, but it wouldn't be as direct with arrays-lists/tuples comparisons for example.
It should be. If x and y are two sequences with the same length and the same values at the same indexes, then list(x) == list(y) follows very quickly.
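A short sketch of that conversion applied to arrays, assuming the values from the original post. It also shows why the 'f' typecode from that post is a special case: single-precision storage rounds each value, so the items are no longer equal to the Python floats that were passed in.

```python
from array import array

values = [1.1, 2.2, 3.3]

# 'd' stores C doubles, the same precision as a Python float.
doubles = array('d', values)
# 'f' stores C floats, so each value is rounded when it is stored.
singles = array('f', values)

print(doubles == values)         # array vs list: False under current rules
print(list(doubles) == values)   # item-wise after conversion: True
print(list(singles) == values)   # False, the stored items really differ
```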

On Sat, May 2, 2020 at 9:51 PM Serhiy Storchaka <storchaka@gmail.com> wrote:
Would we? Is the contract `x == y => hash(x) == hash(y)` still required if hash(y) is an error? What situation involving dicts could lead to a bug if `(1, 2, 3) == [1, 2, 3]` but `hash((1, 2, 3))` is defined and `hash([1, 2, 3])` isn't? The closest example I can think of is that you might think you can do `{(1, 2, 3): 4}[[1, 2, 3]]`, but once you get `TypeError: unhashable type: 'list'` it'd be easy to fix.
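The lookup scenario described above can be sketched as follows. The variable names are illustrative only; the point is that the failure is an immediate TypeError, and converting the list to a tuple is the one-line fix.

```python
key_tuple = (1, 2, 3)
key_list = [1, 2, 3]

table = {key_tuple: 4}

error_message = None
try:
    table[key_list]              # lists are unhashable, so this raises
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print(table[tuple(key_list)])    # converting to a tuple makes the lookup work
```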

It does look like that would violate a basic property of `==` -- if two values compare equal, they should be equally usable as dict keys. I can't think of any counterexamples. On Sat, May 2, 2020 at 1:33 PM Alex Hall <alex.mojaki@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

Okay, that's fair. So the argument really comes down to backwards compatibility (which is inconvenient but important). On Sat, May 2, 2020 at 1:51 PM Dominik Vilsmeier <dominik.vilsmeier@gmx.de> wrote:
--Guido van Rossum

On Sat, May 2, 2020 at 10:52 PM Dominik Vilsmeier <dominik.vilsmeier@gmx.de> wrote:
Nice catch! That's really interesting. Is there reasoning behind `frozenset({1}) == {1}` but `[1] != (1,)`, or is it just an accident of history? Isn't a tuple essentially just a frozenlist? I know the intended semantics of tuples and lists tend to be different, but I'm not sure that's relevant.
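The asymmetry being asked about is easy to reproduce. A small sketch (variable names are illustrative) that also shows the set/frozenset pair compares equal even though only one of the two is hashable:

```python
frozen = frozenset({1})
mutable = {1}

print(frozen == mutable)     # True: the two set types compare by contents
print([1] == (1,))           # False: list and tuple never compare equal
print(list((1,)) == [1])     # True once both sides are the same type

# The frozenset is hashable and the set is not, yet they compare equal.
try:
    hash(mutable)
    set_is_hashable = True
except TypeError:
    set_is_hashable = False

print(set_is_hashable)
```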

On 2020-05-03 10:19 p.m., Steven D'Aprano wrote:
for what it's worth, I see myself using tuples as frozen lists more often than their "intended semantics". more specifically, you can't pass lists to:
1. isinstance
2. issubclass
3. str.endswith
among others. so I sometimes just convert a list of strings into a tuple of strings and store it somewhere so I can use it with str.endswith later. (this is not how you're "supposed" to implement domain suffix blocks but w/e)
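A sketch of the pattern described above, assuming a hypothetical suffix block list (the domain names and the is_blocked helper are made up for illustration). str.endswith accepts a str or a tuple of str, so the list has to be converted first.

```python
# Hypothetical block list, built as a list somewhere else in the program.
blocked_suffixes = [".example.com", ".example.net"]

# str.endswith accepts a tuple of suffixes but rejects a list.
blocked = tuple(blocked_suffixes)


def is_blocked(hostname):
    """Return True if hostname ends with any blocked suffix."""
    return hostname.endswith(blocked)


print(is_blocked("mail.example.com"))
print(is_blocked("example.org"))

try:
    "mail.example.com".endswith(blocked_suffixes)
    list_accepted = True
except TypeError:
    list_accepted = False

print(list_accepted)
```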

On Mon, May 4, 2020 at 11:43 AM Soni L. <fakedme+py@gmail.com> wrote:
That doesn't mean you're using a tuple as a frozen list - it means you're using a tuple as a static collection. I've never had a situation where I've wanted to use isinstance with a list that gets built progressively at run-time; it's always a prewritten collection. I don't see what this has to do with lists and tuples. You're using tuples the way they're meant to be used. ChrisA

Right. This isn't an accident. It is by design. Also, some numeric types are specifically designed for cross-type comparison:
    >>> int(3) == float(3) == complex(3, 0)
    True
And in Python 2, by design, str and unicode were comparable:
    >>> u'abc' == 'abc'
    True
But the general rule is that objects aren't cross-type comparable by default. We have to specifically enable that behavior when we think it universally makes sense. The modern trend is to avoid cross-type comparability, enums and data classes for example:
    >>> from enum import Enum
    >>> from dataclasses import make_dataclass
    >>> Furniture = Enum('Furniture', ('table', 'chair', 'couch'))
    >>> HTML = Enum('HTML', ('dl', 'ol', 'ul', 'table'))
    >>> Furniture.table == HTML.table
    False
    >>> A = make_dataclass('A', 'x')
    >>> B = make_dataclass('B', 'x')
    >>> A(10) == B(10)
    False
Bytes and str are not comparable in Python 3:
    >>> b'abc' == 'abc'
    False
In terms of API, it might look that way. But in terms of use cases, they are less alike: lists are for looping, tuples are for nonhomogeneous fields. Lists are like database tables; tuples are like records in the database. Lists are like C arrays; tuples are like structs. On balance, I think more harm than good would result from making sequence equality not depend on type. Also, when needed, it isn't difficult to be explicit that you're converting to a common type to focus on contents:
    >>> s = bytes([10, 20, 30])
    >>> t = (10, 20, 30)
    >>> list(s) == list(t)
    True
When you think about it, it makes sense that a user gets to choose whether equality is determined by contents or by contents and type. For some drinkers, a can of beer is equal to a bottle of beer; for some drinkers, they aren't equal at all ;-)
Lastly, when it comes to containers, they each get to make their own rules about what is equal. Dicts compare on contents regardless of order, but OrderedDict requires that the order matches.
Raymond
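The dict versus OrderedDict rule mentioned at the end can be checked directly. A small sketch with illustrative variable names:

```python
from collections import OrderedDict

plain_ab = {"a": 1, "b": 2}
plain_ba = {"b": 2, "a": 1}
ordered_ab = OrderedDict([("a", 1), ("b", 2)])
ordered_ba = OrderedDict([("b", 2), ("a", 1)])

print(plain_ab == plain_ba)      # True: dicts ignore insertion order
print(ordered_ab == ordered_ba)  # False: two OrderedDicts compare order too
print(ordered_ab == plain_ba)    # True: OrderedDict vs dict ignores order
```

Each container type decides for itself what equality means, which is the same freedom that lets set and frozenset compare equal while list and tuple do not.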

Raymond Hettinger wrote:
`(frozenset() == set()) is True` shocked me. According to wikipedia https://en.wikipedia.org/wiki/Equality_(mathematics): "equality is a relationship between two quantities or, more generally two mathematical expressions, asserting that the quantities have the same value, or that the expressions represent the same mathematical object." If lists and tuples are considered different "mathematical objects" (different types), they cannot be considered equal --though they can be equivalent, for instance `([1, 2, 3] == list((1, 2, 3)) and tuple([1, 2, 3]) == (1, 2, 3)) is True`. I can only explain `(frozenset() == set()) is True` vs `(list() == tuple()) is False` if:
a) `frozenset`s and `set`s are considered the same "mathematical objects". So immutability vs mutability is not a relevant feature in Python equality context. Then, `list() == tuple()` should be `True` if no other feature distinguishes lists from tuples, I suppose...
b) language designers found `(frozenset() == set()) is True` convenient (why?). Then, why is `(list() == tuple()) is True` not just as convenient?
c) it is a bug and `frozenset() == set()` should be `False`.

On Tue, May 05, 2020 at 09:34:28AM -0000, jdveiga@gmail.com wrote:
`(frozenset() == set()) is True` shocked me.
According to wikipedia https://en.wikipedia.org/wiki/Equality_(mathematics): "equality is a relationship between two quantities or, more generally two mathematical expressions, asserting that the quantities have the same value, or that the expressions represent the same mathematical object."
There is no good correspondence between "mathematical objects" and types. Even in mathematics, it is not clear whether the integer 1 is the same mathematical object as the real number 1, or the complex number 1, or the quaternion 1. In Python, we usually say that if a type is part of the numeric tower ABC, then instances with the same numeric value should be considered equal even if they have different types. But that's not a hard rule, just a guideline. And it certainly shouldn't be used as a precedent implying that non-numeric values should behave the same way. If you are looking for a single overriding consistent principle for equality in Python, I think you are going to be disappointed. Python does not pretend to be a formal mathematically consistent language and the only principle for equality in Python is that equality means whatever the object's `__eq__` method wants it to mean.
List and tuple are distinguished by the most important feature of all: the designer's intent. Tuples are records or structs, not frozen lists, which is why they are called tuple not frozen list :-) even if people use them as a de facto frozen list. On the other hand, frozensets are frozen sets, which is why they compare equal. Does this make 100% perfectly logical sense? Probably not. But it doesn't have to. Lists and tuples are considered to be independent kinds of thing, while sets and frozensets are considered to be fundamentally the same kind of thing differentiated by mutability. (In hindsight, it might have been more logically clear if mutable sets inherited from immutable frozensets, but we missed the chance to do that.) -- Steven
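To illustrate the point that equality is whatever `__eq__` says it is, here is a minimal sketch of a hypothetical wrapper class (SameItems is not part of Python or of the proposal) that opts in to item-wise comparison with any iterable, without changing list or tuple themselves.

```python
class SameItems:
    """Hypothetical wrapper that compares equal to any iterable with the same items."""

    def __init__(self, items):
        self.items = list(items)

    def __iter__(self):
        return iter(self.items)

    def __eq__(self, other):
        try:
            return self.items == list(other)
        except TypeError:
            # other is not iterable, so let Python fall back to its default.
            return NotImplemented

    # Defining __eq__ without __hash__ makes instances unhashable.


wrapped = SameItems([1, 2, 3])

print(wrapped == (1, 2, 3))   # True: our __eq__ is used
print((1, 2, 3) == wrapped)   # True: tuple declines, so the reflected __eq__ runs
print(wrapped == [1, 2])      # False: different items
print(wrapped == 5)           # False: not iterable
```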

Steven D'Aprano wrote:
Thanks for your reply. I do not expect any kind of full correspondence between mathematical objects and programming objects. I am just reasoning by analogy and trying to understand how lists and tuples cannot be equal while frozensets and sets can be, on similar grounds. I am asking more than answering. Designers' intent is an admissible answer, of course. A cat and a dog can be equal if equality is defined as "having the same name". However, designers' intent is one thing, and users' understanding is another. From your words, I have learnt that --from the designers' point of view-- tuples are different from lists in their nature while sets and frozensets are mostly the same kind of thing --roughly speaking of course... I wonder if users share that view. I feel that it is not unreasonable to expect that frozenset and set cannot be equal on the grounds that they are different types (as tuples and lists are different types too). From that perspective, equality on tuples / lists and frozensets / sets should follow similar rules. That they do not is surprising. That is all. However, if sets and frozensets "are considered to be fundamentally the same kind of thing differentiated by mutability", as you said, why not tuples and lists? And that is, I guess, the reasoning behind the proponent's claim. What if the difference between tuples and lists is not so deep or relevant and they just differ in mutability? Asking again...

On 6/05/20 2:22 am, jdveiga@gmail.com wrote:
I think that can be answered by looking at the mathematical heritage of the types involved:
    Python      Mathematics
    ------      -----------
    set         set
    frozenset   set
    tuple       tuple
    list        sequence
Sets and frozensets are both modelled after mathematical sets, so to me at least it's not surprising that they behave very similarly, and are interchangeable for many purposes. To a mathematician, however, tuples and sequences are very different things. Python treating tuples as sequences is a "practicality beats purity" kind of thing, not to be expected from a mathematical point of view. -- Greg

On Wed, 6 May 2020 at 01:41, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I don't think it is accurate to present that as the view of "a mathematician". The top voted answer here disagrees: https://math.stackexchange.com/questions/122595/whats-the-difference-between... "A sequence requires each element to be of the same type. A tuple can have elements with different types." The common usage for both is: you have a tuple of (Z, +) representing the Abelian group of addition (+) on the integers (Z), whereas you have the sequence {1/n}_{n \in N} converging to 0 in the space Q^N (rational infinite sequences) for example. I'd say the difference is just one of semantics and as a mathematician I would consider tuples and sequences as "isomorphic". In fact, the set-theoretical construction of tuples as functions is *identical* to the usual definition of sequences: i.e. they are just two interpretations of the same object depending on your point of view.

On Wed, May 06, 2020 at 02:58:01AM +0100, Henk-Jaap Wagenaar wrote:
Are you saying that you can't have a sequence that alternates between ints and rationals, say, or ints and surds (reals)? The sequence A_n = sqrt(n) from n=0 starts off int, int, real, ... so there is that. For what it's worth, Wolfram MathWorld disagrees with both Greg's comment and the stackexchange answer, stating that a tuple is just a synonym for a list, and that both lists and sequences are ordered sets: https://mathworld.wolfram.com/n-Tuple.html https://mathworld.wolfram.com/List.html https://mathworld.wolfram.com/Sequence.html
One can come up with many other usages. I think a far more common use for tuples are the ordered pairs used for coordinates: (1, 2) So although tuples are ordered sets, and sequences are ordered sets, the way they are used is very different. One would not call the coordinate (1, 2) a sequence 1 followed by 2, and one would not normally consider a sequence such as [0, 2, 4, 6, 8, ...] to be a tuple. In normal use, a tuple is considered to be an atomic[1] object (e.g. a point in space), while a sequence is, in a sense, a kind of iterative process that has been reified.
I'd say the difference is just one of semantics
The difference between any two things is always one of semantics.
Many things are isomorphic. "Prime numbers greater than a googolplex" are isomorphic to the partial sums of the sequence 1/2 − 1/4 + 1/8 − 1/16 + ⋯ = 1/3 but that doesn't mean you could use 1/2 * 1/4 as your RSA public key :-) [1] I used that term intentionally, since we know that if you hit an atom hard enough, it ceases to be indivisible and can split apart :-) -- Steven

On Wed, May 06, 2020 at 07:15:22PM +1200, Greg Ewing wrote:
Oh I'm not agreeing with them, I'm just pointing out that the people who hang around math.stackexchange and the people who write for Mathworld don't agree. It is difficult to capture all the nuances of common usage in a short definition. Based purely on dictionary definitions, 'The Strolling Useless' is precisely the same meaning as 'The Walking Dead' but no native English speaker would confuse the two, and I'm pretty sure that few mathematicians would call the origin of the Cartesian Plane "a sequence" even if it does meet the definition perfectly :-) -- Steven

TL;DR: the maths does not matter. Programming language (design)/computer science/data structures should lead this discussion! Also, -1 on this proposal, -1000 on having it apply to strings. Feel free to read on if you want to hear some ramblings of somebody who does not get to use their academic knowledge of maths enough seeing an opportunity... On Wed, 6 May 2020 at 07:18, Steven D'Aprano <steve@pearwood.info> wrote:
That's a sequence in the reals (or algebraics or some other set that contains square roots), of which a subsequence also happens to live in the integers. A square is still a rectangle.
These two above pertain to data structures in computer science, not mathematics. An "ordered set" is not a mathematical term I have ever come across, but if it is one, it means exactly what they define as a sequence (though you would have to extend it to infinite sequences to allow infinite ordered sets):
The notation (a_1, a_2, ..., a_n) is the same as saying it is a sequence in some set X^n (if not given an X, setting X = {a_1, ..., a_n} works. Is that cheating? Yes. Is that a problem in set-theoretic mathematics? Not in this case anyway).
I would call that an ordered pair, or, a sequence of length 2.
I would not use the word "tuple". In my experience, tuple in mathematics (not computer science!) is only used in the way I described it: to gather up the information about a structure into one object, so that we can say it exists. Existing means some kind of set exists, so we need to somehow codify, e.g. for addition on the integers, both the addition and the integers, i.e. combine two wholly different things into one 2-sequence: (Z, +). Note that such structures might require infinite tuples, e.g. if they support infinitely many operators. Anyway, this is where the Stack Exchange answer comes from: in common parlance, tuples are used for sequences whose coordinates are not all in the same "space", and sequences for things that have all coordinates in the same "space".
You can construct a sequence (or tuple) iteratively, but whether you do or not has no bearing on the end result. Also, tuples are very much not atomic in the mathematical sense. I would also like to note when you say "a tuple is considered to be an atomic[1] object (e.g. a point in space)", then to a mathematician, A_n = 1/sqrt(n) for n = 0, 1, ... is simply a point in space too: just the space of sequences over the reals. Mathematicians (generally, in the field of foundational logic it is a tad different) don't tend to be concerned with differences such as how you define an object (just need to make sure it exists), whether things are finite or infinite, specified or unspecified. Unfortunately, in real life, in a programming language, we do have to care about these things.
But in this case, it does mean that claiming that the mathematical point of view is that tuples and lists are different, because they are based on tuples and sequences in mathematics, is flawed. An alpha-tuple is the same as an alpha-sequence in mathematics and is an element of X^alpha for some X. That's not a random isomorphism, that's a canonical one, which is a big difference from your isomorphism. I strongly oppose any idea that mathematics supports a distinction between tuples and lists; however, my main point is that it does not even matter. Python should not be led by mathematical conventions/structures in this decision or in others, and it has not been in the past: I can have a set {0} in Python and another {0} and they will be different objects, whereas in mathematics that makes no sense: those two sets are the same. Similarly, programming languages are not mathematics and mathematics is not a programming language, and when it comes to comparing things as equal, using mathematics as an example is a losing battle. Python should instead look towards computer science, other programming languages and actual use cases to determine what is best. On a personal note, I prefer the current behaviour, e.g. a tuple will never compare equal to a list, not because it is consistent with any mathematical property, but because I think it is a good design for Python/a programming language. On Wed, 6 May 2020 at 08:05, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I would genuinely be interested to be pointed to some examples of tuples being used in a different way in mathematics, not computer science (e.g. the links by Steven were in computer science data structures).
So... say you are solving a problem in 1d, you do that on a real number x, right? Now you solve it in 2d, so you do your work on a pair (x, y), then you might solve it in 3d and do your work on a triplet (x, y, z). A few days later you generalize it to n-dimensions and you get a *sequence* (I would not use the word tuple here, nor have I ever seen it used, though I do not dispute some might) (x_1, ..., x_n) that you work on. Then, a few days later you generalize it to infinite sequences (x_1, x_2, ...). Surely we are done now? No way, we can do that again, so we can make it instead a doubly infinite sequence (x_1, x_2, ..., x_omega, x_(omega + 1), ..., x_(2 * omega)) and so on (we are now inducting over the ordinals slowly but surely, see https://en.wikipedia.org/wiki/Ordinal_number#Transfinite_sequence). Anyway, my point is that, to me, all those objects, from x, to (x, y, z), all the way to (x_1, x_2, ..., x_omega, x_(omega + 1), ..., x_(2 * omega)) look the same: they are all {x_n}_{n in I} for different index sets I and all with coordinates in X = R, the real numbers. To a programming language/programmer/computer scientist these are all quite different, and require different implementations, restrictions and ideas. It makes no difference to a mathematician how long it is, or even whether "long" makes sense (e.g. you can interpret a function f: R -> R as a sequence {f(x)}_{x in R} instead, which sometimes might be more insightful than the other interpretation).
Labelling the elements of a tuple? Again, in my view programming/computer science speak coming in again: that does not really make sense when it comes to mathematics. Once again, my point is that mathematics does not matter: the data structure that is a tuple in Python does not correspond to a tuple in mathematics, and that does not even make sense, because mathematics does not deal with data structures. I think also, in the case/example you are describing, you really want a namedtuple (as I would want to describe structures in mathematics too), not a plain tuple, if the order is not what is important.

On 6/05/20 7:45 pm, Henk-Jaap Wagenaar wrote:
At this point I would say that you haven't created an infinite tuple, you've created an infinite sequence of finite tuples.
Then, a few days later you generalize it to infinite sequences (x_1, x_2, ...).
Now here I would stop and say, wait a minute, what does this proof look like? I'm willing to bet it involves things that assume some kind of intrinsic order to the elements of this "tuple". If it does, and it's an extension to the finite dimensional cases, then I would say you were really dealing with sequences, not tuples, right from the beginning. Now I must admit I was a bit hesitant about writing that statement, because in quantum theory, for example, one often deals with vector spaces having infinitely many dimensions. You could consider an element of such a space as being an infinite tuple. However, to even talk about such an object, you need to be able to write formulas involving the "nth element", and those formulas will necessarily depend on the numerical value of n. This gives the elements an intrinsic order, and they will have relationships to each other that depend on that order. This makes the object more like a sequence than a tuple. Contrast this with, for example, a tuple (x, y, z) representing coordinates in a geometrical space. There is no inherent sense in which the x coordinate comes "before" the y coordinate; that's just an accident of the order we chose to write them down in. We could have chosen any other order, and as long as we were consistent about it, everything would still work. This, I think, is the essence of the distinction between tuples and sequences in mathematics. Elements of sequences have an inherent order, whereas elements of a tuple have at best an arbitrarily-imposed order. -- Greg

Greg Ewing wrote:
However, in Python, tuples and lists are both sequences, ordered sets of elements. So it is not completely unreasonable to see them as Ahmed Amr is proposing: that is, as types so similar that you can expect them to be equal if they have the same elements. (Like frozensets and sets in the "set type" domain). Indeed, tuples and lists are equivalent in Python: `(list() == list(tuple()) and tuple(list()) == tuple()) is True`. Do not misunderstand me. I agree with the idea that tuples and lists are different by design while frozensets and sets are not (as Steven D'Aprano pointed out in a previous post). But considering tuples and lists as just ordered sets of elements and basing their equality on their elements, not on their type, is an appealing idea. I think that some Pythonistas would not disagree. A different thing is the practicality of this.

On 6/05/20 1:58 pm, Henk-Jaap Wagenaar wrote:
Maybe the small subset of mathematicians that concern themselves with trying to define everything in terms of sets, but I don't think the majority of mathematicians think like that in their everyday work. It's certainly at odds with the way I see tuples and sequences being used in mathematics. As well as the same type vs. different types thing, here are some other salient differences:
- Infinite sequences make sense, infinite tuples not so much.
- Sequences are fundamentally ordered, whereas tuples are not ordered in the same sense. Any apparent ordering in a tuple is an artifact of the way we conventionally write them. If we were in the habit of labelling the elements of a tuple and writing things like (x:1, y:2, z:3) then we wouldn't have to write them in any particular order -- (y:2, x:1, z:3) would be the same tuple.
-- Greg

On 5/6/20 3:04 AM, Greg Ewing wrote:
In my mind, tuples and lists seem very different concepts that just happen to work similarly at a low level (and because of that, are sometimes 'misused' as each other because it happens to 'work'). To me, tuples are things where the position of the thing very much matters: you understand the meaning of the Nth element of a tuple because it IS the Nth element of the tuple. It isn't so important that the Nth element is after the (N-1)th element, so we could define our universe of tuples in a different order and it might still make sense, but we would then need to reorder ALL the tuples of that type. A coordinate makes a great example of a tuple: we think of the 1st element of the coordinate as 'X' due to convention, and in the tuple it gets its meaning from its position in the tuple. A list, on the other hand, is generally not thought of in that way. A list might not be ordered, or it might be, and maybe there is SOME value in knowing that an item is the Nth on the list, but if it is an ordered list, it is generally more meaningful to think of the Nth item in relation to the (N-1)th and (N+1)th items. Adding an element to a tuple generally doesn't make sense (unless it is transforming it to a new type of tuple, like from 2d to 3d), but generally adding an item to a list does. This makes their concepts very different. Yes, you might 'freeze' a list by making it a tuple so it becomes hashable, but then you are really thinking of it as a 'frozen list', not really a tuple. And there may be times you make a mutable tuple by using a list, but then you are thinking of it as a mutable tuple, not a list. And these are exceptional cases, not the norm. -- Richard Damon

On May 6, 2020, at 05:22, Richard Damon <Richard@damon-family.org> wrote:
I think this thread has gotten off track, and this is really the key issue here. If someone wants this proposal, it’s because they believe it’s _not_ a misuse to use a tuple as a frozen list (or a list as a mutable tuple). If someone doesn’t want this proposal, the most likely reason (although admittedly there are others) is because they believe it _is_ a misuse to use a tuple as a frozen list.
It’s not always a misuse; it’s sometimes perfectly idiomatic to use a tuple as an immutable hashable sequence. It doesn’t just happen to 'work', it works, for principled reasons (tuple is a Sequence), and this is a good thing.[1] It’s just that it’s _also_ common (probably a lot more common, but even that isn’t necessary) to use it as an anonymous struct.
So, the OP is right that (1,2,3)==[1,2,3] would sometimes be handy, the opponents are right that it would often be misleading, and the question isn’t which one is right, it’s just how often is often. And the answer is obviously: often enough that it can’t be ignored. And that’s all that matters here.
And that’s why tuple is different from frozenset. Very few uses of frozenset are as something other than a frozen set, so it’s almost never misleading that frozensets equal sets; plenty of tuples aren’t frozen lists, so it would often be misleading if tuples equaled lists.
[1] If anyone still wants to argue that using a tuple as a hashable sequence instead of an anonymous struct is wrong, how would you change this excerpt of code:
    memomean = memoize(mean, key=tuple)
    def player_stats(player):
        # …
        … = memomean(player.scores) …
        # …
player.scores is a list of ints, and a new one is appended after each match, so a list is clearly the right thing. But you can’t use a list as a cache key. You need a hashable sequence of the same values. And the way to spell that in Python is tuple. And that’s not a design flaw in Python, it’s a feature. (Shimmer is a floor wax _and_ a dessert topping!)
Sure, when you see a tuple, the default first guess is that it’s an anonymous struct—but when it isn’t, it’s usually so obvious from context that you don’t even have to think about it. It’s confusing a lot less often than, say, str, and it’s helpful a lot more often.
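The excerpt in the footnote calls a memoize helper that the message does not define. A minimal sketch of what such a helper could look like, assuming a single-argument function and a caller-supplied key function; the implementation and the exposed cache attribute are hypothetical, and statistics.mean stands in for the mean being cached.

```python
from statistics import mean


def memoize(func, key):
    """Hypothetical helper: cache func(arg) under the hashable value key(arg)."""
    cache = {}

    def wrapper(arg):
        cache_key = key(arg)
        if cache_key not in cache:
            cache[cache_key] = func(arg)
        return cache[cache_key]

    wrapper.cache = cache  # exposed only so the example can inspect it
    return wrapper


memomean = memoize(mean, key=tuple)

scores = [10, 20, 30]      # a list, because new scores keep being appended
first = memomean(scores)   # computed and stored under the key (10, 20, 30)
again = memomean(scores)   # served from the cache

scores.append(40)
updated = memomean(scores)  # new key (10, 20, 30, 40), so it is recomputed

print(first, again, updated)
print(sorted(memomean.cache))
```

Here the tuple is used purely as a hashable snapshot of the list's current contents, which is the "frozen list" use the message defends.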

On Fri, 8 May 2020 17:40:31 -0700 Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
That's a good summary. Thank you. :-)
Very clever. Then again, it wouldn't be python-ideas if it were that simple! "hashable sequence of the same values" is too strict. I think all memoize needs is a key function such that if x != y, then key(x) != key(y).
    def key(scores):
        return ','.join(str(-score * 42) for score in scores)
    memomean = memoize(mean, key=key)
    def player_stats(player):
        # …
        … = memomean(player.scores) …
        # …
Oh, wait, even that's too strict. All memoize really needs is if mean(x) != mean(y), then key(x) != key(y):
    memomean = memoize(mean, key=mean)
    def player_stats(player):
        # …
        … = memomean(player.scores) …
        # …
But we won't go there. ;-)

On May 8, 2020, at 20:36, Dan Sommers <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
I don’t think it’s particularly clever. And that’s fine—using common idioms usually is one of the least clever ways to do something out of the infinite number of possible ways. Because being intuitively the one obvious way tends to be important to becoming an idiom, and it tends to run counter to being clever. (Being concise, using well-tested code, and being efficient are also often important, but being clever doesn’t automatically give you any of those.)
Well, it does have to be hashable. (Unless you’re proposing to also replace the dict with an alist or something?) I suppose it only needs to be a hashable _encoding_ of a sequence of the same values, but surely the simplest encoding of a sequence is the sequence itself, so, unless “hashable sequence” is impossible (which it obviously isn’t), who cares?
def key(scores): return ','.join(str(-score * 42) for score in scores)
This is still a sequence. If you really want to get clever, why not:
    def key(scores):
        return sum(prime**score for prime, score in zip(calcprimes(), scores))
But this just demonstrates why you don’t really want to get clever. It’s more code to write, read, and debug than tuple, easier to get wrong, harder to understand, and almost certainly slower, and the only advantage is that it deliberately avoids meeting a requirement that we technically didn’t need but got for free.
Well, it seems pretty unlikely that calculating the mean to use it as a cache key will be more efficient than just calculating the mean, but hey, if you’ve got benchmarks, benchmarks always win. :) (In fact, I predicted that memoizing here would be a waste of time in the first place, because the only players likely to have equal score lists to earlier players would be the ones with really short lists—but someone wanted to try it anyway, and he was able to show that it did speed up the script on our test data set by something like 10%. Not nearly as much as he’d hoped, but still enough that it was hard to argue against keeping it.)
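The memoize-with-a-key idea discussed in this exchange can be sketched as follows. This is only an illustration: the thread never shows how `memoize` is implemented, so the helper below (its signature, the `cache` attribute, and the default `key=tuple`) is an assumption, not the code that was actually used.

```python
from statistics import mean

def memoize(func, key=tuple):
    """Cache func's results, looking them up by key(argument).

    Hypothetical helper: the default key=tuple turns a list of scores
    into a hashable sequence of the same values.
    """
    cache = {}

    def wrapper(arg):
        k = key(arg)
        if k not in cache:
            cache[k] = func(arg)
        return cache[k]

    wrapper.cache = cache  # exposed only so the example can inspect it
    return wrapper

memomean = memoize(mean)

first = memomean([10, 12, 14])   # computed, stored under (10, 12, 14)
second = memomean([10, 12, 14])  # a different list object, same key
print(first, second, len(memomean.cache))
```

Using `tuple` as the key keeps the cache key readable and costs one pass over the scores, which is the trade-off argued for above.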

Thanks Andrew for the excellent analysis quoted below. Further comments interleaved with yours. On Fri, May 08, 2020 at 05:40:31PM -0700, Andrew Barnert via Python-ideas wrote:
I don't think it is necessary to believe that it is *always* misuse, but only that it is *often* misuse and therefore `==` ought to take the conservative position and refuse to guess. I expect that nearly every Python programmer of sufficient experience has used a tuple as a de facto "frozen list" because it works and practicality beats purity. But that doesn't mean that I want my namedtuple PlayerStats(STR=10, DEX=12, INT=13, CON=9, WIS=8, CHR=12) to compare equal to my list [10, 12, 13, 9, 8, 12] by default.
Yes, I think there's a genuine need here.
-- Steven

On Tue, May 5, 2020 at 7:36 AM Raymond Hettinger < raymond.hettinger@gmail.com> wrote:
Right, that's what I'm referring to. If you're comparing two things which are meant to represent completely different entities (say, comparing a record to a table) then your code is probably completely broken (why would you be doing that?) and having equality return False isn't going to fix that. Conversely I can't see how returning True could break a program that would work correctly otherwise. If you're comparing a list and a tuple, and you haven't completely screwed up, you probably mean to compare the elements and you made a small mistake, e.g. you used the wrong brackets, or you forgot that *args produces a tuple.

On Sat, May 2, 2020 at 10:36 PM Guido van Rossum <guido@python.org> wrote:
It does look like that would violate a basic property of `==` -- if two values compare equal, they should be equally usable as dict keys.
It's certainly a reasonable property, but I don't think it's critical. By comparison, if it was the case that `(1, 2, 3) == [1, 2, 3]` and `hash((1, 2, 3)) != hash([1, 2, 3])` were both True without raising exceptions, that would be a disaster and lead to awful bugs. The equality/hash contract is meant to protect against that.
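A small sketch of the equality/hash contract as it stands today, using only built-in and standard-library types: numbers that compare equal also hash equally, so they are interchangeable as dict keys, while a list has no hash at all.

```python
from fractions import Fraction

# Equal numbers of different types hash equally.
assert 1 == 1.0 == Fraction(1, 1)
assert hash(1) == hash(1.0) == hash(Fraction(1, 1))

table = {1: "one"}
found = table[1.0]  # finds the entry stored under the int 1

# A list is unhashable, so it cannot be used as a key at all.
try:
    {[1, 2, 3]: 4}
    list_key_error = None
except TypeError as exc:
    list_key_error = exc

print(found, type(list_key_error).__name__)
```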
I can't think of any counterexamples.
I think it's reasonable that this change would introduce counterexamples where none previously existed, as we would be changing the meaning of ==. Although since writing this Dominik gave the frozenset example. I also think it'd be possible to have a data model where `{(1, 2, 3): 4}[[1, 2, 3]]` does work. You'd need a way to calculate a hash if you promised to use it only for `__getitem__`, not `__setitem__`, so you can't store list keys but you can access with them. (this is all just fun theoretical discussion, I'm still not supporting the proposal)

02.05.20 23:32, Alex Hall wrote:
You are probably right. Here is another example: if we make all sequences comparable by content, we would need to make `('a', 'b', 'c') == 'abc'` and `hash(('a', 'b', 'c')) == hash('abc')`. It may be difficult to get the latter, taking hash randomization into account.

Thanks, I do appreciate all the discussion here about that. Initially, I was thinking about having lists/arrays/tuples match the behavior of other instances in Python that compare across their types, like:

1) Sets (instances of set or frozenset) can be compared within and across their types, as Dominik mentioned.
2) Numeric types do compare across their types, along with fractions.Fraction and decimal.Decimal.
3) Binary sequences (instances of bytes or bytearray) can be compared within and across their types.

(All points above are stated in the Python reference at https://docs.python.org/3/reference/expressions.html)

But after the discussion here, I think backward compatibility dominates for sure against that. Thanks!
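For reference, the cross-type comparisons listed in this message behave as follows in current Python, next to the list/tuple case that the proposal wanted to change. The variable names are only for illustration.

```python
from decimal import Decimal
from fractions import Fraction

sets_equal = {1, 2, 3} == frozenset({1, 2, 3})
numbers_equal = 1 == 1.0 == Fraction(1, 1) == Decimal(1)
bytes_equal = b"abc" == bytearray(b"abc")

# Lists and tuples with the same items do not compare equal.
list_tuple_equal = [1, 2, 3] == (1, 2, 3)

print(sets_equal, numbers_equal, bytes_equal, list_tuple_equal)
```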

On 5/3/20 8:40 AM, Ahmed Amr wrote:
I think the issue is that the set/frozenset distinction (and bytes/bytearray) is a much finer distinction than between arbitrary sequence types, as it is primarily just a change of mutability (and hashability), and all the numeric types are really just slightly different abstractions of the same basic set of values (or subsets thereof). The various containers don't have the same concept that they are essentially representing the same 'thing' with just a change in representation to control the sort of numbers they can express and what sort of numeric errors they might contain (so two representations that map to the same abstract number make sense to be equal).

Different types of sequences are more different in what they likely represent, so it is less natural for different sequences of the same value to be thought of as always being 'the same'.

There may be enough cases where that equality is reasonable that having a 'standard' function to perform that comparison might make sense; it just isn't likely to be spelled ==. There are several questions on how to do things that might need to be explored: should the sequence type be ignored recursively or not, i.e. is [1, [2, 3]] the same as (1, (2, 3)) or not, and are strings just another sequence type, or something more fundamental?

This doesn't make it a 'bad' idea, just a bit more complicated and in need of exploration.

-- Richard Damon
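The open questions above (recursive or not, and whether strings count as sequences) can be made concrete with a small sketch. The function name `seq_equal`, the `recursive` flag, and the decision to treat str, bytes and bytearray as atoms are all assumptions chosen for illustration; nothing like this exists in the standard library.

```python
from collections.abc import Sequence

def seq_equal(a, b, recursive=False):
    """Compare two sequences by position, ignoring the container type.

    Strings and bytes-like objects are treated as atoms (an assumption)
    and compared with plain ==.
    """
    atoms = (str, bytes, bytearray)
    both_sequences = (
        isinstance(a, Sequence) and isinstance(b, Sequence)
        and not isinstance(a, atoms) and not isinstance(b, atoms)
    )
    if not both_sequences:
        return a == b
    if len(a) != len(b):
        return False
    if recursive:
        return all(seq_equal(x, y, recursive=True) for x, y in zip(a, b))
    return all(x == y for x, y in zip(a, b))

print(seq_equal([1, [2, 3]], (1, (2, 3))))                  # nested types differ
print(seq_equal([1, [2, 3]], (1, (2, 3)), recursive=True))  # nested types ignored
print(seq_equal(("a", "b", "c"), "abc"))                    # string kept as an atom
```

Each choice here is a policy decision, which is part of why a single spelling such as == is hard to agree on.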

On Sat, 2 May 2020 at 20:50, Serhiy Storchaka <storchaka@gmail.com> wrote:
This is the key point. Much of the other discussion in this thread seems to be bogged down in the mathematical interpretation of tuples and sequences but if I was to take something from maths here it would be the substitution principle of equality: https://en.wikipedia.org/wiki/Equality_(mathematics)#Basic_properties What the substitution principle essentially says is if x == y then f(x) == f(y) for any function f such that f(x) is well defined. What that means is that I should be able to substitute x for y in any context where x would work without any change of behaviour. We don't need to do any deep maths to see how that principle can be applied in Python but if you try to follow it rigorously then you'll see that there are already counterexamples in the language for example
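The example that originally followed is not preserved in this archive. Judging from the later remark in this thread that `1 == 1.0` was meant as "a simple example", one counterexample of this kind might look like the sketch below; the choice of `repr` and `type` as the function f is an assumption.

```python
x, y = 1, 1.0

equal = x == y                  # the two values compare equal
same_repr = repr(x) == repr(y)  # but f = repr tells them apart
same_type = type(x) is type(y)  # and so does f = type

print(equal, same_repr, same_type)
```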
Given a list x and a tuple y with equivalent elements, x and y will not be interchangeable because one is not hashable and the other is not mutable, so there are functions where one is usable but the other is not. Following the same reasoning, set/frozenset should not compare equal.

In SymPy there are many different mathematical objects that people feel should (on mathematical grounds) compare "equal". This happens enough that there is a section explaining this in the tutorial: https://docs.sympy.org/latest/tutorial/gotchas.html#equals-signs

The terms "structural equality" and "mathematical equality" are used to distinguish the different kinds of equality, with == being used for the structural sense. For example, the sympy expression Pow(2, 2, evaluate=False) gives an object that looks like 2**2. This does mathematically represent the number 4, but the expression itself is not literally the number 4, so the two expressions are mathematically equal but not structurally equal:
This distinction is important because at the programmatic level p and 4 are not interchangeable. For example p being a Pow has attributes base and exp that 4 will not have. In sympy most objects are immutable and hashable and are heavily used in sets and dicts. Following the substitution principle matters not least because Python has baked the use of ==/__eq__ into low-level data structures so objects that compare equal with == will literally be interchanged:
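The interchange described here can be seen with plain built-in types, without SymPy. The sketch below is not the example from the original message (which is not preserved in this archive); it only shows that sets and dicts keep whichever equal object arrived first.

```python
values = {1, 1.0, True}  # all three compare equal and hash equally
survivor = next(iter(values))

cache = {}
cache[1] = "computed for the int 1"
cache[1.0] = "computed for the float 1.0"  # replaces the value, keeps the key 1

print(values, type(survivor).__name__)
print(cache)
```

A cache keyed this way hands back the result stored for whichever equal object was seen last, which is the substitution the message warns about.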
All the same many sympy contributors have felt the need to define __eq__ methods that will make objects of different types compare equal and there are still examples in the sympy codebase. These __eq__ methods *always* lead to bugs down the line though (just a matter of time). I've come to the seemingly obvious conclusion that if there is *any* difference between x and y then it's always better to say that x != y. Oscar

On Thu, May 7, 2020 at 10:33 AM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
I've come to the seemingly obvious conclusion that if there is *any* difference between x and y then it's always better to say that x != y.
And having worked in languages where floats and integers are fundamentally different beasts, I disagree: it is extremely practical (even if not pure) to have them compare equal. SourcePawn (the language generally used for modding games like Counter-Strike) is strongly-typed and does not allow floats and ints to be used interchangeably - except that you can do arithmetic and they'll be type-folded. So if you have a function TakeDamage that expects a floating-point amount of damage, and another function GetHealth that returns the player's health as an integer, you have to add 0.0 to the integer before it can be used as a float. Actual line of code from one of my mods: SDKHooks_TakeDamage(client, inflictor, attacker, GetClientHealth(client) + 0.0, 0, weapon); Every language has to choose where it lands on the spectrum of "weak typing" (everything can be converted implicitly) to "strong typing" (explicit conversions only), and quite frankly, both extremes are generally unusable. Python tends toward the stricter side, but with an idea of "type" that is at times abstract (eg "iterable" which can cover a wide variety of concrete types); and one of those very important flexibilities is that numbers that represent the same value can be used broadly interchangeably. This is a very good thing. ChrisA

I'm afraid, Oscar, that you seem to have painted yourself into a reductio ad absurdum. We need a healthy dose of "practicality beats purity" thrown in here. What the substitution principle essentially says is
I'm very happy to agree that "but id() isn't the kind of function I meant!" That's the point though. For *most* functions, the substitution principle is fine in Python. A whole lot of the time, numeric functions can take either an int or a float that are equal to each other and produce results that are equal to each other. Yes, I can write something that will sometimes overflow for floats but not ints. Yes, I can write something where a rounding error will pop up differently between the types. But generally, numeric functions are "mostly the same most of the time" with float vs. int arguments. This doesn't say whether tuple is as similar to list as frozenset is to set. But the answer to that isn't going to be answered by examples constructed to deliberately obtain (non-)substitutability for the sake of argument. -- The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.

On Thu, 7 May 2020 at 02:07, David Mertz <mertz@gnosis.cx> wrote:
That's the point though. For *most* functions, the substitution principle is fine in Python. A whole lot of the time, numeric functions can take either an int or a float that are equal to each other and produce results that are equal to each other. Yes, I can write something that will sometimes overflow for floats but not ints. Yes, I can write something where a rounding error will pop up differently between the types. But generally, numeric functions are "mostly the same most of the time" with float vs. int arguments.
The question is whether you (or Chris) care about calculating things accurately with floats or ints. If you do try to write careful code that calculates things for one or the other you'll realise that there is no way to duck-type anything nontrivial because the algorithms for exact vs inexact or bounded vs unbounded arithmetic are very different (e.g. sum vs fsum). If you are not so concerned about that then you might say that 1 and 1.0 are "acceptably interchangeable". Please understand though that I am not proposing that 1==1.0 should be changed. It is supposed to be a simple example of the knock on effect of defining __eq__ between non-equivalent objects.
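The sum vs fsum difference mentioned above can be illustrated with a short sketch. It uses an explicit loop for the naive float addition, because the accuracy of the built-in `sum()` on floats has varied between Python versions.

```python
import math

values = [0.1] * 10

naive = 0.0
for v in values:
    naive += v             # plain repeated float addition accumulates error

exact = math.fsum(values)  # tracks partial sums to avoid losing precision

int_total = 0
for n in [1] * 10:
    int_total += n         # integer addition is exact, no special care needed

print(naive == exact, exact, int_total)
```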
This doesn't say whether tuple is as similar to list as frozenset is to set. But the answer to that isn't going to be answered by examples constructed to deliberately obtain (non-)substitutability for the sake of argument.
Those examples are not for the sake of argument: they are simple illustrations. I have fixed enough real examples of bugs relating to this to come to the conclusion that making non-interchangeable objects compare equal with == is an attractive nuisance. It seems useful when you play with toy examples in the REPL but isn't actually helpful when you try to write any serious code. This comes up particularly often in sympy because: 1. Many contributors strongly feel that A == B should "do the right thing" (confusing structural and mathematical equality) 2. Many calculations in sympy are cached and the cache can swap A and B if A == B. 3. There are a lot of algorithms that make heavy use of ==. The issues are the same elsewhere though: gratuitously making objects compare equal with == is a bad idea unless you are happy to substitute one for the other. Otherwise what is the purpose of having them compare equal in the first place? Oscar

On Wed, May 6, 2020 at 10:26 PM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Sure. But a great many things I calculate are not particularly exact. If I want the mean of about a hundred numbers that are each somewhere in the interval [1, 1e6], I'm probably not very interested in 1 ulp errors in 64-bit floating point. And when I *do* care about being exact, I can either cast the arguments to the appropriate type or raise an exception for the unexpected type. If my function deals with primes of thousands of digits, int is more appropriate. But maybe I want a Decimal of some specific precision. Or a Fraction. Or maybe I want to use gmpy as an external type for greater precision. If it's just `x = myfavoritetype(x)` as the first line of the function, that's easy to do.
Yeah, sometimes. But not nearly as much of an attractive nuisance as using `==` between two floating point numbers rather than math.isclose() or numpy.isclose(). My students trip over `(0.1+0.2)+0.3 == 0.1+(0.2+0.3)` a lot more often than they trip over `1.0 == 1`.
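The comparison mentioned above can be checked directly. A minimal sketch using only the standard library, with `math.isclose()` at its default tolerances:

```python
import math

left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

exactly_equal = left == right             # float addition is not associative
close_enough = math.isclose(left, right)  # tolerant comparison
int_float_equal = 1.0 == 1

print(exactly_equal, close_enough, int_float_equal)
```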

On Thu, May 7, 2020 at 12:26 PM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
I most certainly DO care about accurate integer calculations, which is one of the reasons I'm very glad to have separate int and float types (ahem, ECMAScript, are you eavesdropping here?). In any situation where I would consider them equivalent, it's actually the float that I want (it's absolutely okay if I have to explicitly truncate a float to int if I want to use it in that context), so the only way they'd not be equivalent is if the number I'm trying to represent actually isn't representable. Having to explicitly say "n + 0.0" to force it to be a float isn't going to change that, so there's no reason to make that explicit. For the situations where things like fsum are important, it's great to be able to grab them. For situations where you have an integer number of seconds and want to say "delay this action by N seconds" and it wants a float? It should be fine accepting an integer.
Definitely not. I'm just arguing against your notion that equality should ONLY be between utterly equivalent things. It's far more useful to allow more things to be equal. ChrisA

On 7/05/20 1:07 pm, David Mertz wrote:
It's not much use for deciding whether two things *should* be equal, though, because whatever your opinion on the matter, you can come up with a set of functions that satisfy it and then say "those are the kinds of functions I mean". Also, as a definition of equality it seems somewhat circular, since if you're not sure whether x == y, you may be equally uncertain whether f(x) == f(y) for some f, x, y. -- Greg

On Thu, 7 May 2020 at 08:54, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
It's not so much a definition of equality as a consistency requirement. The contrapositive can be very clear: if you already know that f(x) and f(y) do different things or return unequal objects then the question of whether x == y is answered. It's important though that it's not just about equality of return types: when you carry the principle over from maths to programming then you need to consider non-pure functions, IO, exceptions being raised etc. In simple situations it is nice to be able to duck-type over lists and tuples but in practice it has to be done carefully by sticking to the sequence or iterable interfaces precisely or by coercing to a known type at the entry points of your code. Once you have a large codebase with lots of objects flying around internally and you no longer know whether anything is a list or a tuple (or a set...) any more it's just a mess. Oscar

On Sat, May 02, 2020 at 05:12:58AM -0000, Ahmed Amr wrote:
I'm going to throw out a wild idea (actually not that wild :-) that I'm sure people will hate for reasons I shall mention afterwards.

Perhaps we ought to add a second "equals" operator? To avoid bikeshedding over syntax, I'm initially going to use the ancient 1960s Fortran syntax and spell it `.EQ.`.

(For the avoidance of doubt, I know that syntax will not work in Python because it will be ambiguous. That's why I picked it -- it's syntax that we can all agree won't work, so we can concentrate on the semantics not the spelling.)

We could define this .EQ. operator as *sequence equality*, defined very roughly as:

    def .EQ. (a, b):
        return len(a) == len(b) and all(x==y for x, y in zip(a, b))

(Aside: if we go down this track, this could be a justification for zip_strict to be a builtin; see the current thread(s) on having a version of zip which strictly requires its input to be equal length.)

The precise details of the operator are not yet clear to me, for instance, should it support iterators or just Sized iterables? But at the very least, it would support the original request:

    [1, 2, 3] .EQ. (1, 2, 3)  # returns True

The obvious operator for this would be `===` but of course that will lead to an immediate and visceral reaction "Argghhh, no, Javascript, do not want!!!" :-)

Another obvious operator would be a new keyword `eq` but that would break any code using that as a variable.

But apart from the minor inconveniences that:

- I don't know what this should do in detail, only vaguely;
- and I have no idea what syntax it should have

what do people think of this idea?

-- Steven
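Since `.EQ.` is deliberately not valid syntax, the rough definition above can be tried out as an ordinary function. The name `sequence_equal` is an assumption made for this sketch; the body is the one given in the message.

```python
import array

def sequence_equal(a, b):
    """Rough sequence equality: same length and pairwise-equal items."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))

print(sequence_equal([1, 2, 3], (1, 2, 3)))

# Double-precision arrays store the same values as Python floats.
print(sequence_equal(array.array("d", [1.1, 2.2, 3.3]), [1.1, 2.2, 3.3]))

# Single-precision arrays round the values on the way in, so items differ.
print(sequence_equal(array.array("f", [1.1, 2.2, 3.3]), [1.1, 2.2, 3.3]))

# Anything sized and iterable is accepted, including a dict (iterates keys).
print(sequence_equal({1: "x", 2: "y"}, [1, 2]))
```

The last call shows why the exact scope (sequences only, or any sized iterable) would need to be settled.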

This reminded me of another recent message so I decided to find that and link it here: https://mail.python.org/archives/list/python-ideas@python.org/message/7ILSYY... It seemed like a more useful thing to do before I discovered that you wrote that too... On Thu, May 7, 2020 at 11:20 AM Steven D'Aprano <steve@pearwood.info> wrote:

On Thu, 7 May 2020 19:11:43 +1000 Steven D'Aprano <steve@pearwood.info> wrote:
Equality and its awkward cousin equivalence are slippery slopes. Just sequences? That (admittedly rough) function returns true for certain mapping arguments.

What about case-insensitive string matching? Is that more common than comparing (or wanting to compare) arbitrary sequences? What about an operator for normalized (in the Unicode sense of the word), case-insensitive string comparison? (There's a precedent: Common Lisp's equalp function does case-insensitive string matching.)

The strongest equality is the "is" operator, and then the == operator, and ISTM that you're now extending this idea to another class of equivalency. The very far ends of that scale are glossing over American vs. British spellings (are "color" and "colour" in some sense equal?), or even considering two functions "the same" if they produce the same outputs for the same inputs.

One of Python's premises and strengths is strong typing; please don't start pecking away at that. Do beginners expect that [1, 2, 3] == (1, 2, 3)? No. Do experts expect that [1, 2, 3] == (1, 2, 3)? No. So who does? Programmers working on certain applications, or with multiple [pre-existing] libraries, or without a coherent design. These all seem like application level (or even design level) problems, or maybe a series of dunder methods / protocols to define various levels of equivalence (the ability of my inbox and my brain to handle the resulting bikeshedding notwithstanding).

YMMV. Just my thoughts.

-- “Atoms are not things.” – Werner Heisenberg Dan Sommers, http://www.tombstonezero.net/dan

On Thu, May 07, 2020 at 06:04:13AM -0400, Dan Sommers wrote:
*shrug* As I point out later in my post, I don't know whether it should be just sequences. Maybe it should be any iterable, although checking them for equality will necessarily consume them. But *right now* the proposal on the table is to support list==tuple comparisons, which this would do. (For an especially vague definition of "do" :-)
What about case-insensitive string matching?
That can be a string method, since it only needs to operate on strings.
The strongest equality is the "is" operator
Please don't encourage the conceptual error of thinking of `is` as *equality*, not even a kind of equality. It doesn't check for equality, it checks for *identity* and we know that there is at least one object in Python where identical objects aren't equal:

    py> from math import nan
    py> nan is nan
    True
    py> nan == nan
    False

[...]
The very far ends of that scale are glossing over American vs. British spellings (are "color" and "colour" in some sense equal?),
YAGNI. The proposal here is quite simple and straightforward, there is no need to over-generalise it to the infinite variety of possible equivalencies that someone might want. People can write their own functions. It is only that wanting to compare two ordered containers for equality of their items without regard to the type of container is a reasonably common and useful thing to do. Even if we don't want list==tuple to return True -- and I don't! -- we surely can recognise that sometimes we don't care about the container's type, only its elements. -- Steven

On Thu, 7 May 2020 21:18:16 +1000 Steven D'Aprano <steve@pearwood.info> wrote:
We'd better agree to disagree on this one.
YAGNI is how I feel about an operator that compares sequences element by element. People can write their own functions. :-) Or add your .EQ. function to the standard library (or even to builtins, and no, I don't have a good name).
Do "reasonably common," "useful," and "sometimes" meet the bar for a new operator? (That's an honest question and not a sharp stick.) FWIW, I agree: list != tuple. When's the last time anyone asked for the next element of a tuple? (Okay, if your N-tuple represents a point in N-space, then you might iterate over the coordinates in order to discover a bounding box.) Dan -- “Atoms are not things.” – Werner Heisenberg Dan Sommers, http://www.tombstonezero.net/dan

On Thu, May 07, 2020 at 11:04:16AM -0400, Dan Sommers wrote:
Why? In what way is there any room for disagreement at all? This isn't a matter of subjective opinion, like what's the best Star Wars film or whether pineapple belongs on pizza. This is a matter of objective fact, like whether Python strings are Unicode or not.

Whatever we might feel about equality and identity in the wider philosophical sense, in the *Python programming sense* the semantic meanings of the two operators are orthogonal:

* some equal objects are not identical;
* and some identical objects are not equal.

It is a matter of fact that in Python `is` tests for object identity, not equality: https://docs.python.org/3/reference/expressions.html#is-not

If you wish to agree with Bertrand Meyer that reflexivity of equality is one of the pillars of civilization: https://bertrandmeyer.com/2010/02/06/reflexivity-and-other-pillars-of-civili... and therefore Python gets equality wrong, you are welcome to that opinion, but whether we like it or not equality in Python is not necessarily reflexive and as a consequence objects may be identical (i.e. the same object) but not equal. Float and Decimal NANs are the most obvious examples.

You don't even need to look at such exotic objects as NANs to see that `is` does not test for equality. None of these will return True:

    [] is []
    1.5 is Fraction(3, 2)
    (a := {}) is a.copy()

even though the operands are clearly equal.

[...]
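Both directions of the point made above (equal but not identical, identical but not equal) can be checked with a short sketch. It binds the objects to names first, which avoids the SyntaxWarning that newer Python versions give for `is` with a literal.

```python
from fractions import Fraction
from math import nan

# Identical but not equal: a NaN is the same object as itself, yet != itself.
identical_not_equal = (nan is nan) and not (nan == nan)

# Equal but not identical: a dict and its copy.
a = {}
b = a.copy()
equal_not_identical = (a == b) and (a is not b)

# Equal across types, and certainly not the same object.
half_float = 1.5
half_fraction = Fraction(3, 2)
cross_type = (half_float == half_fraction) and (half_float is not half_fraction)

print(identical_not_equal, equal_not_identical, cross_type)
```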
YAGNI is how I feel about an operator that compares sequences element by element.
Remember that list-to-list and tuple-to-tuple already perform the same sequence element-by-element comparison. All this proposal adds is *duck-typing* to the comparison, for when it doesn't matter what the container type is, you care only about the values in the container. Why be forced to do a possibly expensive (and maybe very expensive!) manual coercion to a common type just to check the values for equality element by element, and then throw away the coerced object? If you have ever written `a == list(b)` or similar, then You Already Needed It :-)
True, but there are distinct advantages to operators over functions for some operations. See Guido's essay: https://neopythonic.blogspot.com/2019/03/why-operators-are-useful.html
It depends on how common and useful, and how easy it is to find a good operator. It might be a brilliant idea stymied by lack of a good operator. We might be forced to use a function because there are no good operators left any more, and nobody wants Python to turn into Perl or APL.
FWIW, I agree: list != tuple. When's the last time anyone asked for the next element of a tuple?
Any time you have written:

    for obj in (a, b, c): ...

you are asking for the next element of a tuple. A sample from a test suite I just happen to have open at the moment:

    # self.isprime_functions is a tuple of functions to test
    for func in self.isprime_functions:

    for a in (3, 5, 6):
        self.assertFalse(sqrt_exists(a, 7))
    for a in (2, 6, 7, 8, 10):
        self.assertFalse(sqrt_exists(a, 11))

-- Steven

On Fri, May 8, 2020 at 1:06 PM Steven D'Aprano <steve@pearwood.info> wrote:
You yourself introduced, speculatively, the idea of another equality operator, .EQ., that would be "equal in some sense not captured by '=='". I just posted another comment where I gave function names for six plausibly useful concepts of "equality" ... well, technically, equivalence-for-purpose.

The distinction you make seems both pedantic and factually wrong. More flat-footed still is "equal objects are ones whose .__eq__() method returns something truthy." It doesn't actually need to define any of the behaviors we think of as equality/equivalence. I was going to write a silly example of e.g. throwing a random() into the operation, but I don't think I have to for the point to be obvious.

Both '==' and 'is' are ways of saying equivalent-for-a-purpose. For that matter, so is math.isclose() or numpy.allclose(). Or those json-diff libraries someone just linked to.

Given that different Python implementations will give different answers for 'some_int is some_other_int' where they are "equal" in an ordinary sense, identity isn't anything that special in most cases. Strings are likewise sometimes cached (but differently by version and implementation). The only cases where identity REALLY has semantics I would want to rely on are singletons like None and True, and I guess for custom mutable objects when you want to make sure which state is separated versus shared. Well, OK, I guess lists are an example of that already for the same reason.

For non-singleton immutables, identity is not really a meaningful thing. I mean, other than in a debugger or code profiler, or something special like that. I honestly do not know whether, e.g. '(1, "a", 3.5) is (1, "a", 3.5)'. I'll go try it, but I won't be sure of the answer for every implementation, version, and even runtime, or whether that answer will be consistent.

So I did try it. I did not necessarily expect these particular results. Moreover, I have a hunch that with PyPy JIT, something similar might actually give different answers at different points when the same line was encountered in a running interpreter. Not this example, but something else that might cache values only later.

I haven't done anything sneaky with the version at those paths. They are all what the environment name hints they should be. PyPy is at 3.6, which is the latest version on conda-forge.

    810-tmp % $HOME/miniconda3/envs/py2.7/bin/python -c 'print((1, "a", 3.5) is (1, "a", 3.5))'
    False
    811-tmp % $HOME/miniconda3/envs/py3.4/bin/python -c 'print((1, "a", 3.5) is (1, "a", 3.5))'
    False
    812-tmp % $HOME/miniconda3/envs/py3.8/bin/python -c 'print((1, "a", 3.5) is (1, "a", 3.5))'
    <string>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
    True
    813-tmp % $HOME/miniconda3/envs/pypy/bin/python -c 'print((1, "a", 3.5) is (1, "a", 3.5))'
    True
    814-tmp % $HOME/miniconda3/envs/py1/bin/python -c 'print (1, "a", 3.5) is (1, "a", 3.5)'
    0

On Sat, May 9, 2020 at 4:17 AM Alex Hall <alex.mojaki@gmail.com> wrote:
I think you're more seeing the module compilation optimizations here. Inside a single compilation unit (usually a module), constants will often be shared. So, for instance:
But if you do those lines individually at the REPL, you'll get False. Of course, a compliant Python interpreter is free to either collapse them or keep them separate, but this optimization helps to keep .pyc file sizes down, for instance. ChrisA
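The example that originally accompanied this message is not preserved in the archive. A rough way to observe the effect is to compile the same source once as a single unit and once as two separate units. Whether the tuples end up shared is an implementation detail, so the sketch below prints the results rather than promising them.

```python
# Two equal tuple constants inside one compilation unit.
source = 'a = (1, "a", 3.5)\nb = (1, "a", 3.5)\nshared = a is b\n'
namespace = {}
exec(compile(source, "<one unit>", "exec"), namespace)
shared_in_one_unit = namespace["shared"]

# The same constant compiled in two separate units.
first, second = {}, {}
exec(compile('t = (1, "a", 3.5)', "<unit 1>", "exec"), first)
exec(compile('t = (1, "a", 3.5)', "<unit 2>", "exec"), second)
shared_across_units = first["t"] is second["t"]

# Both results depend on the interpreter and version; equality never does.
print(shared_in_one_unit, shared_across_units, first["t"] == second["t"])
```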

On Fri, May 08, 2020 at 01:26:05PM -0400, David Mertz wrote:
The distinction you make seems both pedantic and factually wrong.
Which distinction are you referring to? The one between `is` and `==`? And in what way is it factually wrong?
More flat-footed still is "equal objects are ones whose .__eq__() method returns something truthy."
Nevertheless, flat-footed or not, that is broadly the only meaning of equality that has any meaning in Python. Two objects are equal if, and only if, the `==` operator returns true when comparing them. That's what equality means in Python! (There are a few nuances and complexities to that when it comes to containers, which may short-cut equality tests with identity tests for speed.)
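Both halves of this paragraph can be illustrated briefly. The class `AlwaysEqual` is a made-up example, not anything from the thread; the NaN lines show the container shortcut, where built-in containers treat identical objects as equal without calling `==`.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ returns an arbitrary truthy value."""
    def __eq__(self, other):
        return "yes"

raw_result = AlwaysEqual() == 42      # whatever __eq__ returned
in_list = [AlwaysEqual()] == [42]     # list comparison reduces it to a bool

nan = float("nan")
nan_equal = nan == nan                # NaN is not equal to itself
list_equal = [nan] == [nan]           # identity shortcut inside the list
contained = nan in [nan]              # same shortcut for membership tests

print(raw_result, in_list, nan_equal, list_equal, contained)
```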
It doesn't actually need to define any of the behaviors we think of as equality/equivalence.
Indeed. Which is why we cannot require any of those behaviours for the concept of equality in Python.
Both '==' and 'is' are ways of saying equivalent-for-a-purpose.
`==` is the way to say "equal", where equal means whatever the class wants it to mean. If you want to describe that as "equivalent-for-a-purpose", okay. But `is` compares exactly and only "object identity", just as the docs say, just as the implementation, um, implements.

That's not an equivalence, at least not in the plain English sense of the word, because an equivalence implies at least the possibility of *distinct* objects being equivalent:

    a is equivalent to b
    but a is not identical to b

Otherwise why use the term "equivalent" when you actually mean "is the same object"? By definition you cannot have:

    a is identical to b
    but a is not identical to b

so in this sense `is` is not a form of equivalence, it is just *is*.

The mathematical sense of an equivalence relation is different: object identity certainly is an equivalence relation.

[...]
Right. Remind me -- why are we talking about identity? Is it relevant to the proposal for a duck-typing container equals operator? [...]
So... only None, and True and False, and other singletons like NotImplemented, and custom mutable objects, and builtin mutable objects like list and dict and set, and typically for classes, functions and modules unless you're doing something weird. Okay.
For non-singleton immutables, identity is not really a meaningful thing.
It's of little practical use except to satisfy the caller's curiosity about implementation details. -- Steven

On Sat, 9 May 2020 03:01:15 +1000 Steven D'Aprano <steve@pearwood.info> wrote:
I believe that the "is" operator is a test for some kind of equality, and you apparently don't.
Section 6.10 is entitled Comparisons, and lists both "is" and "==" as comparison operators. I admit that my use of the word "strongest" (to describe the "is" operator) and my conceptual ordering of different kinds of equality fails in the light of NaNs. Curse you, IEEE Floating Point! :-) Then again, that same documentation states "User-defined classes that customize their comparison behavior should follow some consistency rules, if possible." One of the consistency rules is "Equality comparison should be reflexive. In other words, identical objects should compare equal," and that rule is summarized as "x is y implies x == y." So I'm not the only one who thinks of "is" as a kind of equality. :-)
The OP wants [1, 2, 3] == (1, 2, 3) to return True, even though the operands are clearly not equal.
My mistake. I should have said "... compares arbitrary sequences of varying types ..." and not just "sequences."
Then I'll write a function that iterates over both sequences and compares the pairs of elements. There's no need to coerce one or both complete sequences.
If you have ever written `a == list(b)` or similar, then You Already Needed It :-)
I don't recall having written that. I do end up writing 'a == set(b)' when a is a set and b is a list, rather than building b as a set in the first place, but sets aren't sequences.
I have been known to write: for x in a, b, c: (without the parentheses), usually in the REPL, but only because it's convenient and it works. In other programming languages that don't allow iteration over tuples, I use lists instead.

On Sat, May 9, 2020 at 4:43 AM Dan Sommers <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
The documentation doesn't say that "is" represents equality, but only that, in general, an object should be equal to itself. Identity is still a completely separate concept to equality. There's a concept of "container equality" that is expressed as "x is y or x == y", but that's still a form of equality check. "x is y" on its own is not an equality check. It's an identity check. Obviously it's a comparison, but so are many other things :) ChrisA
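A minimal sketch of the "container equality" rule described above ("x is y or x == y"), assuming CPython's built-in list behaviour. NaN is the usual example because it is identical to itself but not equal to itself:

```python
nan = float("nan")

# NaN is not equal to itself under ==, but it is identical to itself.
print(nan == nan)        # False
print(nan is nan)        # True

# List equality and membership try identity first, then fall back to ==.
print([nan] == [nan])    # True
print(nan in [nan])      # True

# A distinct NaN object is neither identical nor equal, so the lists differ.
other = float("nan")
print([nan] == [other])  # False
```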

On 08.05.20 19:01, Steven D'Aprano wrote:
Initially I assumed that the reason for this new functionality was concerned with cases where the types of two objects are not precisely known and hence instead of converting them to a common type such as list, a direct elementwise comparison is preferable (that's probably uncommon though). Instead in the case where two objects are known to have different types but nevertheless need to be compared element-by-element, the performance argument makes sense of course. So as a practical step forward, what about providing a wrapper type which performs all operations elementwise on the operands. So for example:

    if all(elementwise(chars) == string):
        ...

Here the `elementwise(chars) == string` part returns a generator which performs the `==` comparison element-by-element. This doesn't perform any length checks yet, so as a bonus one could add an `all` property:

    if elementwise(chars).all == string:
        ...

This first checks the lengths of the operands and only then compares for equality. This wrapper type has the advantage that it can also be used with any other operator, not just equality. Here's a rough implementation of such a type:

    import functools
    import itertools
    import operator


    class elementwise:
        def __init__(self, obj, *, zip_func=zip):
            self.lhs = obj
            self.zip_func = zip_func

        def __eq__(self, other):
            return self.apply_op(other, op=operator.eq)

        def __lt__(self, other):
            return self.apply_op(other, op=operator.lt)

        ...  # define other operators here

        def apply_op(self, other, *, op):
            return self.make_generator(other, op=op)

        def make_generator(self, other, *, op):
            return itertools.starmap(op, self.zip_func(self.lhs, other))

        @property
        def all(self):
            zip_func = functools.partial(itertools.zip_longest, fillvalue=object())
            return elementwise_all(self.lhs, zip_func=zip_func)


    class elementwise_all(elementwise):
        def apply_op(self, other, *, op):
            try:
                length_check = len(self.lhs) == len(other)
            except TypeError:
                length_check = True
            return length_check and all(self.make_generator(other, op=op))

On Sat, May 9, 2020 at 11:57 AM Dominik Vilsmeier <dominik.vilsmeier@gmx.de> wrote:
Now `==` has returned an object that's always truthy, which is pretty dangerous.
This is now basically numpy.

```
In[14]: eq = numpy.array([1, 2, 3]) == [1, 2, 4]

In[15]: eq
Out[15]: array([ True, True, False])

In[16]: eq.all()
Out[16]: False

In[17]: eq.any()
Out[17]: True

In[18]: bool(eq)
Traceback (most recent call last):
...
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
```

I've used numbers instead of strings because numpy treats strings as units instead of iterables for this kind of purpose, so you'd have to do some extra wrapping in lists to explicitly ask for character comparisons.

On 09.05.20 12:18, Alex Hall wrote:
That can be resolved by returning a custom generator type which implements `def __bool__(self): raise TypeError('missing r.h.s. operand')`.
Actually I took some inspiration from Numpy but the advantage is of course not having to install Numpy. The thus provided functionality is only a very small subset of what Numpy provides.

On 09.05.20 14:16, Dominik Vilsmeier wrote:
After reading this again, I realized the error message is nonsensical in this context. It should be rather something like: `TypeError('The truth value of an elementwise comparison is ambiguous')` (again taking some inspiration from Numpy).
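As a rough illustration of that idea, here is a sketch of an iterator wrapper whose truth value is an error. The names `ElementwiseResult` and `elementwise_eq` are made up for this example and are not part of the implementation posted earlier:

```python
import itertools
import operator


class ElementwiseResult:
    """Hypothetical iterator wrapper: iterating is fine, truth-testing is an error."""

    def __init__(self, iterator):
        self._iterator = iterator

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._iterator)

    def __bool__(self):
        raise TypeError("The truth value of an elementwise comparison is ambiguous")


def elementwise_eq(lhs, rhs):
    # Compare pairwise with ==; no length check is done here.
    return ElementwiseResult(itertools.starmap(operator.eq, zip(lhs, rhs)))


print(list(elementwise_eq("abc", ["a", "b", "x"])))  # [True, True, False]
print(all(elementwise_eq("abc", ["a", "b", "c"])))   # True
```

With this, `if elementwise_eq(a, b): ...` raises TypeError instead of silently taking the branch, while `all(...)` and `any(...)` still work because they only iterate.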

On May 9, 2020, at 02:58, Dominik Vilsmeier <dominik.vilsmeier@gmx.de> wrote:
There’s an obvious use for the .all, but do you ever have a use for the elementwise itself? When do you need to iterate all the individual comparisons? (In numpy, an array of bools has all kinds of uses, starting with indexing or selecting with it, but I don’t think any of them are doable here.) And obviously this would be a lot simpler if it was just the all object rather than the elementwise object—and even a little simpler to use: element_compare(chars) == string (In fact, I think someone submitted effectively that under a different name for more-itertools and it was rejected because it seemed really useful but more-itertools didn’t seem like the right place for it. I have a similar “lexicompare” in my toolbox, but it has extra options that YAGNI. Anyway, even if I’m remembering right, you probably don’t need to dig up the more-itertools PR because it’s easy enough to redo from scratch.)

On 09.05.20 22:16, Andrew Barnert wrote: there's probably not much use for the elementwise iterator itself. So one could use `elementwise` as a namespace for `elementwise.all(chars) == string` and `elementwise.any(chars) == string` which automatically reduce the elementwise comparisons and the former also performs a length check prior to that. This would still leave the option of having `elementwise(x) == y` return an iterator without reducing (if desired).

On May 9, 2020, at 13:24, Dominik Vilsmeier <dominik.vilsmeier@gmx.de> wrote:
But do you have any use for the .any? Again, it’s useful in NumPy, but would any of those uses translate? If you’re never going to use elementwise.any, and you’re never going to use elementwise itself, having elementwise.all rather than just making that the callable is just making the useful bit a little harder to access. And it’s definitely complicating the implementation, too. If you have a use for the other features, that may easily be worth it, but if you don’t, why bother? I took my lexicompare, stripped out the dependency on other helpers in my toolbox (which meant rewriting < in a way that might be a little slower; I haven’t tested) and the YAGNI stuff (like trying to be “view-ready” even though I never finished my views library), and posted it at https://github.com/abarnert/lexicompare (no promises that it’s stdlib-ready as-is, of course, but I think it’s at least a useful comparison point here). It’s pretty hard to beat this for simplicity:

    from functools import total_ordering
    from itertools import zip_longest

    @total_ordering
    class _Smallest:
        # Sentinel that compares less than anything else.
        def __lt__(self, other):
            return True

    @total_ordering
    class lexicompare:
        def __new__(cls, it):
            self = super(lexicompare, cls).__new__(cls)
            self.it = it
            return self
        def __eq__(self, other):
            return all(x==y for x,y in zip_longest(self.it, other, fillvalue=object()))
        def __lt__(self, other):
            for x, y in zip_longest(self.it, other, fillvalue=_Smallest()):
                if x < y:
                    return True
                elif y < x:
                    return False
            return False

On Thu, May 07, 2020 at 10:44:01PM +1200, Greg Ewing wrote:
Yes, but the *human readers* won't. You know that people will write things like: spam.EQ.ham and then nobody will know whether that means "call the .EQ. operator on operands spam and ham" or "lookup the ham attribute on the EQ attribute of spam" without looking up the parsing rules. Let's not turn into Ruby: https://lucumr.pocoo.org/2008/7/1/whitespace-sensitivity/ -- Steven

Why use "." which has clear syntax problems? This can already be done in current Python (this was linked to in a previous thread about something else) using a generic solution if you change the syntax: https://pypi.org/project/infix/ You could write it as |EQ|, ^EQ^, ... and have it in its own Pypi package. Not sure what IDEs think of this package, they probably hate it... On Thu, 7 May 2020 at 10:18, Steven D'Aprano <steve@pearwood.info> wrote:

On 07.05.20 11:11, Steven D'Aprano wrote:
But why do we even need a new operator when this simple function does the job (at least for sized iterables)? How common is it to compare two objects where you cannot determine whether one or the other is a tuple or a list already from the surrounding context? In the end these objects must come from somewhere and usually functions declare either list or tuple as their return type. Since for custom types you can already define `__eq__` this really comes down to the builtin types, among which the theoretical equality between tuple and list has been debated in much detail but is it used in practice?

On Thu, May 07, 2020 at 03:43:23PM +0200, Dominik Vilsmeier wrote:
Maybe it doesn't need to be an operator, but operators do have a big advantage over functions: http://neopythonic.blogspot.com/2019/03/why-operators-are-useful.html On the other hand we only have a limited number of short symbols available in ASCII, and using words as operators reduces that benefit.
Never, because we can always determine whether something is a list or tuple by inspecting it with type() or isinstance(). But that's missing the point! I don't care and don't want to know if it is a tuple or list, I only care if it quacks like a sequence of some kind. The use-case for this is for when you want to compare elements without regard to the type of the container they are in. This is a duck-typing sequence element-by-element equality test. If you have ever written something like any of these:

    list(a) == list(b)
    tuple(a) == b
    ''.join(chars) == mystring
    all(x==y for x,y in zip(a, b))

then this proposed operator might be just what you need. -- Steven
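For comparison, the same duck-typed test can be written today as a plain function. This is only a sketch: the name `same_items` is hypothetical, and it is a flat (non-recursive) comparison that returns False on a length mismatch instead of raising:

```python
from array import array
from itertools import zip_longest

_MISSING = object()  # private sentinel marking the end of the shorter iterable


def same_items(a, b):
    """Hypothetical helper: True if a and b yield equal items in the same order."""
    return all(
        x is not _MISSING and y is not _MISSING and x == y
        for x, y in zip_longest(a, b, fillvalue=_MISSING)
    )


print(same_items([1, 2, 3], (1, 2, 3)))              # True
print(same_items(array("i", [1, 2, 3]), [1, 2, 3]))  # True
print(same_items(["a", "b", "c"], "abc"))            # True
print(same_items([1, 2], [1, 2, 3]))                 # False
```

Unlike `list(a) == list(b)`, this does not build copies of either operand and stops at the first mismatch.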

On Fri, 8 May 2020 23:10:05 +1000 Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, May 07, 2020 at 03:43:23PM +0200, Dominik Vilsmeier wrote:
To rephrase Dominik's question slightly, how often do you have a block of code with two sequences of unknown origin? Sure, I can *hypothesize* f(x, y) where x and y don't have to be anything more specific than sequences. But unless I'm actually writing .EQ., there's some code inside f that builds x or y, or calls some other function to obtain x or y, and then I know at least one of the types. You often ask for real world code that would be simpler or easier to read or maintain if such-and-such feature existed. The OP never posted any such thing; do you have any specific code in mind?
Ever? Maybe. Do I need help from a new operator or the standard library when it comes up? No, not really. And except for that join/mystring example, I find all of these examples incredibly obvious and simple to read (although looking up a new function or operator isn't onerous, and I often learn things when that happens). -- “Atoms are not things.” – Werner Heisenberg Dan Sommers, http://www.tombstonezero.net/dan

On Fri, May 8, 2020 at 4:46 PM Henk-Jaap Wagenaar < wagenaarhenkjaap@gmail.com> wrote:
Steven mentioned that originally: We could define this .EQ. operate as *sequence equality*, defined very
But since you probably want these expressions to evaluate to false rather than raise an exception when the lengths are different, a strict zip is not appropriate.

Here's an example you might want to consider:

    >>> from collections import namedtuple
    >>> Point = namedtuple('Point', ['x', 'y'])
    >>> Point(1, 2)
    Point(x=1, y=2)
    >>> Point(1, 2) == (1, 2)
    True
    >>> Polar = namedtuple('Polar', ['r', 'theta'])
    >>> Polar(1, 2)
    Polar(r=1, theta=2)
    >>> Polar(1, 2) == (1, 2)
    True
    >>> Point(1, 2) == Polar(1, 2)
    True
    >>> hash(Point(1, 2)) == hash(Polar(1, 2)) == hash((1, 2))
    True

-- Jonathan

FYI, it does show in my version on gmail and on the mailman version. <https://mail.python.org/archives/list/python-ideas@python.org/message/WJKNLR...> BTW, I think strings do showcase some problems with this idea, .EQ. (as defined by Steven) is not recursive, which I think will be unworkable/unhelpful: ((0, 1), (1, 2)) and ([0, 1], [1, 2]) are not equal under the new operator (or new behaviour of == depending as per the OP) which I think goes completely against the idea in my book. If it were (replace x==y with x == y || x .EQ. y with appropriate error handling), strings would not work as expected (I would say), e.g.: [["f"], "o", "o"] .EQ. "foo" because an element of a string is also a string. Worse though, I guess any equal-length strings that are not equal: "foo" .EQ. "bar" would crash as it would keep recursing (i.e. string would have to be special cased). What I do sometimes use/want (more often for casual coding/debugging, not real coding) is something that compares two objects created from JSON/can be made into JSON whether they are the same, sometimes wanting to ignore certain fields or tell you what the difference is. I do not think that could ever be an operator, but having a function that can help these kind of recursive comparisons would be great (I guess pytest uses/has such a function because it pretty nicely displays differences in sets, dictionaries and lists which are compared to each other in asserts). On Fri, 8 May 2020 at 16:23, Alex Hall <alex.mojaki@gmail.com> wrote:

On Fri, May 8, 2020 at 5:51 PM Henk-Jaap Wagenaar < wagenaarhenkjaap@gmail.com> wrote:
FYI, it does show in my version on gmail and on the mailman version. <https://mail.python.org/archives/list/python-ideas@python.org/message/WJKNLR...>
Weird, did Ethan's client cut it out?
If we redefined == so that `(0, 1) == [0, 1]`, then it would follow that `((0, 1), (1, 2)) == ([0, 1], [1, 2])`. Similarly if `(0, 1) .EQ. [0, 1]`, then it would follow that `((0, 1), (1, 2)) .EQ. ([0, 1], [1, 2])`.
Yes, strings would have to be special cased. In my opinion this is another sign that strings shouldn't be iterable, see the recent heated discussion at https://mail.python.org/archives/list/python-ideas@python.org/thread/WKEFHT4...
Something like https://github.com/fzumstein/jsondiff or https://pypi.org/project/json-diff/?

On 05/08/2020 09:36 AM, Alex Hall wrote:
On Fri, May 8, 2020 at 5:51 PM Henk-Jaap Wagenaar wrote:
FYI, it does show in my version on gmail and on the mailman version. <https://mail.python.org/archives/list/python-ideas@python.org/message/WJKNLR...>
Weird, did Ethan's client cut it out?
Ah, no. I thought you were replying to the code quote above the .EQ. one. The .EQ. quote was not white-space separated from the text around it and I missed it. -- ~Ethan~

All the discussion following Steven's hypothetical .EQ. operator (yes, not a possible spelling) just seems to drive home to me that what everyone wants is simply a function. Many different notions of "equivalence for a particular purpose" have been mentioned. We're not going to get a dozen different equality operators (even Lisp or Javascript don't go that far). But function names are plentiful. So just write your own:

    has_same_elements(a, b)
    case_insensitive_eq(a, b)
    same_json_representation(a, b)
    allclose(a, b)  # A version of this is in NumPy
    recursively_equivalent(a, b)
    nan_ignoring_equality(a, b)

And whatever others you like. All of these seem straightforwardly relevant to their particular use case (as do many others not listed). But none of them have a special enough status to co-opt the '==' operator or deserve their own special operator. -- The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.
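To make "just write your own" concrete, here is one possible sketch of two of the functions named above. The bodies are assumptions for illustration only: the names come from the list, but nothing in the thread specifies how they should be implemented.

```python
import json


def case_insensitive_eq(a: str, b: str) -> bool:
    # str.casefold() is the stdlib method intended for caseless matching.
    return a.casefold() == b.casefold()


def same_json_representation(a, b) -> bool:
    # Assumption: "same JSON" means identical serialized text with sorted keys.
    return json.dumps(a, sort_keys=True) == json.dumps(b, sort_keys=True)


print(case_insensitive_eq("David", "daviD"))     # True
print(case_insensitive_eq("Straße", "STRASSE"))  # True
print("Straße".lower() == "STRASSE".lower())     # False

# Tuples and lists both serialize as JSON arrays.
print(same_json_representation({"a": (1, 2)}, {"a": [1, 2]}))  # True
```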

On Fri, May 08, 2020 at 01:00:48PM -0400, David Mertz wrote:
All of which are red herrings that are completely off-topic for this proposal. This proposal has nothing to do with:
and the only reason I deleted the "recursively equivalent" one is because I don't know what it's supposed to mean. This proposal is a narrow one: it's the same as list or tuple equality, but duck-typed so that the container type doesn't matter. Do lists and tuples do case-insensitive comparisons? No. Then neither does this proposal. Do lists and tuples do JSON-repr comparisons? No. Then neither does this. Do lists and tuples do numeric "within some epsilon" isclose comparisons (e.g. APL fuzzy equality)? Or ignore NANs? No to both of those. Then neither does this proposal. -- Steven

On Fri, May 08, 2020 at 07:52:10PM +0200, Alex Hall wrote:
Would the proposal come with a new magic dunder method which can be overridden, or would it be like `is`?
An excellent question! I don't think there needs to be a dunder. Calling this "sequence-equal": Two sequences are "sequence-equal" if:

- they have the same length;
- for each pair of corresponding elements, the two elements are either equal, or sequence-equal.

The implementation may need to check for cycles (as ordinary equality does). It may also shortcut some equality tests by doing identity tests, as ordinary container equality does. -- Steven
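A rough sketch of that definition as a plain function, for discussion only. The name `sequence_equal` and the use of `collections.abc.Sequence` as the "is it a sequence?" test are assumptions; it does no cycle detection, and it treats two unequal strings as unequal atoms so that the recursion terminates:

```python
from collections.abc import Sequence


def sequence_equal(a, b):
    """Hypothetical duck-typed, recursive sequence equality."""
    if not (isinstance(a, Sequence) and isinstance(b, Sequence)):
        return False
    if len(a) != len(b):
        return False
    return all(_elements_match(x, y) for x, y in zip(a, b))


def _elements_match(x, y):
    # Identity shortcut, then ordinary equality, as container equality does.
    if x is y or x == y:
        return True
    # Two unequal strings stay unequal: recursing would never terminate,
    # because a one-character string contains itself.
    if isinstance(x, str) and isinstance(y, str):
        return False
    return sequence_equal(x, y)


print(sequence_equal([1, 2, 3], (1, 2, 3)))                 # True
print(sequence_equal(((0, 1), (1, 2)), ([0, 1], [1, 2])))   # True
print(sequence_equal(((0, 1), (1, 2)), ((0,), (1, 2, 3))))  # False
print(sequence_equal("foo", "bar"))                         # False
```

Types that are not registered with the Sequence ABC would need a different check, which is one of the design questions such a function would have to settle.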

On Fri, May 8, 2020 at 8:38 PM Steven D'Aprano <steve@pearwood.info> wrote:
The problem with this to me (and I think it's part of what David and others are saying) is that you're proposing additional syntax (for which there's usually a high bar) for the marginal benefit of improving a very specific use case. For comparison, the recent `@` operator is also intended for a very specific use case (matrix multiplication) but it can at least be reused for other purposes by overriding its dunder method. On top of that, we can see very clearly how the arguments in Guido's essay on operators applied to this case, with clear examples in https://www.python.org/dev/peps/pep-0465/#why-should-matrix-multiplication-b.... That doesn't apply so well to .EQ. as using `==` twice in a single expression isn't that common, and any specific flavour like .EQ. is even less common. `list(a) == list(b)` or `sequence_equal(a, b)` is suboptimal for visual mental processing, but it's still fine in most cases. I would be more supportive of some kind of 'roughly equals' proposal (maybe spelt `~=`) which could be overridden and did sequence equality, case insensitive string comparison, maybe approximate float comparison, etc. But even that has marginal benefit and I agree with the objections against it, particularly having 3 operators with similar equalish meanings. Perhaps a better alternative would be the ability to temporarily patch `==` with different meanings. For example, it could be nice to write in a test: with sequence_equals(): assert f(x, y) == f(y, x) == expected instead of: assert list(f(x, y)) == list(f(y, x)) == list(expected) or similarly with equals_ignoring_order(), equals_ignoring_case(), equals_ignoring_duplicates(), equals_to_decimal_places(2), equals_to_significant_figures(3), etc. This could be especially nice if it replaced implicit uses of `==` deeper in code. 
For example, we were recently discussing this function:

```
def zip_equal(*iterables):
    sentinel = object()
    for combo in zip_longest(*iterables, fillvalue=sentinel):
        if sentinel in combo:
            raise ValueError('Iterables have different lengths')
        yield combo
```

`sentinel in combo` is worrying because it uses `==`. For maximum safety we'd like to use `is`, but that's more verbose. What if we could write:

```
def zip_equal(*iterables):
    sentinel = object()
    with is_as_equals():
        for combo in zip_longest(*iterables, fillvalue=sentinel):
            if sentinel in combo:
                raise ValueError('Iterables have different lengths')
            yield combo
```

and under the hood when `in` tries to use `==` that gets converted into `is` to make it safe? That's probably not the most compelling example, but I'm sure you can imagine ways in which `==` is used implicitly that could be useful to override. I'm not married to this idea, it's mostly just fun brainstorming.
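For reference, the "more verbose" identity-based version can already be written without any new syntax. This is a sketch; the `AlwaysEqual` class is a made-up example of the kind of element that would make `sentinel in combo` misfire:

```python
from itertools import zip_longest


def zip_equal(*iterables):
    sentinel = object()
    for combo in zip_longest(*iterables, fillvalue=sentinel):
        # Identity test only: an element's __eq__ is never consulted.
        if any(item is sentinel for item in combo):
            raise ValueError("Iterables have different lengths")
        yield combo


class AlwaysEqual:
    """Hypothetical element whose == claims equality with everything."""

    def __eq__(self, other):
        return True


# With `sentinel in combo` this call would raise spuriously;
# with the identity test it yields one pair as expected.
print(len(list(zip_equal([AlwaysEqual()], [1]))))  # 1
```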

On Sat, 9 May 2020 03:39:53 +1000 Steven D'Aprano <steve@pearwood.info> wrote:
This proposal is a narrow one: it's the same as list or tuple equality, but duck-typed so that the container type doesn't matter.
Okay. Good. "Container," however, is a dangerous word in this context. According to https://docs.python.org/3/library/stdtypes.html, lots of things are "containers." Can they all be sequence-equal to each other? Of particular note might be sets, which don't have an inherent order. I am in no way proposing that sequence-equal be extended to cover sets, which by definition can't really be a sequence.

On Fri, May 08, 2020 at 03:12:10PM -0400, Dan Sommers wrote:
All(?) sequences are containers, but not all containers are sequences, so no.
This is a very good question, thank you. I think that this ought to exclude mappings and sets, at least initially. Better to err on the side of caution than to overshoot by adding too much and then being stuck with it. The primary use-case here is for sequences. Comparisons between sets and sequences are certainly possible, but one has to decide on a case-by-case basis what you mean. For example, are these equal? {1, 2} and (1, 1, 2) I don't know and I don't want to guess, so leave it out. -- Steven

On Fri, May 8, 2020 at 1:47 PM Steven D'Aprano <steve@pearwood.info> wrote:
I think you are trying very hard to miss the point. Yes... all of those functions that express a kind of equivalence are different from the OP proposal. But ALL OF THEM have just as much claim to being called equivalence as the proposal does. If we could only extend the '==' operator to include one other comparison, I would not choose the OP's suggestion over those others. Similarly, if '===' or '.EQ.' could only have one meaning, the OP proposal would not be what I would most want. Which is NOT, of course, to say that I think `containers_with_same_contents()` isn't a reasonable function. But it's just that, a function.

On Fri, May 08, 2020 at 03:16:48PM -0400, David Mertz wrote:
So what? Why is this relevant? This is not a proposal for a generalised equivalence relation. If you want one of those feel free to propose a competing idea. (To be pedantic: at least the allclose() one is not an equivalence relation, as it would be possible to have isclose(a, b) and isclose(b, c) but not isclose(a, c). But that's a by-the-by.) Duck-typed sequence-equality requires no specialised equivalence relation. It's exactly the same as existing notions of container equality, except without the type-check on the container. It is a generic operation, not a specialised one like checking for fuzzy numeric close-enoughness, or JSON representations. If you want case insensitive string equality, propose a new string method.
Great! Start your own proposal in a new thread then and stop hijacking this one. -- Steven
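The non-transitivity of closeness tests mentioned above is easy to check with `math.isclose`. The values and the absolute tolerance here are arbitrary choices for illustration:

```python
import math

a, b, c = 1.0, 1.1, 1.2
tol = dict(rel_tol=0.0, abs_tol=0.15)  # purely absolute tolerance

print(math.isclose(a, b, **tol))  # True:  |a - b| is about 0.1
print(math.isclose(b, c, **tol))  # True:  |b - c| is about 0.1
print(math.isclose(a, c, **tol))  # False: |a - c| is about 0.2
```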

On Fri, May 8, 2020, 6:39 PM Steven D'Aprano
The OP, with a certain degree of support from you, is asking for changing the meaning of an operator to enshrine one particular equivalence relation as syntactically blessed by the language. I note that that equivalence relation is not more important than numerous other equivalence relations, and hence should simply be a function returning a Boolean answer. I'm certain you understand this, I'm not sure why the facade otherwise. If you think that yes, that has_same_items() really is that much more important, present the case for that rather than irrelevant pedantics. (To be pedantic: at least the allclose() one is not an equivalence
Yes. And moreover, we can have: numpy.isclose(a, b) != numpy.isclose(b, a) The math module had a different approach that guarantees symmetry. Neither is a bug, they are just different. But '==' does not guarantee either symmetry or transitivity either. Not even among objects that intend to mean it in more-or-less the ordinary sense. Practicality beats purity. If .isclose() calls things equivalent, then for most purposes the calculation will be fine if you substitute. A strict mathematical equivalence relation is more... Well, strict. But in terms of what programmers usually care about, this is fluff. Fwiw, my proposal is "just write a simple function." I've made that proposal several times in this thread... But I don't think it's exactly PEP in nature.

On Fri, May 08, 2020 at 07:24:43PM -0400, David Mertz wrote:
https://mail.python.org/archives/list/python-ideas@python.org/message/IRIOEX... Ahmed is no longer asking for any change to the `==` operator. That's multiple dozens of emails out of date.
I note that that equivalence relation is not more important than numerous other equivalence relations
"More important" according to whose needs? I would agree with you that a string method to do case insensitive comparisons would be very useful. I would certainly use that instead of a.casefold() == b.casefold() especially if there was an opportunity to make it more efficient and avoid copying of two potentially very large strings. But why is that relevant? There is no conflict or competition between a new string method and new operator. We could have both! "Case insensitive string comparisons would be useful" is an argument for case insensitive string comparisons, it's not an argument against an unrelated proposal.
and hence should simply be a function returning a Boolean answer.
Sure, we can always write a function. But for something as fundamental as a type of equality, there is much to be said for an operator. That's why we have operators in the first place, including `==` itself, rather than using the functions from the operator module.
I'm certain you understand this, I'm not sure why the facade otherwise.
Façade, "a showy misrepresentation intended to conceal something unpleasant" (WordNet). Synonyms include deception, fakery, false front, fraud, imposture, insincerity, simulacrum, subterfuge, and trick. I'm sorry to hear that you are *certain* of my state of mind, and even sorrier that you believe I am lying, but I assure you, I truly do believe that these other equivalence relations are not relevant. And here is why:

(1) They require a specialised equivalence relation apart from `==`. Such as math.isclose(), a case insensitive comparison, a JSON comparison.

(2) As such they ought to go into their specialist namespaces:

- case-insensitive string comparisons should be a string method, or at worst, a function in the string module;
- a JSON-comparison probably should go into the json module;
- fuzzy numeric equality should probably go into the math module (and that's precisely where isclose() currently exists).

And hence they are not in competition with this proposal.

(3) Whereas the proposed duck-typing sequence equality relies on the ordinary meaning of equality, applied element by element, ignoring the type of the containers. We can think of this as precisely the same as list equality, or tuple equality, minus the initial typecheck that both operands are lists. If you can understand list equality, you can understand this. You don't have to ask "what counts as close enough? what's JSON?" etc. It's just the regular sequence equality but with ducktyping on containers. It's completely general in a way that the other equivalences aren't.

If you think that these other proposals are worth having, and are more useful, then *make the proposal* and see if you get interest from other people. You said that you would prefer to have a JSON-comparing comparison operator. If you use a lot of JSON, I guess that might be useful. Okay, make the case for that to be an operator! I'm listening. I might be convinced.
You might get that operator in 3.10, and Python will be a better language. Just start a new, competing, proposal for it. But if you're not prepared to make that case, then don't use the existence of something you have no intention of ever asking for as a reason to deny something which others do want. "Python doesn't have this hammer, therefore you shouldn't get this screwdriver" is a non sequitur and a lousy argument.
That's what I'm trying to do.
Is this intended as an argument for or against this proposal, or is it another "irrelevant pedantics" you just accused me of making? In any case, it is an exaggerated position to take. Among ints, or strings, or floats excluding NANs, `==` holds with all the usual properties we expect:

* x == x for all ints, strings and floats excluding NANs;
* if, and only if, x == y, then y == x;
* and if x == y and y == z, then x == z.

It's only Python equality in the *general* sense, where the operands could be any arbitrary object, that those properties do not necessarily hold. Since this proposal is for a simple duck-typed sequence version of ordinary Python equality, the same generalisation will apply:

* If the sequences hold arbitrary objects, we cannot necessarily make any claims about the properties of sequence-equality;
* But if you can guarantee that all of the objects are such that the usual properties apply to the `==` operator, then you can say the same about sequence-equality.

In this regard, it is exactly the same as list or tuple equality, except it duck-types the container types. -- Steven

On Fri, May 8, 2020 at 11:39 PM Steven D'Aprano <steve@pearwood.info> wrote:
"More important" according to whose needs?
I dunno. To mine? To "beginner programmers"? To numeric computation? I can weaken my 'note' to 'purport' if that helps. (3) Whereas the proposed duck-typing sequence equality relies on
the ordinary meaning of equality, applied element by element, ignoring the type of the containers.
I think this one is our main disagreement. I think a meaning for "equality" in which a tuple is equal (equivalent) to a list with the same items inside it is strikingly different from the ordinary meaning of equality. I don't deny that it is sometimes a useful question to ask, but it is a new and different question than the one answered by '==' currently. In my mind, this new kind of equality is MORE DIFFERENT from the current meaning than would be case-folded equivalence of strings, for example.
Actually, this could perfectly well live on the types rather than in the modules. I mean, I could do it today by defining .__eq__() on some subclasses of strings, floats, dicts, etc. if I wanted to. But hypothetically (I'm not proposing this), we could also define new operators .__eq2__(), .__eq3__(), etc. that would be called when Python programmers used the operators `===`, `====`, etc. With these new operators in hand, we might give meanings to these new kinds of equivalence:

    (1, 2, 3) === [1, 2, 3]      # has_same_items()
    "David" === "daviD"          # a.upper() == b.upper()
    "David" ==== "dabit"         # soundex(a, b)
    3.14159265 === 3.14159266    # math.isclose(a, b)

It's completely general in a way that the other equivalences aren't.
Umm... no, it's really not. It's a special kind of equivalence that I guess applies to the Sequence ABC. Or maybe the Collection ABC? But to be really useful, it probably needs to work with things that don't register those ABCs themselves. I would surely also expect `(1, 2, 3) === np.array([1, 2, 3])` to hold, if this were a thing. But what about dicts, which are now ordered, and hence sequence-like? Or dict.keys() if not the dict itself? I'm sure reasonable answers could be decided for questions like that, but this is FAR from "completely general" or a transparent extension of current equality.

-- The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.
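For illustration only, the equivalences listed above can already be spelled as ordinary functions today. The names `has_same_items` and `same_ignoring_case` are hypothetical stand-ins for the proposed operators, and the soundex case is left out because the standard library has no soundex function.

```python
import math


def has_same_items(a, b):
    """Element-wise comparison that ignores the container type (flat sequences only)."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


def same_ignoring_case(a, b):
    """Case-insensitive string equivalence, spelled as in the message above."""
    return a.upper() == b.upper()


print((1, 2, 3) == [1, 2, 3])                  # False: ordinary equality
print(has_same_items((1, 2, 3), [1, 2, 3]))    # True
print(same_ignoring_case("David", "daviD"))    # True

# math.isclose uses rel_tol=1e-09 by default, which is too tight for this pair.
print(math.isclose(3.14159265, 3.14159266))                  # False
print(math.isclose(3.14159265, 3.14159266, rel_tol=1e-8))    # True
```

Each of these answers a different question from `==`, and each needs its own parameters or conventions (a tolerance, a case rule), which is part of why they live comfortably as named functions.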

On Fri, May 08, 2020 at 04:51:04PM +0100, Henk-Jaap Wagenaar wrote:
The sample implementation I gave was explicitly described as "very roughly". In no way was it intended to be the reference implementation. It was intended to begin the process of deciding on the semantics, not end it.
(1) Ahmed has already accepted that changing `==` is not suitable, so please let us all stop beating that dead horse! `==` is not going to change. (2) The intention is for ((0, 1), [1, 2], 'ab') .EQ. [[0, 1], (1, 2), ['a', 'b']] to return True, as well as similar examples. On the other hand, this is not just a "flattening equality" operator, this would return False: ((0, 1), (1, 2)) .EQ. ((0,), (1, 2, 3)) since (0, 1) and (0,) have different lengths.
Why would that not work? * ["f"] .EQ. "f" is true since they both have length 1 and their zeroth elements are equal; * "o" .EQ. "o" is true; * "o" .EQ. "o" is still true the second time :-) * so the whole thing is true.
Yes. Is that a problem? As I already pointed out, it will also need to handle cycles. For example:

    a = [1, 2]
    a.append(a)
    b = (1, 2, [1, 2, a])

and I would expect that a .EQ. b should be True:

    len(a) == len(b)
    a[0] == b[0]  # 1 == 1
    a[1] == b[1]  # 2 == 2
    a[2] == b[2]  # a == a

so that's perfectly well-defined.
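The semantics described in this message can be sketched roughly as follows. This is not the sample implementation referred to above; `sequence_eq` is a hypothetical name standing in for `.EQ.`, and treating an in-progress pair as equal is one assumed way of handling cycles.

```python
from collections.abc import Sequence


def sequence_eq(a, b, _active=None):
    """Duck-typed sequence equality: compare items, ignore container types."""
    if a is b:
        return True
    if not (isinstance(a, Sequence) and isinstance(b, Sequence)):
        return a == b
    # Two strings use ordinary equality; otherwise "o" vs "o" would recurse
    # forever, because each character of a string is itself a string.
    if isinstance(a, str) and isinstance(b, str):
        return a == b
    if len(a) != len(b):
        return False
    if _active is None:
        _active = set()
    pair = (id(a), id(b))
    if pair in _active:
        # Already comparing this pair further up the call stack (a cycle).
        return True
    _active.add(pair)
    try:
        return all(sequence_eq(x, y, _active) for x, y in zip(a, b))
    finally:
        _active.discard(pair)


print(sequence_eq(((0, 1), [1, 2], "ab"), [[0, 1], (1, 2), ["a", "b"]]))  # True
print(sequence_eq(((0, 1), (1, 2)), ((0,), (1, 2, 3))))                   # False
print(sequence_eq(["f"], "f"))                                            # True

a = [1, 2]
a.append(a)
b = (1, 2, [1, 2, a])
print(sequence_eq(a, b))                                                  # True
```

The sketch relies on `collections.abc.Sequence`, so it would not cover objects that behave like sequences without registering with that ABC, which is one of the open questions raised elsewhere in the thread.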
Feel free to propose that as a separate issue. -- Steven

On 07/05/2020 10:11, Steven D'Aprano wrote:
The biggest argument against a second "equals" operator, however it is spelt, is confusion. Which of these two operators do I want to use for this subtly different question of equality? Even where we have quite distinct concepts like "==" and "is", people still get muddled. If we have "==" and "=OMG=" or whatever, that would just be an accident waiting to happen. Cheers, Rhodri -- Rhodri James *-* Kynesim Ltd

On Thu, May 07, 2020 at 04:42:22PM +0100, Rhodri James wrote:
On 07/05/2020 10:11, Steven D'Aprano wrote:
I don't think so. The confusion with `is` is particularly acute for at least two reasons: - in regular English it can be a synonym for equals, as in "one and one is two, two and two is four"; - it seems to work sometimes: `1 + 1 is 2` will probably succeed. If the operator was named differently, we probably wouldn't have many people writing `1 + 1 idem 2` or `1 + 1 dasselbe 2` when they wanted equality. I doubt many people would be confused whether they wanted, let's say, the `==` operator or the `same_items` operator, especially if `1 + 1 same_items 2` raised a TypeError. -- Steven
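A minimal illustration of the `==` versus `is` distinction discussed here. It uses lists rather than `1 + 1 is 2`, because whether small integers share identity is an implementation detail, and recent Python versions emit a SyntaxWarning for `is` with a literal.

```python
a = [1, 2, 3]
b = [1, 2, 3]
c = a

print(a == b)   # True: same contents
print(a is b)   # False: two distinct list objects
print(a is c)   # True: two names for one object
```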

Put this comparison in a function! The current behavior is what I wish '==' to do, and what millions of lines of Python code assume. A tuple is not a list is not an array. I don't want an equality comparison to lie to me. You can write a few lines to implement 'has_same_items(a, b)' that will behave the way you want. On Sat, May 2, 2020, 2:36 PM Ahmed Amr <ahmedamron@gmail.com> wrote:

On Sat, May 2, 2020 at 8:38 PM Ahmed Amr <ahmedamron@gmail.com> wrote:
I'm sure there are times when I would also like this, and others too. But it would be a disastrous break in backwards compatibility, which is why it has 0% chance of happening.
Also those solutions are direct with tuples and lists, but it wouldn't be as direct with arrays-lists/tuples comparisons for example.
It should be. If x and y are two sequences with the same length and the same values at the same indexes, then list(x) == list(y) follows very quickly.
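A short sketch of the conversion approach applied to arrays, using the integer type code `'i'` so that precision does not get in the way. The `'f'` case is included because the original post noted that single-precision storage changes the values; the variable names are illustrative.

```python
import array

tuple_ = (1, 2, 3)
list_ = [1, 2, 3]
array_ = array.array("i", [1, 2, 3])

print(tuple_ == array_)                          # False: different types
print(list(tuple_) == list_ == list(array_))     # True: same items, same order

# With type code 'f' the values are stored in single precision, so the
# round trip through the array no longer gives back exactly 1.1, 2.2, 3.3.
floats = array.array("f", [1.1, 2.2, 3.3])
print(list(floats) == [1.1, 2.2, 3.3])           # False

# Type code 'd' stores Python floats without loss.
doubles = array.array("d", [1.1, 2.2, 3.3])
print(list(doubles) == [1.1, 2.2, 3.3])          # True
```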

On Sat, May 2, 2020 at 9:51 PM Serhiy Storchaka <storchaka@gmail.com> wrote:
Would we? Is the contract `x == y => hash(x) == hash(y)` still required if hash(y) is an error? What situation involving dicts could lead to a bug if `(1, 2, 3) == [1, 2, 3]` but `hash((1, 2, 3))` is defined and `hash([1, 2, 3])` isn't? The closest example I can think of is that you might think you can do `{(1, 2, 3): 4}[[1, 2, 3]]`, but once you get `TypeError: unhashable type: 'list'` it'd be easy to fix.
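To make the dict scenario concrete, here is a small sketch of what happens today. The variable names are illustrative, and the exact wording of the error message varies between Python versions, so only the exception type is relied on.

```python
lookup = {(1, 2, 3): 4}
print(lookup[(1, 2, 3)])    # 4

# A list cannot be hashed, so the lookup fails before equality is ever consulted.
try:
    lookup[[1, 2, 3]]
except TypeError as exc:
    list_lookup_error = exc
    print("TypeError:", exc)

# For comparison, int and float do compare equal across types, and they hash alike.
print(1 == 1.0, hash(1) == hash(1.0))   # True True
print({1: "one"}[1.0])                  # one
```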

It does look like that would violate a basic property of `==` -- if two values compare equal, they should be equally usable as dict keys. I can't think of any counterexamples. On Sat, May 2, 2020 at 1:33 PM Alex Hall <alex.mojaki@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

Okay, that's fair. So the argument really comes down to backwards compatibility (which is inconvenient but important). On Sat, May 2, 2020 at 1:51 PM Dominik Vilsmeier <dominik.vilsmeier@gmx.de> wrote:
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On Sat, May 2, 2020 at 10:52 PM Dominik Vilsmeier <dominik.vilsmeier@gmx.de> wrote:
Nice catch! That's really interesting. Is there reasoning behind `frozenset({1}) == {1}` but `[1] != (1,)`, or is it just an accident of history? Isn't a tuple essentially just a frozenlist? I know the intended semantics of tuples and lists tend to be different, but I'm not sure that's relevant.
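The asymmetry being asked about can be checked directly. The snippet also shows that sets and frozensets already combine equality across types with one side being unhashable; the variable name `set_is_hashable` is illustrative.

```python
print(frozenset({1}) == {1})    # True: cross-type equality for sets
print([1] == (1,))              # False: no cross-type equality for sequences

# frozenset is hashable, set is not, yet the two still compare equal.
print(isinstance(hash(frozenset({1})), int))    # True
try:
    hash({1})
    set_is_hashable = True
except TypeError:
    set_is_hashable = False
print(set_is_hashable)                          # False
```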

On 2020-05-03 10:19 p.m., Steven D'Aprano wrote:
for what it's worth, I see myself using tuples as frozen lists more often than their "intended semantics". more specifically, you can't pass lists to: 1. isinstance 2. issubclass 3. str.endswith among others. so I sometimes just convert a list of strings into a tuple of strings and store it somewhere so I can use it with str.endswith later. (this is not how you're "supposed" to implement domain suffix blocks but w/e)
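A brief sketch of the `str.endswith` and `isinstance` behaviour mentioned in this message. The suffix values and variable names are made up for illustration.

```python
blocked = ["example.com", "example.org"]
blocked.append(".test")           # built up as a list...

suffixes = tuple(blocked)         # ...then converted for use with endswith
print("mail.example.com".endswith(suffixes))    # True

# Passing the list itself is rejected.
try:
    "mail.example.com".endswith(blocked)
    list_accepted = True
except TypeError:
    list_accepted = False
print(list_accepted)                            # False

# isinstance likewise accepts a tuple of types but not a list of types.
print(isinstance(3, (int, str)))                # True
try:
    isinstance(3, [int, str])
    list_of_types_accepted = True
except TypeError:
    list_of_types_accepted = False
print(list_of_types_accepted)                   # False
```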

On Mon, May 4, 2020 at 11:43 AM Soni L. <fakedme+py@gmail.com> wrote:
That doesn't mean you're using a tuple as a frozen list - it means you're using a tuple as a static collection. I've never had a situation where I've wanted to use isinstance with a list that gets built progressively at run-time; it's always a prewritten collection. I don't see what this has to do with lists and tuples. You're using tuples the way they're meant to be used. ChrisA

Right. This isn't an accident. It is by design. Also, some numeric types are specifically designed for cross-type comparison:

    >>> int(3) == float(3) == complex(3, 0)
    True

And in Python 2, by design, str and unicode were comparable:

    >>> u'abc' == 'abc'
    True

But the general rule is that objects aren't cross-type comparable by default. We have to specifically enable that behavior when we think it universally makes sense. The modern trend is to avoid cross-type comparability, as with enums and data classes, for example:

    >>> Furniture = Enum('Furniture', ('table', 'chair', 'couch'))
    >>> HTML = Enum('HTML', ('dl', 'ol', 'ul', 'table'))
    >>> Furniture.table == HTML.table
    False

    >>> A = make_dataclass('A', 'x')
    >>> B = make_dataclass('B', 'x')
    >>> A(10) == B(10)
    False

Bytes and str are not comparable in Python 3:

    >>> b'abc' == 'abc'
    False
In terms of API, it might look that way. But in terms of use cases, they are less alike: lists-are-looping, tuples-are-for-nonhomogeneous-fields. Lists are like database tables; tuples are like records in the database. Lists are like C arrays; tuples are like structs. On the balance, I think more harm than good would result from making sequence equality not depend on type. Also when needed, it isn't difficult to be explicit that you're converting to a common type to focus on contents:

    >>> s = bytes([10, 20, 30])
    >>> t = (10, 20, 30)
    >>> list(s) == list(t)
    True

When you think about it, it makes sense that a user gets to choose whether equality is determined by contents or by contents and type. For some drinkers, a can of beer is equal to a bottle of beer; for some drinkers, they aren't equal at all ;-) Lastly, when it comes to containers, they each get to make their own rules about what is equal. Dicts compare on contents regardless of order, but OrderedDict requires that the order matches.

Raymond
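The closing remark about containers making their own rules can be seen with a few lines. The variable names are illustrative; the behaviour shown is that of the standard `dict` and `collections.OrderedDict`.

```python
from collections import OrderedDict

d1 = {"a": 1, "b": 2}
d2 = {"b": 2, "a": 1}
print(d1 == d2)     # True: dict equality ignores insertion order

o1 = OrderedDict(a=1, b=2)
o2 = OrderedDict(b=2, a=1)
print(o1 == o2)     # False: OrderedDict equality is order-sensitive

# Comparing an OrderedDict with a plain dict falls back to the dict rule.
print(o1 == d2)     # True

# Converting to a common type makes "contents only" explicit.
s = bytes([10, 20, 30])
t = (10, 20, 30)
print(s == t)                   # False
print(list(s) == list(t))       # True
```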

Raymond Hettinger wrote:
`(frozenset() == set()) is True` shocked me. According to wikipedia https://en.wikipedia.org/wiki/Equality_(mathematics): "equality is a relationship between two quantities or, more generally two mathematical expressions, asserting that the quantities have the same value, or that the expressions represent the same mathematical object." If lists and tuples are considered different "mathematical objects" (different types), they cannot be considered equal --though they can be equivalent, for instance `([1, 2, 3] == list((1, 2, 3)) and tuple([1, 2, 3]) == (1, 2, 3)) is True`. I can only explain `(frozenset() == set()) is True` vs `(list() == tuple()) is False` if:

a) `frozenset`s and `set`s are considered the same "mathematical objects". So immutability vs mutability is not a relevant feature in Python equality context. Then, `list() == tuple()` should be `True` if no other feature distinguishes lists from tuples, I suppose...
b) language designers found `(frozenset() == set()) is True` convenient (why?). Then, why is `(list() == tuple()) is True` not just as convenient?
c) it is a bug and `frozenset() == set()` should be `False`.

On Tue, May 05, 2020 at 09:34:28AM -0000, jdveiga@gmail.com wrote:
`(frozenset() == set()) is True` shocked me.
According to wikipedia https://en.wikipedia.org/wiki/Equality_(mathematics): "equality is a relationship between two quantities or, more generally two mathematical expressions, asserting that the quantities have the same value, or that the expressions represent the same mathematical object."
There is no good correspondence between "mathematical objects" and types. Even in mathematics, it is not clear whether the integer 1 is the same mathematical object as the real number 1, or the complex number 1, or the quaternion 1. In Python, we usually say that if a type is part of the numeric tower ABC, then instances with the same numeric value should be considered equal even if they have different types. But that's not a hard rule, just a guideline. And it certainly shouldn't be used as a precedent implying that non-numeric values should behave the same way. If you are looking for a single overriding consistent principle for equality in Python, I think you are going to be disappointed. Python does not pretend to be a formal mathematically consistent language and the only principle for equality in Python is that equality means whatever the object's `__eq__` method wants it to mean.
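Both halves of that paragraph can be illustrated briefly. The numeric comparisons use real standard-library types; the class `CaseInsensitiveName` is a made-up example of an `__eq__` that defines its own notion of equality.

```python
from fractions import Fraction

# Numeric tower: equal values compare equal across types.
print(1 == 1.0 == Fraction(1, 1))   # True
print(1 == complex(1, 0))           # True


class CaseInsensitiveName:
    """Hypothetical type whose equality ignores letter case."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        if isinstance(other, CaseInsensitiveName):
            return self.text.casefold() == other.text.casefold()
        return NotImplemented

    def __hash__(self):
        # Keep hash consistent with __eq__ so instances work as dict keys.
        return hash(self.text.casefold())


print(CaseInsensitiveName("David") == CaseInsensitiveName("daviD"))   # True
print(CaseInsensitiveName("David") == "David")                        # False
```

Returning `NotImplemented` for unknown types is what makes the last comparison fall back to the default result of `False` rather than raising.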
List and tuple are distinguished by the most important feature of all: the designer's intent. Tuples are records or structs, not frozen lists, which is why they are called tuple not frozen list :-) even if people use them as a de facto frozen list. On the other hand, frozensets are frozen sets, which is why they compare equal. Does this make 100% perfectly logical sense? Probably not. But it doesn't have to. Lists and tuples are considered to be independent kinds of thing, while sets and frozensets are considered to be fundamentally the same kind of thing differentiated by mutability. (In hindsight, it might have been more logically clear if mutable sets inherited from immutable frozensets, but we missed the chance to do that.) -- Steven

Steven D'Aprano wrote:
Thanks for your reply. I do not expect any kind of full correspondence between mathematical objects and programming objects. Just reasoning by analogy and trying to understand how lists and tuples cannot be equal and frozensets and sets can be on similar grounds. More asking than answering. Designers' intent is an admissible answer, of course. A cat and a dog can be equal if equality is defined as "having the same name". However, designers' intent is one thing, and users' understanding is another. From your words, I have learnt that --from designers' point of view-- tuples are different from lists in their nature while sets and frozensets are mostly the same kind of thing --roughly speaking of course... I wonder if users share that view. I feel that it is not unreasonable to expect that frozenset and set cannot be equal on the grounds that they are different types (as tuples and lists are different types too). From that perspective, equality on tuples / lists and frozensets / sets should follow similar rules. Not being that way is surprising. That is all. However, if sets and frozensets are "considered to be fundamentally the same kind of thing differentiated by mutability", as you said, why not tuples and lists? And that is, I guess, the reasoning behind the proponent's claim. What if the difference between tuples and lists is not so deep or relevant and they just differ on mutability? Asking again...

On 6/05/20 2:22 am, jdveiga@gmail.com wrote:
I think that can be answered by looking at the mathematical heritage of the types involved:

    Python      Mathematics
    ------      -----------
    set         set
    frozenset   set
    tuple       tuple
    list        sequence

Sets and frozensets are both modelled after mathematical sets, so to me at least it's not surprising that they behave very similarly, and are interchangeable for many purposes. To a mathematician, however, tuples and sequences are very different things. Python treating tuples as sequences is a "practicality beats purity" kind of thing, not to be expected from a mathematical point of view.

-- Greg

On Wed, 6 May 2020 at 01:41, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I don't think it is accurate to present that as the view of "a mathematician". The top voted answer here disagrees: https://math.stackexchange.com/questions/122595/whats-the-difference-between... "A sequence requires each element to be of the same type. A tuple can have elements with different types." The common usage for both is: you have a tuple of (Z, +) representing the Abelian group of addition (+) on the integers (Z), whereas you have the sequence {1/n}_{n \in N} converging to 0 in the space Q^N (rational infinite sequences) for example. I'd say the difference is just one of semantics and as a mathematician I would consider tuples and sequences as "isomorphic", in fact, the set-theoretical construction of tuples as functions is *identical* to the usual definition of sequences: i.e. they are just two interpretations of the same object depending on your point of view.

On Wed, May 06, 2020 at 02:58:01AM +0100, Henk-Jaap Wagenaar wrote:
Are you saying that you can't have a sequence that alternates between ints and rationals, say, or ints and surds (reals)? The sequence A_n = sqrt(n) from n=0 starts off int, int, real, ... so there is that. For what it's worth, Wolfram Mathworld disagrees with both Greg's comment and the stackexchange answer, stating that a tuple is just a synonym for a list, and that both lists and sequences are ordered sets: https://mathworld.wolfram.com/n-Tuple.html https://mathworld.wolfram.com/List.html https://mathworld.wolfram.com/Sequence.html
One can come up with many other usages. I think a far more common use for tuples are the ordered pairs used for coordinates: (1, 2) So although tuples are ordered sets, and sequences are ordered sets, the way they are used is very different. One would not call the coordinate (1, 2) a sequence 1 followed by 2, and one would not normally consider a sequence such as [0, 2, 4, 6, 8, ...] to be a tuple. In normal use, a tuple is considered to be an atomic[1] object (e.g. a point in space), while a sequence is, in a sense, a kind of iterative process that has been reified.
I'd say the difference is just one of semantics
The difference between any two things is always one of semantics.
Many things are isomorphic. "Prime numbers greater than a googolplex" are isomorphic to the partial sums of the sequence 1/2 − 1/4 + 1/8 − 1/16 + ⋯ = 1/3 but that doesn't mean you could use 1/2 * 1/4 as your RSA public key :-) [1] I used that term intentionally, since we know that if you hit an atom hard enough, it ceases to be indivisible and can split apart :-) -- Steven

On Wed, May 06, 2020 at 07:15:22PM +1200, Greg Ewing wrote:
Oh I'm not agreeing with them, I'm just pointing out that the people who hang around math.stackexchange and the people who write for Mathworld don't agree. It is difficult to capture all the nuances of common usage in a short definition. Based purely on dictionary definitions, 'The Strolling Useless' is precisely the same meaning as 'The Walking Dead' but no native English speaker would confuse the two, and I'm pretty sure that few mathematicians would call the origin of the Cartesian Plane "a sequence" even if it does meet the definition perfectly :-) -- Steven

TL;DR: the maths does not matter. Programming language (design)/computer science/data structures should lead this discussion! Also, -1 on this proposal, -1000 on having it apply to strings. Feel free to read on if you want to hear some ramblings of somebody who does not get to use their academic knowledge of maths enough seeing an opportunity... On Wed, 6 May 2020 at 07:18, Steven D'Aprano <steve@pearwood.info> wrote:
That's a sequence in the reals (or algebraics or some other set that contains square roots), of which a subsequence also happens to live in the integers. A square is still a rectangle.
These two above pertain to data structures in computer science, not mathematics. An "ordered set" is not a mathematical term I have ever come across, but if it is, it means exactly what they define a sequence to be (though you would have to extend it to infinite sequences to allow infinite ordered sets):
The notation (a_1, a_2, ..., a_n) is the same as saying it is a sequence in some set X^n (if not given an X, setting X = {a_1, ..., a_n} works, is that cheating? Yes. Is that a problem in set-theoretic mathematics? Not in this case anyway)
I would call that an ordered pair, or, a sequence of length 2.
I would not use the word "tuple". In my experience, tuple in mathematics (not computer science!) is only used in the way I described it: to gather up the information about a structure into one object, so that we can say it exists: because existing means some kind of set exists, and so we need to somehow codify for e.g. addition on the integers both the addition and the integers, i.e. combining two wholly different things into one 2-sequence: (Z, +). Note that such structures might require infinite tuples, e.g. if they support infinitely many operators. Anyway, this is where the Stack Exchange answer comes from: in common parlance, tuples are used for things whose coordinates are not all in the same "space", sequences for things that have all coordinates in the same "space".
You can construct a sequence (or tuple) iteratively, but whether you do or not has no bearing on the end result. Also, tuples are very much not atomic in the mathematical sense. I would also like to note when you say "a tuple is considered to be an atomic[1] object (e.g. a point in space)", then to a mathematician, A_n = 1/sqrt(n) for n = 0, 1, ... is simply a point in space too: just the space of sequences over the reals. Mathematicians (generally, in the field of foundational logic it is a tad different) don't tend to be concerned with differences such as how you define an object (just need to make sure it exists), whether things are finite or infinite, specified or unspecified. Unfortunately, in real life, in a programming language, we do have to care about these things.
But in this case, it does mean that claiming that the mathematical point of view is that tuples and lists are different due to them being based on tuples and sequences in mathematics is flawed. An alpha-tuple is the same as an alpha-sequence in mathematics and is an element of X^alpha for some X. That's not a random isomorphism, that's a canonical one, which is a big difference with your isomorphism. I strongly oppose any idea that mathematics supports distinction between tuples and lists, however my main point is that it does not even matter. Python should not be led by mathematical conventions/structures in this decision or in others, and it has not in the past: I can have a set {0} in Python and another {0} and they will be different objects; in mathematics that makes no sense: those two sets are the same. Programming languages are not mathematics and mathematics is not a programming language, and when it comes to comparing things as equal, using mathematics as an example is a losing battle. Python should instead look towards computer science, other programming languages and actual use cases to determine what is best. On a personal note, I prefer the current behaviour: e.g. a tuple will never successfully compare equal to a list, not because that is consistent with any mathematical property, but because I think it is a good design for Python/a programming language. On Wed, 6 May 2020 at 08:05, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I would genuinely be interested to be linked some examples of tuples being used in a different way in mathematics, not computer science (e.g. the links by Steven were in computer science data structures)
So... say you are solving a problem in 1d, you do that on a real number x, right? Now you solve it in 2d, so you do your work on a pair (x, y), then you might solve it in 3d and do your work on a triplet (x, y, z). A few days later you generalize it to n-dimensions and you get a *sequence* (I would not use the word tuple here, nor have I ever seen it used, though I do not dispute some might) (x_1, ..., x_n) that you work on. Then, a few days later you generalize it to infinite sequences (x_1, x_2, ...). Surely we are done now? No way, we can do that again, so we can make it instead a doubly infinite sequence (x_1, x_2, ..., x_omega, x_(omega + 1), ..., x_(2 * omega)) and so on (we are now inducting over the ordinals slowly but surely, see https://en.wikipedia.org/wiki/Ordinal_number#Transfinite_sequence). Anyway, my point is that all those objects, from x, to (x, y, z), all the way to (x_1, x_2, ..., x_omega, x_(omega + 1), ..., x_(2 * omega)) look the same to me: they are all {x_n}_{n in I} for different index sets I and all with coordinates in X = R, the real numbers. To a programming language/programmer/computer scientist these are all quite different, and require different implementations, restrictions and ideas. It makes no difference to a mathematician how long the sequence is, or even whether "long" makes sense (e.g. you can interpret a function f: R -> R as a sequence {f(x)}_{x in R} instead, which, sometimes might be more insightful than the other interpretation).
Labelling the elements of a tuple? Again, in my view programming/computer science speak coming in again: that does not really make sense when it comes to mathematics. Once again, my point is that mathematics does not matter: the data structure that is a tuple in Python does not correspond to a tuple in mathematics, and that does not even make sense, because mathematics does not deal with data structures. I think also, in the case/example you are describing, you really want a namedtuple (as I would want to describe structures in mathematics too), not a plain tuple, if the order is not what is important.

On 6/05/20 7:45 pm, Henk-Jaap Wagenaar wrote:
At this point I would say that you haven't created an infinite tuple, you've created an infinite sequence of finite tuples.
Then, a few days later you generalize it to infinite sequences (x_1, x_2, ...).
Now here I would stop and say, wait a minute, what does this proof look like? I'm willing to bet it involves things that assume some kind of intrinsic order to the elements of this "tuple". If it does, and it's an extension to the finite dimensional cases, then I would say you were really dealing with sequences, not tuples, right from the beginning. Now I must admit I was a bit hesitant about writing that statement, because in quantum theory, for example, one often deals with vector spaces having infinitely many dimensions. You could consider an element of such a space as being an infinite tuple. However, to even talk about such an object, you need to be able to write formulas involving the "nth element", and those formulas will necessarily depend on the numerical value of n. This gives the elements an intrinsic order, and they will have relationships to each other that depend on that order. This makes the object more like a sequence than a tuple. Contrast this with, for example, a tuple (x, y, z) representing coordinates in a geometrical space. There is no inherent sense in which the x coordinate comes "before" the y coordinate; that's just an accident of the order we chose to write them down in. We could have chosen any other order, and as long as we were consistent about it, everything would still work. This, I think, is the essence of the distinction between tuples and sequences in mathematics. Elements of sequences have an inherent order, whereas elements of a tuple have at best an arbitrarily-imposed order. -- Greg

Greg Ewing wrote:
However, in Python, tuples and lists are both sequences, ordered sets of elements. So it is not completely unreasonable to see them as Ahmed Amr is proposing: that is, types so similar that you can expect that if they have the same elements, they are equal. (Like frozensets and sets in the "set type" domain). Indeed, tuples and lists are equivalent in Python: `(list() == list(tuple()) and tuple(list()) == tuple()) is True`. Do not misunderstand me. I agree with the idea that tuples and lists are different by design while frozensets and sets are not (as Steven D'Aprano pointed out in a previous post). But considering tuples and lists as just ordered sets of elements and basing their equality on their elements, not on their type, is an appealing idea. I think that some Pythonistas would not disagree. A different thing is the practicality of this.

On 6/05/20 1:58 pm, Henk-Jaap Wagenaar wrote:
Maybe the small subset of mathematicians that concern themselves with trying to define everything in terms of sets, but I don't think the majority of mathematicians think like that in their everyday work. It's certainly at odds with the way I see tuples and sequences being used in mathematics. As well as the same type vs. different types thing, here are some other salient differences: - Infinite sequences make sense, infinite tuples not so much. - Sequences are fundamentally ordered, whereas tuples are not ordered in the same sense. Any apparent ordering in a tuple is an artifact of the way we conventionally write them. If we were in the habit of labelling the elements of a tuple and writing things like (x:1, y:2, z:3) then we wouldn't have to write them in any particular order -- (y:2, x:1, z:3) would be the same tuple. -- Greg
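The labelled-tuple notation sketched above has a rough Python analogue in `collections.namedtuple`, which was also mentioned earlier in the thread. This is only an illustration: `Point` is a made-up type, and a namedtuple still stores its fields in a fixed positional order.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y", "z"])

p = Point(x=1, y=2, z=3)
q = Point(y=2, x=1, z=3)    # labels given in a different order

print(p == q)               # True: the labels, not the writing order, decide
print(p.y, p[1])            # 2 2: access by label or by position
print(p == (1, 2, 3))       # True: a namedtuple is still a tuple
```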

On 5/6/20 3:04 AM, Greg Ewing wrote:
In my mind, tuples and lists seem very different concepts, that just happen to work similarly at a low level (and because of that, are sometimes 'misused' as each other because it happens to 'work'). To me, tuples are things when the position of the thing very much matters, you understand the meaning of the Nth element of a tuple because it IS the Nth element of the tuple. It isn't so important that the Nth is after the (N-1)th element, so we could define our universe of tuples in a different order and it might still make sense, but we would then need to reorder ALL the tuples of that type. A coordinate makes a great example of a tuple, we think of the 1st element of the coordinate as 'X' due to convention, and in the tuple it gets its meaning from its position in the tuple. A list on the other hand is generally not thought of in that way. A list might not be ordered, or it might be, and maybe there is SOME value in knowing that an item is the Nth on the list, but if it is an ordered list, it is generally more meaningful to think of the Nth item in relation to the (N-1)th and (N+1)th items. Adding an element to a tuple generally doesn't make sense (unless it is transforming it to a new type of tuple, like from 2d to 3d), but generally adding an item to a list does. This makes their concepts very different. Yes, you might 'freeze' a list by making it a tuple so it becomes hashable, but then you are really thinking of it as a 'frozen list' not really a tuple. And there may be times you make a mutable tuple by using a list, but then you are thinking of it as a mutable tuple, not a list. And these are exceptional cases, not the norm. -- Richard Damon

On May 6, 2020, at 05:22, Richard Damon <Richard@damon-family.org> wrote:
I think this thread has gotten off track, and this is really the key issue here. If someone wants this proposal, it’s because they believe it’s _not_ a misuse to use a tuple as a frozen list (or a list as a mutable tuple). If someone doesn’t want this proposal, the most likely reason (although admittedly there are others) is because they believe it _is_ a misuse to use a tuple as a frozen list. It’s not always a misuse; it’s sometimes perfectly idiomatic to use a tuple as an immutable hashable sequence. It doesn’t just happen to 'work', it works, for principled reasons (tuple is a Sequence), and this is a good thing.[1] It’s just that it’s _also_ common (probably a lot more common, but even that isn’t necessary) to use it as an anonymous struct. So, the OP is right that (1,2,3)==[1,2,3] would sometimes be handy, the opponents are right that it would often be misleading, and the question isn’t which one is right, it’s just how often is often. And the answer is obviously: often enough that it can’t be ignored. And that’s all that matters here. And that’s why tuple is different from frozenset. Very few uses of frozenset are as something other than a frozen set, so it’s almost never misleading that frozensets equal sets; plenty of tuples aren’t frozen lists, so it would often be misleading if tuples equaled lists.

--
[1] If anyone still wants to argue that using a tuple as a hashable sequence instead of an anonymous struct is wrong, how would you change this excerpt of code:

    memomean = memoize(mean, key=tuple)

    def player_stats(player):
        # …
        … = memomean(player.scores) …
        # …

player.scores is a list of ints, and a new one is appended after each match, so a list is clearly the right thing. But you can’t use a list as a cache key. You need a hashable sequence of the same values. And the way to spell that in Python is tuple. And that’s not a design flaw in Python, it’s a feature. (Shimmer is a floor wax _and_ a dessert topping!)
Sure, when you see a tuple, the default first guess is that it’s an anonymous struct—but when it isn’t, it’s usually so obvious from context that you don’t even have to think about it. It’s confusing a lot less often than, say, str, and it’s helpful a lot more often.
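
The excerpt above does not show `memoize` itself. A minimal sketch of what such a helper might look like follows; the `memoize` function, its `key` parameter and the `cache` attribute are assumptions made for illustration, not the code from the original script.

```python
from functools import wraps
from statistics import mean


def memoize(func, key):
    """Cache func's results, using key(arg) as the dict key (hypothetical helper)."""
    cache = {}

    @wraps(func)
    def wrapper(arg):
        k = key(arg)              # must be hashable, e.g. tuple(list_of_scores)
        if k not in cache:
            cache[k] = func(arg)
        return cache[k]

    wrapper.cache = cache         # exposed only so the example can inspect it
    return wrapper


memomean = memoize(mean, key=tuple)

scores = [10, 12, 14]             # a mutable list, like player.scores
first = memomean(scores)
second = memomean(list(scores))   # a different list object with equal items
print(first, second, len(memomean.cache))
```

Both calls share a single cache entry because `tuple(scores)` gives a hashable key with the same values, which is the "hashable sequence" role described above.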

On Fri, 8 May 2020 17:40:31 -0700 Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
That's a good summary. Thank you. :-)
Very clever. Then again, it wouldn't be python-ideas if it were that simple! "hashable sequence of the same values" is too strict. I think all memoize needs is a key function such that if x != y, then key(x) != key(y).

    def key(scores):
        return ','.join(str(-score * 42) for score in scores)

    memomean = memoize(mean, key=key)

    def player_stats(player):
        # …
        … = memomean(player.scores) …
        # …

Oh, wait, even that's too strict. All memoize really needs is if mean(x) != mean(y), then key(x) != key(y):

    memomean = memoize(mean, key=mean)

    def player_stats(player):
        # …
        … = memomean(player.scores) …
        # …

But we won't go there. ;-)

On May 8, 2020, at 20:36, Dan Sommers <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
I don’t think it’s particularly clever. And that’s fine—using common idioms usually is one of the least clever ways to do something out of the infinite number of possible ways. Because being intuitively the one obvious way tends to be important to becoming an idiom, and it tends to run counter to being clever. (Being concise, using well-tested code, and being efficient are also often important, but being clever doesn’t automatically give you any of those.)
Well, it does have to be hashable. (Unless you’re proposing to also replace the dict with an alist or something?) I suppose it only needs to be a hashable _encoding_ of a sequence of the same values, but surely the simplest encoding of a sequence is the sequence itself, so, unless “hashable sequence” is impossible (which it obviously isn’t), who cares?
def key(scores): return ','.join(str(-score * 42) for score in scores)
This is still a sequence. If you really want to get clever, why not: def key(scores): return sum(prime**score for prime, score in zip(calcprimes(), scores)) But this just demonstrates why you don’t really want to get clever. It’s more code to write, read, and debug than tuple, easier to get wrong, harder to understand, and almost certainly slower, and the only advantage is that it deliberately avoids meeting a requirement that we technically didn’t need but got for free.
Well, it seems pretty unlikely that calculating the mean to use it as a cache key will be more efficient than just calculating the mean, but hey, if you’ve got benchmarks, benchmarks always win. :) (In fact, I predicted that memoizing here would be a waste of time in the first place, because the only players likely to have equal score lists to earlier players would be the ones with really short lists—but someone wanted to try it anyway, and he was able to show that it did speed up the script on our test data set by something like 10%. Not nearly as much as he’d hoped, but still enough that it was hard to argue against keeping it.)

Thanks Andrew for the excellent analysis quoted below. Further comments interleaved with yours. On Fri, May 08, 2020 at 05:40:31PM -0700, Andrew Barnert via Python-ideas wrote:
I don't think it is necessary to believe that it is *always* misuse, but only that it is *often* misuse and therefore `==` ought to take the conservative position and refuse to guess. I expect that nearly every Python programmer of sufficient experience has used a tuple as a de facto "frozen list" because it works and practicality beats purity. But that doesn't mean that I want my namedtuple PlayerStats(STR=10, DEX=12, INT=13, CON=9, WIS=8, CHR=12) to compare equal to my list [10, 12, 13, 9, 8, 12] by default.
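
The namedtuple case can be checked against current behaviour. This is a small sketch; the field names are taken from the message above and the variable names are illustrative.

```python
from collections import namedtuple

PlayerStats = namedtuple("PlayerStats", "STR DEX INT CON WIS CHR")
stats = PlayerStats(STR=10, DEX=12, INT=13, CON=9, WIS=8, CHR=12)

as_tuple = (10, 12, 13, 9, 8, 12)
as_list = [10, 12, 13, 9, 8, 12]

print(stats == as_tuple)        # True: a namedtuple is a tuple subclass
print(stats == as_list)         # False: tuple and list do not compare equal
print(list(stats) == as_list)   # True: explicit opt-in to item-wise comparison
```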
Yes, I think there's a genuine need here.
-- Steven

On Tue, May 5, 2020 at 7:36 AM Raymond Hettinger < raymond.hettinger@gmail.com> wrote:
Right, that's what I'm referring to. If you're comparing two things which are meant to represent completely different entities (say, comparing a record to a table) then your code is probably completely broken (why would you be doing that?) and having equality return False isn't going to fix that. Conversely I can't see how returning True could break a program that would work correctly otherwise. If you're comparing a list and a tuple, and you haven't completely screwed up, you probably mean to compare the elements and you made a small mistake, e.g. you used the wrong brackets, or you forgot that *args produces a tuple.

On Sat, May 2, 2020 at 10:36 PM Guido van Rossum <guido@python.org> wrote:
It does look like that would violate a basic property of `==` -- if two values compare equal, they should be equally usable as dict keys.
It's certainly a reasonable property, but I don't think it's critical. By comparison, if it was the case that `(1, 2, 3) == [1, 2, 3]` and `hash((1, 2, 3)) != hash([1, 2, 3])` were both True without raising exceptions, that would be a disaster and lead to awful bugs. The equality/hash contract is meant to protect against that.
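
A short sketch of how the equality/hash contract plays out today, using only built-in types (variable names are illustrative):

```python
# Equal numbers of different types also hash equally,
# so either one can be used to look up the other in a dict.
int_float_equal = (1 == 1.0) and (hash(1) == hash(1.0))
by_float = {1: "one"}[1.0]

# set and frozenset compare equal, yet only the frozenset is hashable.
sets_equal = {1, 2} == frozenset({1, 2})
try:
    hash({1, 2})
    set_hashable = True
except TypeError:
    set_hashable = False

# A list cannot even be used to *look up* a tuple key today.
try:
    {(1, 2, 3): 4}[[1, 2, 3]]
    list_lookup_works = True
except TypeError:
    list_lookup_works = False

print(int_float_equal, by_float, sets_equal, set_hashable, list_lookup_works)
```

The set/frozenset pair already shows two values that compare equal without being equally usable as dict keys.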
I can't think of any counterexamples.
I think it's reasonable that this change would introduce counterexamples where none previously existed, as we would be changing the meaning of ==. Although since writing this Dominik gave the frozenset example. I also think it'd be possible to have a data model where `{(1, 2, 3): 4}[[1, 2, 3]]` does work. You'd need a way to calculate a hash if you promised to use it only for `__getitem__`, not `__setitem__`, so you can't store list keys but you can access with them. (this is all just fun theoretical discussion, I'm still not supporting the proposal)

On 02.05.20 23:32, Alex Hall wrote:
You are probably right. Here is another example: if we make all sequences comparable by content, we would need to make `('a', 'b', 'c') == 'abc'` and `hash(('a', 'b', 'c')) == hash('abc')`. It may be difficult to get the latter, taking into account hash randomization.

Thanks, I do appreciate all the discussion here about that. Initially, I was thinking about having lists/arrays/tuples match the behavior of other instances in Python that compare across their types, like:

1) Sets (instances of set or frozenset) can be compared within and across their types, as Dominik mentioned.
2) Numeric types do compare across their types, along with fractions.Fraction and decimal.Decimal.
3) Binary sequences (instances of bytes or bytearray) can be compared within and across their types.

(All points above are stated in the Python reference at https://docs.python.org/3/reference/expressions.html)

But after the discussion here, I think backward compatibility dominates for sure against that. Thanks!
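
The three cross-type cases listed above, next to the sequence comparisons this thread is about, can be checked directly. This is a sketch of current behaviour with illustrative variable names.

```python
import array
from decimal import Decimal
from fractions import Fraction

sets_match = {1, 2} == frozenset({1, 2})
numbers_match = (1 == 1.0) and (Fraction(1, 2) == 0.5) and (Decimal("0.5") == 0.5)
bytes_match = b"abc" == bytearray(b"abc")

list_vs_tuple = [1, 2, 3] == (1, 2, 3)
array_vs_list = array.array("i", [1, 2, 3]) == [1, 2, 3]

print(sets_match, numbers_match, bytes_match)   # True True True
print(list_vs_tuple, array_vs_list)             # False False
```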

On 5/3/20 8:40 AM, Ahmed Amr wrote:
I think the issue is that the set/frozenset distinction (and bytes/bytearray) is a much finer distinction than between arbitrary sequence types, as it is primarily just a change of mutability (and hashability), and all the numeric types are really just slightly different abstractions of the same basic set of values (or subsets thereof). The various containers don't have the same concept that they are essentially representing the same 'thing' with just a change in representation to control the sorts of numbers they can express and what sort of numeric errors they might contain (so two representations that map to the same abstract number make sense to be equal).

Different types of sequences are more different in what they likely represent, so it is less natural for different sequences of the same values to be thought of as always being 'the same'.

There may be enough cases where that equality is reasonable that having a 'standard' function to perform that comparison might make sense; it just isn't likely to be spelled ==. There are several questions on how to do things that might need to be explored: should the sequence type be ignored recursively or not, i.e. is [1, [2, 3]] the same as (1, (2, 3)) or not, and are strings just another sequence type, or something more fundamental?

This doesn't make it a 'bad' idea, just a bit more complicated and in need of exploration.

-- Richard Damon
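
One possible shape for such a 'standard' function is sketched below. The name `seq_equal`, the `recursive` flag and the decision to treat str/bytes as scalars are all assumptions chosen to make the open questions concrete, not an agreed design.

```python
from collections.abc import Sequence

_ATOMIC = (str, bytes, bytearray)  # assumption: text/binary data count as scalars


def seq_equal(a, b, recursive=True):
    """Compare two sequences item by item, ignoring the container type."""
    a_is_seq = isinstance(a, Sequence) and not isinstance(a, _ATOMIC)
    b_is_seq = isinstance(b, Sequence) and not isinstance(b, _ATOMIC)
    if not (a_is_seq and b_is_seq):
        return a == b              # fall back to ordinary equality
    if len(a) != len(b):
        return False
    if recursive:
        return all(seq_equal(x, y) for x, y in zip(a, b))
    return all(x == y for x, y in zip(a, b))


print(seq_equal([1, 2, 3], (1, 2, 3)))                       # True
print(seq_equal([1, [2, 3]], (1, (2, 3))))                   # True
print(seq_equal([1, [2, 3]], (1, (2, 3)), recursive=False))  # False
print(seq_equal("abc", ["a", "b", "c"]))                     # False
```

Flipping `recursive` or emptying `_ATOMIC` gives the other answers to the two questions raised above, which is why the semantics need to be pinned down before any spelling is chosen.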

On Sat, 2 May 2020 at 20:50, Serhiy Storchaka <storchaka@gmail.com> wrote:
This is the key point. Much of the other discussion in this thread seems to be bogged down in the mathematical interpretation of tuples and sequences but if I was to take something from maths here it would be the substitution principle of equality: https://en.wikipedia.org/wiki/Equality_(mathematics)#Basic_properties What the substitution principle essentially says is if x == y then f(x) == f(y) for any function f such that f(x) is well defined. What that means is that I should be able to substitute x for y in any context where x would work without any change of behaviour. We don't need to do any deep maths to see how that principle can be applied in Python but if you try to follow it rigorously then you'll see that there are already counterexamples in the language for example
Given a list x and a tuple y with equivalent elements x and y will not be interchangeable because one is not hashable and the other is not mutable so there are functions where one is usable but the other is not. Following the same reasoning set/frozenset should not compare equal. In SymPy there are many different mathematical objects that people feel should (on mathematical grounds) compare "equal". This happens enough that there is a section explaining this in the tutorial: https://docs.sympy.org/latest/tutorial/gotchas.html#equals-signs The terms "structural equality" and "mathematical equality" are used to distinguish the different kinds of equality with == being used for the structural sense. For example the sympy expression Pow(2, 2, evaluate=False) gives an object that looks like 2**2. This does mathematically represent the number 4 but the expression itself is not literally the number 4 so the two expressions are mathematically equal but not structurally equal:
This distinction is important because at the programmatic level p and 4 are not interchangeable. For example p being a Pow has attributes base and exp that 4 will not have. In sympy most objects are immutable and hashable and are heavily used in sets and dicts. Following the substitution principle matters not least because Python has baked the use of ==/__eq__ into low-level data structures so objects that compare equal with == will literally be interchanged:
All the same many sympy contributors have felt the need to define __eq__ methods that will make objects of different types compare equal and there are still examples in the sympy codebase. These __eq__ methods *always* lead to bugs down the line though (just a matter of time). I've come to the seemingly obvious conclusion that if there is *any* difference between x and y then it's always better to say that x != y. Oscar
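
The interchange effect can be illustrated without sympy. The sketch below uses the built-in `1 == 1.0` case as a stand-in; it is not one of the sympy examples referred to above.

```python
x, y = 1, 1.0
assert x == y

# The substitution principle already fails for some functions...
same_str = str(x) == str(y)          # "1" vs "1.0"

# ...and hash-based containers treat equal objects as one key,
# keeping whichever object arrived first.
set_xy = {x, y}
set_yx = {y, x}
d = {x: "int"}
d[y] = "float"                        # overwrites the value, keeps the key 1

print(same_str)                       # False
print(set_xy, set_yx)                 # {1} {1.0}
print(d)                              # {1: 'float'}
```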

On Thu, May 7, 2020 at 10:33 AM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
I've come to the seemingly obvious conclusion that if there is *any* difference between x and y then it's always better to say that x != y.
And having worked in languages where floats and integers are fundamentally different beasts, I disagree: it is extremely practical (even if not pure) to have them compare equal. SourcePawn (the language generally used for modding games like Counter-Strike) is strongly-typed and does not allow floats and ints to be used interchangeably - except that you can do arithmetic and they'll be type-folded. So if you have a function TakeDamage that expects a floating-point amount of damage, and another function GetHealth that returns the player's health as an integer, you have to add 0.0 to the integer before it can be used as a float. Actual line of code from one of my mods: SDKHooks_TakeDamage(client, inflictor, attacker, GetClientHealth(client) + 0.0, 0, weapon); Every language has to choose where it lands on the spectrum of "weak typing" (everything can be converted implicitly) to "strong typing" (explicit conversions only), and quite frankly, both extremes are generally unusable. Python tends toward the stricter side, but with an idea of "type" that is at times abstract (eg "iterable" which can cover a wide variety of concrete types); and one of those very important flexibilities is that numbers that represent the same value can be used broadly interchangeably. This is a very good thing. ChrisA

I'm afraid, Oscar, that you seem to have painted yourself into a reductio ad absurdum. We need a healthy dose of "practicality beats purity" thrown in here. What the substitution principle essentially says is
I'm very happy to agree that "but id() isn't the kind of function I meant!" That's the point though. For *most* functions, the substitution principle is fine in Python. A whole lot of the time, numeric functions can take either an int or a float that are equal to each other and produce results that are equal to each other. Yes, I can write something that will sometimes overflow for floats but not ints. Yes, I can write something where a rounding error will pop up differently between the types. But generally, numeric functions are "mostly the same most of the time" with float vs. int arguments. This doesn't say whether tuple is as similar to list as frozenset is to set. But the answer to that isn't going to be answered by examples constructed to deliberately obtain (non-)substitutability for the sake of argument. -- The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.

On Thu, 7 May 2020 at 02:07, David Mertz <mertz@gnosis.cx> wrote:
That's the point though. For *most* functions, the substitution principle is fine in Python. A whole lot of the time, numeric functions can take either an int or a float that are equal to each other and produce results that are equal to each other. Yes, I can write something that will sometimes overflow for floats but not ints. Yes, I can write something where a rounding error will pop up differently between the types. But generally, numeric functions are "mostly the same most of the time" with float vs. int arguments.
The question is whether you (or Chris) care about calculating things accurately with floats or ints. If you do try to write careful code that calculates things for one or the other you'll realise that there is no way to duck-type anything nontrivial because the algorithms for exact vs inexact or bounded vs unbounded arithmetic are very different (e.g. sum vs fsum). If you are not so concerned about that then you might say that 1 and 1.0 are "acceptably interchangeable". Please understand though that I am not proposing that 1==1.0 should be changed. It is supposed to be a simple example of the knock on effect of defining __eq__ between non-equivalent objects.
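
A small sketch of the exact-versus-inexact difference mentioned above. The input values are chosen for illustration, and a plain loop is used instead of `sum()` so the result does not depend on how a given Python version implements `sum` for floats.

```python
import math

values = [1e16, 1.0, -1e16]

naive = 0.0
for v in values:                      # plain left-to-right float addition
    naive += v

careful = math.fsum(values)           # tracks the low-order bits lost above
exact = sum(int(v) for v in values)   # unbounded integer arithmetic

print(naive)    # 0.0: the 1.0 is lost to rounding
print(careful)  # 1.0
print(exact)    # 1
```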
This doesn't say whether tuple is as similar to list as frozenset is to set. But the answer to that isn't going to be answered by examples constructed to deliberately obtain (non-)substitutability for the sake of argument.
Those examples are not for the sake of argument: they are simple illustrations. I have fixed enough real examples of bugs relating to this to come to the conclusion that making non-interchangeable objects compare equal with == is an attractive nuisance. It seems useful when you play with toy examples in the REPL but isn't actually helpful when you try to write any serious code. This comes up particularly often in sympy because:

1. Many contributors strongly feel that A == B should "do the right thing" (confusing structural and mathematical equality)
2. Many calculations in sympy are cached and the cache can swap A and B if A == B.
3. There are a lot of algorithms that make heavy use of ==.

The issues are the same elsewhere though: gratuitously making objects compare equal with == is a bad idea unless you are happy to substitute one for the other. Otherwise what is the purpose of having them compare equal in the first place? Oscar

On Wed, May 6, 2020 at 10:26 PM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Sure. But a great many things I calculate are not particularly exact. If I want the mean of about a hundred numbers that are each somewhere in the interval [1, 1e6], I'm probably not very interested in 1 ulp errors in 64-bit floating point. And when I *do* care about being exact, I can either cast the arguments to the appropriate type or raise an exception for the unexpected type. If my function deals with primes of thousands of digits, int is more appropriate. But maybe I want a Decimal of some specific precision. Or a Fraction. Or maybe I want to use gmpy as an external type for greater precision. If it's just `x = myfavoritetype(x)` as the first line of the function, that's easy to do.
Yeah, sometimes. But not nearly as much of an attractive nuisance as using `==` between two floating point numbers rather than math.isclose() or numpy.isclose(). My students trip over `(0.1+0.2)+0.3 == 0.1+(0.2+0.3)` a lot more often than they trip over `1.0 == 1`.
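
The comparison that trips students up can be reproduced with the standard library alone (variable names are illustrative):

```python
import math

left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

print(left == right)              # False: float addition is not associative
print(math.isclose(left, right))  # True: equal within the default tolerance
print(1.0 == 1)                   # True
```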

On Thu, May 7, 2020 at 12:26 PM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
I most certainly DO care about accurate integer calculations, which is one of the reasons I'm very glad to have separate int and float types (ahem, ECMAScript, are you eavesdropping here?). In any situation where I would consider them equivalent, it's actually the float that I want (it's absolutely okay if I have to explicitly truncate a float to int if I want to use it in that context), so the only way they'd not be equivalent is if the number I'm trying to represent actually isn't representable. Having to explicitly say "n + 0.0" to force it to be a float isn't going to change that, so there's no reason to make that explicit. For the situations where things like fsum are important, it's great to be able to grab them. For situations where you have an integer number of seconds and want to say "delay this action by N seconds" and it wants a float? It should be fine accepting an integer.
Definitely not. I'm just arguing against your notion that equality should ONLY be between utterly equivalent things. It's far more useful to allow more things to be equal. ChrisA

On 7/05/20 1:07 pm, David Mertz wrote:
It's not much use for deciding whether two things *should* be equal, though, because whatever your opinion on the matter, you can come up with a set of functions that satisfy it and then say "those are the kinds of functions I mean". Also, as a definition of equality it seems somewhat circular, since if you're not sure whether x == y, you may be equally uncertain whether f(x) == f(y) for some f, x, y. -- Greg

On Thu, 7 May 2020 at 08:54, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
It's not so much a definition of equality as a consistency requirement. The contrapositive can be very clear: if you already know that f(x) and f(y) do different things or return unequal objects then the question of whether x == y is answered. It's important though that it's not just about equality of return types: when you carry the principle over from maths to programming then you need to consider non-pure functions, IO, exceptions being raised etc. In simple situations it is nice to be able to duck-type over lists and tuples but in practice it has to be done carefully by sticking to the sequence or iterable interfaces precisely or by coercing to a known type at the entry points of your code. Once you have a large codebase with lots of objects flying around internally and you no longer know whether anything is a list or a tuple (or a set...) any more it's just a mess. Oscar

On Sat, May 02, 2020 at 05:12:58AM -0000, Ahmed Amr wrote:
I'm going to throw out a wild idea (actually not that wild :-) that I'm sure people will hate for reasons I shall mention afterwards. Perhaps we ought to add a second "equals" operator? To avoid bikeshedding over syntax, I'm initially going to use the ancient 1960s Fortran syntax and spell it `.EQ.`. (For the avoidance of doubt, I know that syntax will not work in Python because it will be ambiguous. That's why I picked it -- it's syntax that we can all agree won't work, so we can concentrate on the semantics not the spelling.)

We could define this .EQ. operator as *sequence equality*, defined very roughly as:

    def .EQ. (a, b):
        return len(a) == len(b) and all(x==y for x, y in zip(a, b))

(Aside: if we go down this track, this could be a justification for zip_strict to be a builtin; see the current thread(s) on having a version of zip which strictly requires its input to be equal length.)

The precise details of the operator are not yet clear to me, for instance, should it support iterators or just Sized iterables? But at the very least, it would support the original request:

    [1, 2, 3] .EQ. (1, 2, 3)  # returns True

The obvious operator for this would be `===` but of course that will lead to an immediate and visceral reaction "Argghhh, no, Javascript, do not want!!!" :-) Another obvious operator would be a new keyword `eq` but that would break any code using that as a variable.

But apart from the minor inconveniences that:

- I don't know what this should do in detail, only vaguely;
- and I have no idea what syntax it should have

what do people think of this idea?

-- Steven
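
Since `.EQ.` is deliberately not valid syntax, the rough definition above can be tried out as an ordinary function. The name `sequence_equal` is a placeholder, not a proposed spelling.

```python
def sequence_equal(a, b):
    """Function form of the rough `.EQ.` definition: same length, equal items."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


print(sequence_equal([1, 2, 3], (1, 2, 3)))      # True: the original request
print(sequence_equal([1, 2], (1, 2, 3)))         # False: lengths differ
print(sequence_equal("abc", ["a", "b", "c"]))    # True: strings count as sequences here
print(sequence_equal({1: "a"}, [1]))             # True: a dict is iterated by key
```

The last two results show that the rough definition accepts more than lists and tuples, so the precise semantics would still need to be decided.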

This reminded me of another recent message so I decided to find that and link it here: https://mail.python.org/archives/list/python-ideas@python.org/message/7ILSYY... It seemed like a more useful thing to do before I discovered that you wrote that too... On Thu, May 7, 2020 at 11:20 AM Steven D'Aprano <steve@pearwood.info> wrote:

On Thu, 7 May 2020 19:11:43 +1000 Steven D'Aprano <steve@pearwood.info> wrote:
Equality and its awkward cousin equivalence are slippery slopes. Just sequences? That (admittedly rough) function returns true for certain mapping arguments. What about case-insensitive string matching? Is that more common than comparing (or wanting to compare) arbitrary sequences? What about an operator for normalized (in the Unicode sense of the word), case-insensitive string comparison? (There's a precedent: Common Lisp's equalp function does case-insensitive string matching.)

The strongest equality is the "is" operator, and then the == operator, and ISTM that you're now extending this idea to another class of equivalency. The very far ends of that scale are glossing over American vs. British spellings (are "color" and "colour" in some sense equal?), or even considering two functions "the same" if they produce the same outputs for the same inputs.

One of Python's premises and strengths is strong typing; please don't start pecking away at that. Do beginners expect that [1, 2, 3] == (1, 2, 3)? No. Do experts expect that [1, 2, 3] == (1, 2, 3)? No. So who does? Programmers working on certain applications, or with multiple [pre-existing] libraries, or without a coherent design. These all seem like application level (or even design level) problems, or maybe a series of dunder methods / protocols to define various levels of equivalence (the ability of my inbox and my brain to handle the resulting bikeshedding notwithstanding).

YMMV. Just my thoughts.

-- “Atoms are not things.” – Werner Heisenberg Dan Sommers, http://www.tombstonezero.net/dan

On Thu, May 07, 2020 at 06:04:13AM -0400, Dan Sommers wrote:
*shrug* As I point out later in my post, I don't know whether it should be just sequences. Maybe it should be any iterable, although checking them for equality will necessarily consume them. But *right now* the proposal on the table is to support list==tuple comparisons, which this would do. (For an especially vague definition of "do" :-)
What about case-insensitive string matching?
That can be a string method, since it only needs to operate on strings.
The strongest equality is the "is" operator
Please don't encourage the conceptual error of thinking of `is` as *equality*, not even a kind of equality. It doesn't check for equality, it checks for *identity* and we know that there is at least one object in Python where identical objects aren't equal:

    py> from math import nan
    py> nan is nan
    True
    py> nan == nan
    False

[...]
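
The same NaN behaviour in script form, together with the related container shortcut (list comparison and membership test identity before equality). Variable names are illustrative.

```python
from math import nan

identical = nan is nan           # the same object
equal = nan == nan               # but not equal to itself

# Containers check identity first, so the same NaN object "matches" itself.
same_object_lists = [nan] == [nan]
contained = nan in [nan]

# Two distinct NaN objects are neither identical nor equal.
other = float("nan")
distinct_lists = [nan] == [other]

print(identical, equal)                              # True False
print(same_object_lists, contained, distinct_lists)  # True True False
```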
The very far ends of that scale are glossing over American vs. British spellings (are "color" and "colour" in some sense equal?),
YAGNI. The proposal here is quite simple and straightforward, there is no need to over-generalise it to the infinite variety of possible equivalencies that someone might want. People can write their own functions. It is only that wanting to compare two ordered containers for equality of their items without regard to the type of container is a reasonably common and useful thing to do. Even if we don't want list==tuple to return True -- and I don't! -- we surely can recognise that sometimes we don't care about the container's type, only its elements. -- Steven

On Thu, 7 May 2020 21:18:16 +1000 Steven D'Aprano <steve@pearwood.info> wrote:
We'd better agree to disagree on this one.
YAGNI is how I feel about an operator that compares sequences element by element. People can write their own functions. :-) Or add your .EQ. function to the standard library (or even to builtins, and no, I don't have a good name).
Do "reasonably common," "useful," and "sometimes" meet the bar for a new operator? (That's an honest question and not a sharp stick.) FWIW, I agree: list != tuple. When's the last time anyone asked for the next element of a tuple? (Okay, if your N-tuple represents a point in N-space, then you might iterate over the coordinates in order to discover a bounding box.) Dan -- “Atoms are not things.” – Werner Heisenberg Dan Sommers, http://www.tombstonezero.net/dan

On Thu, May 07, 2020 at 11:04:16AM -0400, Dan Sommers wrote:
Why? In what way is there any room for disagreement at all? This isn't a matter of subjective opinion, like what's the best Star Wars film or whether pineapple belongs on pizza. This is a matter of objective fact, like whether Python strings are Unicode or not. Whatever we might feel about equality and identity in the wider philosophical sense, in the *Python programming sense* the semantic meaning of the two operators are orthogonal:

* some equal objects are not identical;
* and some identical objects are not equal.

It is a matter of fact that in Python `is` tests for object identity, not equality:

https://docs.python.org/3/reference/expressions.html#is-not

If you wish to agree with Bertrand Meyer that reflexivity of equality is one of the pillars of civilization:

https://bertrandmeyer.com/2010/02/06/reflexivity-and-other-pillars-of-civili...

and therefore Python gets equality wrong, you are welcome to that opinion, but whether we like it or not equality in Python is not necessarily reflexive and as a consequence objects may be identical (i.e. the same object) but not equal. Float and Decimal NANs are the most obvious examples.

You don't even need to look at such exotic objects as NANs to see that `is` does not test for equality. None of these will return True:

    [] is []
    1.5 is Fraction(3, 2)
    (a := {}) is a.copy()

even though the operands are clearly equal. [...]
YAGNI is how I feel about an operator that compares sequences element by element.
Remember that list-to-list and tuple-to-tuple already perform the same sequence element-by-element comparison. All this proposal adds is *duck-typing* to the comparison, for when it doesn't matter what the container type is, you care only about the values in the container. Why be forced to do a possibly expensive (and maybe very expensive!) manual coercion to a common type just to check the values for equality element by element, and then throw away the coerced object? If you have ever written `a == list(b)` or similar, then You Already Needed It :-)
True, but there are distinct advantages to operators over functions for some operations. See Guido's essay: https://neopythonic.blogspot.com/2019/03/why-operators-are-useful.html
It depends on how common and useful, and how easy it is to find a good operator. It might be a brilliant idea stymied by lack of a good operator. We might be forced to use a function because there are no good operators left any more, and nobody wants Python to turn into Perl or APL.
FWIW, I agree: list != tuple. When's the last time anyone asked for the next element of a tuple?
Any time you have written:

    for obj in (a, b, c): ...

you are asking for the next element of a tuple. A sample from a test suite I just happen to have open at the moment:

    # self.isprime_functions is a tuple of functions to test
    for func in self.isprime_functions:
        ...

    for a in (3, 5, 6):
        self.assertFalse(sqrt_exists(a, 7))

    for a in (2, 6, 7, 8, 10):
        self.assertFalse(sqrt_exists(a, 11))

-- Steven

On Fri, May 8, 2020 at 1:06 PM Steven D'Aprano <steve@pearwood.info> wrote:
You yourself introduced, speculatively, the idea of another equality operator, .EQ., that would be "equal in some sense not captured by '=='". I just posted another comment where I gave function names for six plausibly useful concepts of "equality" ... well, technically, equivalence-for-purpose. The distinction you make seems both pedantic and factually wrong. More flat-footed still is "equal objects are ones whose .__eq__() method returns something truthy." It doesn't actually need to define any of the behaviors we think of as equality/equivalence. I was going to write a silly example of e.g. throwing a random() into the operation, but I don't think I have to for the point to be obvious.

Both '==' and 'is' are ways of saying equivalent-for-a-purpose. For that matter, so is math.isclose() or numpy.allclose(). Or those json-diff libraries someone just linked to. Given that different Python implementations will give different answers for 'some_int is some_other_int' where they are "equal" in an ordinary sense, identity isn't anything that special in most cases. Strings are likewise sometimes cached (but differently by version and implementation).

The only cases where identity REALLY has semantics I would want to rely on are singletons like None and True, and I guess for custom mutable objects when you want to make sure which state is separated versus shared. Well, OK, I guess lists are an example of that already for the same reason. For non-singleton immutables, identity is not really a meaningful thing. I mean, other than in a debugger or code profiler, or something special like that. I honestly do not know whether, e.g. '(1, "a", 3.5) is (1, "a", 3.5)'. I'll go try it, but I won't be sure whether that answer will be consistent for every implementation, version, and even runtime.

So I did try it. I did not necessarily expect these particular results. Moreover, I have a hunch that with PyPy JIT, something similar might actually give different answers at different points when the same line was encountered in a running interpreter. Not this example, but something else that might cache values only later. I haven't done anything sneaky with the version at those paths. They are all what the environment name hints they should be. PyPy is at 3.6, which is the latest version on conda-forge.

    810-tmp % $HOME/miniconda3/envs/py2.7/bin/python -c 'print((1, "a", 3.5) is (1, "a", 3.5))'
    False
    811-tmp % $HOME/miniconda3/envs/py3.4/bin/python -c 'print((1, "a", 3.5) is (1, "a", 3.5))'
    False
    812-tmp % $HOME/miniconda3/envs/py3.8/bin/python -c 'print((1, "a", 3.5) is (1, "a", 3.5))'
    <string>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
    True
    813-tmp % $HOME/miniconda3/envs/pypy/bin/python -c 'print((1, "a", 3.5) is (1, "a", 3.5))'
    True
    814-tmp % $HOME/miniconda3/envs/py1/bin/python -c 'print (1, "a", 3.5) is (1, "a", 3.5)'
    0

On Sat, May 9, 2020 at 4:17 AM Alex Hall <alex.mojaki@gmail.com> wrote:
I think you're more seeing the module compilation optimizations here. Inside a single compilation unit (usually a module), constants will often be shared. So, for instance:
But if you do those lines individually at the REPL, you'll get False. Of course, a compliant Python interpreter is free to either collapse them or keep them separate, but this optimization helps to keep .pyc file sizes down, for instance. ChrisA
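A minimal sketch of that effect, using only the `compile` and `exec` built-ins: the same tuple literal is compiled once inside a single compilation unit and once in two separate units. The names `src`, `ns`, `ns1` and `ns2` are illustrative. Whether either identity check comes out true is an implementation detail, so the only result that can be relied on is the equality check.

```python
# One compilation unit: identical constant literals may be shared.
src = 'a = (1, "a", 3.5)\nb = (1, "a", 3.5)\nsame = a is b\n'
ns = {}
exec(compile(src, "<one-unit>", "exec"), ns)
same_unit = ns["same"]

# Two separate compilation units: each gets its own constants table,
# so sharing is less likely (but still permitted).
ns1, ns2 = {}, {}
exec(compile('a = (1, "a", 3.5)', "<unit-1>", "exec"), ns1)
exec(compile('b = (1, "a", 3.5)', "<unit-2>", "exec"), ns2)
separate_units = ns1["a"] is ns2["b"]

# Equality, unlike identity, does not depend on the implementation.
equal = ns1["a"] == ns2["b"]

print(same_unit, separate_units, equal)
```

Only the last value is guaranteed; the first two may differ between interpreters and versions.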

On Fri, May 08, 2020 at 01:26:05PM -0400, David Mertz wrote:
The distinction you make seems both pedantic and factually wrong.
Which distinction are you referring to? The one between `is` and `==`? And in what way is it factually wrong?
More flat-footed still is "equal objects are ones whose .__eq__() method returns something truthy."
Nevertheless, flat-footed or not, that is broadly the only meaning of equality that has any meaning in Python. Two objects are equal if, and only if, the `==` operator returns true when comparing them. That's what equality means in Python! (There are a few nuances and complexities to that when it comes to containers, which may short-cut equality tests with identity tests for speed.)
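The container nuance mentioned in the parenthesis can be seen with a NaN, which is not equal to itself. This is a small illustration, not part of the original message; the variable names are made up for the example.

```python
nan = float("nan")

# A NaN never compares equal, not even to itself.
self_equal = nan == nan

# List equality and membership try identity first, so the *same*
# NaN object is treated as matching inside a container.
same_object_lists = [nan] == [nan]
membership = nan in [nan]

# A different NaN object gets no identity shortcut.
different_object_lists = [nan] == [float("nan")]

print(self_equal, same_object_lists, membership, different_object_lists)
```

So container equality behaves like "x is y or x == y" on each pair of elements, which is why `==` on the elements alone does not fully predict `==` on the containers.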
It doesn't actually need to define any of the behaviors we think of as equality/equivalence.
Indeed. Which is why we cannot require any of those behaviours for the concept of equality in Python.
Both '==' and 'is' are ways of saying equivalent-for-a-purpose.
`==` is the way to say "equal", where equal means whatever the class wants it to mean. If you want to describe that as "equivalent-for-a-purpose", okay. But `is` compares exactly and only "object identity", just as the docs say, just as the implementation, um, implements. That's not an equivalence, at least not in the plain English sense of the word, because an equivalence implies at least the possibility of *distinct* objects being equivalent:
    a is equivalent to b but a is not identical to b
Otherwise why use the term "equivalent" when you actually mean "is the same object"? By definition you cannot have:
    a is identical to b but a is not identical to b
so in this sense `is` is not a form of equivalence, it is just *is*. The mathematical sense of an equivalence relation is different: object identity certainly is an equivalence relation. [...]
Right. Remind me -- why are we talking about identity? Is it relevant to the proposal for a duck-typing container equals operator? [...]
So... only None, and True and False, and other singletons like NotImplemented, and custom mutable objects, and builtin mutable objects like list and dict and set, and typically for classes, functions and modules unless you're doing something weird. Okay.
For non-singleton immutables, identity is not really a meaningful thing.
It's of little practical use except to satisfy the caller's curiosity about implementation details. -- Steven

On Sat, 9 May 2020 03:01:15 +1000 Steven D'Aprano <steve@pearwood.info> wrote:
I believe that the "is" operator is a test for some kind of equality, and you apparently don't.
Section 6.10 is entitled Comparisons, and lists both "is" and "==" as comparison operators. I admit that my use of the word "strongest" (to describe the "is" operator) and my conceptual ordering of different kinds of equality fails in the light of NaNs. Curse you, IEEE Floating Point! :-) Then again, that same documentation states "User-defined classes that customize their comparison behavior should follow some consistency rules, if possible." One of the consistency rules is "Equality comparison should be reflexive. In other words, identical objects should compare equal," and that rule is summarized as "x is y implies x == y." So I'm not the only one who thinks of "is" as a kind of equality. :-)
The OP wants [1, 2, 3] == (1, 2, 3) to return True, even though the operands are clearly not equal.
My mistake. I should have said "... compares arbitrary sequences of varying types ..." and not just "sequences."
Then I'll write a function that iterates over both sequences and compares the pairs of elements. There's no need to coerce one or both complete sequences.
If you have ever written `a == list(b)` or similar, then You Already Needed It :-)
I don't recall having written that. I do end up writing 'a == set(b)' when a is a set and b is a list, rather than building b as a set in the first place, but sets aren't sequences.
I have been known to write: for x in a, b, c: (without the parentheses), usually in the REPL, but only because it's convenient and it works. In other programming languages that don't allow iteration over tuples, I use lists instead.

On Sat, May 9, 2020 at 4:43 AM Dan Sommers <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
The documentation doesn't say that "is" represents equality, but only that, in general, an object should be equal to itself. Identity is still a completely separate concept to equality. There's a concept of "container equality" that is expressed as "x is y or x == y", but that's still a form of equality check. "x is y" on its own is not an equality check. It's an identity check. Obviously it's a comparison, but so are many other things :) ChrisA

On 08.05.20 19:01, Steven D'Aprano wrote:
Initially I assumed that the reason for this new functionality was concerned with cases where the types of two objects are not precisely known and hence instead of converting them to a common type such as list, a direct elementwise comparison is preferable (that's probably uncommon though). Instead in the case where two objects are known to have different types but nevertheless need to be compared element-by-element, the performance argument makes sense of course. So as a practical step forward, what about providing a wrapper type which performs all operations elementwise on the operands. So for example:
    if all(elementwise(chars) == string): ...
Here the `elementwise(chars) == string` part returns a generator which performs the `==` comparison element-by-element. This doesn't perform any length checks yet, so as a bonus one could add an `all` property:
    if elementwise(chars).all == string: ...
This first checks the lengths of the operands and only then compares for equality. This wrapper type has the advantage that it can also be used with any other operator, not just equality. Here's a rough implementation of such a type:
    import functools
    import itertools
    import operator

    class elementwise:
        def __init__(self, obj, *, zip_func=zip):
            self.lhs = obj
            self.zip_func = zip_func

        def __eq__(self, other):
            return self.apply_op(other, op=operator.eq)

        def __lt__(self, other):
            return self.apply_op(other, op=operator.lt)

        ...  # define other operators here

        def apply_op(self, other, *, op):
            return self.make_generator(other, op=op)

        def make_generator(self, other, *, op):
            return itertools.starmap(op, self.zip_func(self.lhs, other))

        @property
        def all(self):
            zip_func = functools.partial(itertools.zip_longest, fillvalue=object())
            return elementwise_all(self.lhs, zip_func=zip_func)

    class elementwise_all(elementwise):
        def apply_op(self, other, *, op):
            try:
                length_check = len(self.lhs) == len(other)
            except TypeError:
                length_check = True
            return length_check and all(self.make_generator(other, op=op))

On Sat, May 9, 2020 at 11:57 AM Dominik Vilsmeier <dominik.vilsmeier@gmx.de> wrote:
Now `==` has returned an object that's always truthy, which is pretty dangerous.
This is now basically numpy.
```
In[14]: eq = numpy.array([1, 2, 3]) == [1, 2, 4]
In[15]: eq
Out[15]: array([ True,  True, False])
In[16]: eq.all()
Out[16]: False
In[17]: eq.any()
Out[17]: True
In[18]: bool(eq)
Traceback (most recent call last):
...
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
```
I've used numbers instead of strings because numpy treats strings as units instead of iterables for this kind of purpose, so you'd have to do some extra wrapping in lists to explicitly ask for character comparisons.

On 09.05.20 12:18, Alex Hall wrote:
That can be resolved by returning a custom generator type which implements `def __bool__(self): raise TypeError('missing r.h.s. operand')`.
Actually I took some inspiration from Numpy but the advantage is of course not having to install Numpy. The thus provided functionality is only a very small subset of what Numpy provides.

On 09.05.20 14:16, Dominik Vilsmeier wrote:
After reading this again, I realized the error message is nonsensical in this context. It should be rather something like: `TypeError('The truth value of an elementwise comparison is ambiguous')` (again taking some inspiration from Numpy).
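A rough sketch of that idea, assuming a small result wrapper rather than the generator from the earlier message: the hypothetical `ElementwiseResult` class and `elementwise_eq` helper below are names invented here, not part of any library. The result can be iterated (so `all()` and `any()` work) but refuses to be used directly as a truth value.

```python
import operator


class ElementwiseResult:
    """Iterable of per-element results with no truth value of its own."""

    def __init__(self, results):
        self._results = results

    def __iter__(self):
        return iter(self._results)

    def __bool__(self):
        raise TypeError(
            'The truth value of an elementwise comparison is ambiguous')


def elementwise_eq(lhs, rhs):
    # Lazily compare corresponding elements; no length check is done here.
    return ElementwiseResult(map(operator.eq, lhs, rhs))


results = list(elementwise_eq("abc", ["a", "b", "x"]))
print(results)

try:
    if elementwise_eq("abc", "abc"):
        pass
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

Because `all()` only iterates its argument, `all(elementwise_eq(a, b))` still works, while an accidental `if elementwise_eq(a, b):` fails loudly instead of always being true.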

On May 9, 2020, at 02:58, Dominik Vilsmeier <dominik.vilsmeier@gmx.de> wrote:
There’s an obvious use for the .all, but do you ever have a use for the elementwise itself? When do you need to iterate all the individual comparisons? (In numpy, an array of bools has all kinds of uses, starting with indexing or selecting with it, but I don’t think any of them are doable here.) And obviously this would be a lot simpler if it was just the all object rather than the elementwise object—and even a little simpler to use: element_compare(chars) == string (In fact, I think someone submitted effectively that under a different name for more-itertools and it was rejected because it seemed really useful but more-itertools didn’t seem like the right place for it. I have a similar “lexicompare” in my toolbox, but it has extra options that YAGNI. Anyway, even if I’m remembering right, you probably don’t need to dig up the more-itertools PR because it’s easy enough to redo from scratch.)

On 09.05.20 22:16, Andrew Barnert wrote:
There's probably not much use for the elementwise iterator itself. So one could use `elementwise` as a namespace for `elementwise.all(chars) == string` and `elementwise.any(chars) == string` which automatically reduce the elementwise comparisons and the former also performs a length check prior to that. This would still leave the option of having `elementwise(x) == y` return an iterator without reducing (if desired).

On May 9, 2020, at 13:24, Dominik Vilsmeier <dominik.vilsmeier@gmx.de> wrote:
But do you have any use for the .any? Again, it’s useful in NumPy, but would any of those uses translate? If you’re never going to use elementwise.any, and you’re never going to use elementwise itself, having elementwise.all rather than just making that the callable is just making the useful bit a little harder to access. And it’s definitely complicating the implementation, too. If you have a use for the other features, that may easily be worth it, but if you don’t, why bother? I took my lexicompare, stripped out the dependency on other helpers in my toolbox (which meant rewriting < in a way that might be a little slower; I haven’t tested) and the YAGNI stuff (like trying to be “view-ready” even though I never finished my views library), and posted it at https://github.com/abarnert/lexicompare (no promises that it’s stdlib-ready as-is, of course, but I think it’s at least a useful comparison point here). It’s pretty hard to beat this for simplicity:
    from functools import total_ordering
    from itertools import zip_longest

    @total_ordering
    class _Smallest:
        # Compares less than anything; used to pad the shorter operand.
        def __lt__(self, other):
            return True

    @total_ordering
    class lexicompare:
        def __new__(cls, it):
            self = super(lexicompare, cls).__new__(cls)
            self.it = it
            return self
        def __eq__(self, other):
            return all(x==y for x,y in zip_longest(self.it, other, fillvalue=object()))
        def __lt__(self, other):
            for x, y in zip_longest(self.it, other, fillvalue=_Smallest()):
                # The first pair that differs decides the ordering.
                if x < y:
                    return True
                elif y < x:
                    return False
            return False

On Thu, May 07, 2020 at 10:44:01PM +1200, Greg Ewing wrote:
Yes, but the *human readers* won't. You know that people will write things like: spam.EQ.ham and then nobody will know whether that means "call the .EQ. operator on operands spam and ham" or "lookup the ham attribute on the EQ attribute of spam" without looking up the parsing rules. Let's not turn into Ruby: https://lucumr.pocoo.org/2008/7/1/whitespace-sensitivity/ -- Steven

Why use "." which has clear syntax problems? This can already be done in current Python (this was linked to in a previous thread about something else) using a generic solution if you change the syntax: https://pypi.org/project/infix/ You could write it as |EQ|, ^EQ^, ... and have it in its own Pypi package. Not sure what IDEs think of this package, they probably hate it... On Thu, 7 May 2020 at 10:18, Steven D'Aprano <steve@pearwood.info> wrote:

On 07.05.20 11:11, Steven D'Aprano wrote:
But why do we even need a new operator when this simple function does the job (at least for sized iterables)? How common is it to compare two objects where you cannot determine whether one or the other is a tuple or a list already from the surrounding context? In the end these objects must come from somewhere and usually functions declare either list or tuple as their return type. Since for custom types you can already define `__eq__` this really comes down to the builtin types, among which the theoretical equality between tuple and list has been debated in much detail but is it used in practice?

On Thu, May 07, 2020 at 03:43:23PM +0200, Dominik Vilsmeier wrote:
Maybe it doesn't need to be an operator, but operators do have a big advantage over functions: http://neopythonic.blogspot.com/2019/03/why-operators-are-useful.html On the other hand we only have a limited number of short symbols available in ASCII, and using words as operators reduces that benefit.
Never, because we can always determine whether something is a list or tuple by inspecting it with type() or isinstance(). But that's missing the point! I don't care and don't want to know if it is a tuple or list, I only care if it quacks like a sequence of some kind. The use-case for this is for when you want to compare elements without regard to the type of the container they are in. This is a duck-typing sequence element-by-element equality test. If you have ever written something like any of these:
    list(a) == list(b)
    tuple(a) == b
    ''.join(chars) == mystring
    all(x==y for x,y in zip(a, b))
then this proposed operator might be just what you need. -- Steven
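For comparison, the same test can be written today as a plain function. This is only a sketch under stated assumptions: `elements_equal` and `_MISSING` are hypothetical names chosen for the example, and the identity shortcut mirrors what list and tuple equality already do.

```python
from itertools import zip_longest

_MISSING = object()  # private sentinel marking "one side ran out"


def elements_equal(a, b):
    """Compare two iterables element by element, ignoring container type."""
    for x, y in zip_longest(a, b, fillvalue=_MISSING):
        if x is _MISSING or y is _MISSING:
            return False  # different lengths
        if x is not y and x != y:
            return False
    return True


print(elements_equal((1, 2, 3), [1, 2, 3]))
print(elements_equal(["f", "o", "o"], "foo"))
print(elements_equal([1, 2], [1, 2, 3]))
```

Unlike `list(a) == list(b)`, this does not build a copy of either operand and stops at the first mismatch. Unlike a bare `zip`, it reports False rather than silently ignoring extra elements.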

On Fri, 8 May 2020 23:10:05 +1000 Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, May 07, 2020 at 03:43:23PM +0200, Dominik Vilsmeier wrote:
To rephrase Dominik's question slightly, how often do you have a block of code with two sequences of unknown origin? Sure, I can *hypothesize* f(x, y) where x and y don't have to be anything more specific than sequences. But unless I'm actually writing .EQ., there's some code inside f that builds x or y, or calls some other function to obtain x or y, and then I know at least one of the types. You often ask for real world code that would be simpler or easier to read or maintain if such-and-such feature existed. The OP never posted any such thing; do you have any specific code in mind?
Ever? Maybe. Do I need help from a new operator or the standard library when it comes up? No, not really. And except for that join/mystring example, I find all of these examples incredibly obvious and simple to read (although looking up a new function or operator isn't onerous, and I often learn things when that happens). -- “Atoms are not things.” – Werner Heisenberg Dan Sommers, http://www.tombstonezero.net/dan

On Fri, May 8, 2020 at 4:46 PM Henk-Jaap Wagenaar < wagenaarhenkjaap@gmail.com> wrote:
Steven mentioned that originally: We could define this .EQ. operator as *sequence equality*, defined very
But since you probably want these expressions to evaluate to false rather than raise an exception when the lengths are different, a strict zip is not appropriate.

Here's an example you might want to consider:
    >>> from collections import namedtuple
    >>> Point = namedtuple('Point', ['x', 'y'])
    >>> Point(1, 2)
    Point(x=1, y=2)
    >>> Point(1, 2) == (1, 2)
    True
    >>> Polar = namedtuple('Polar', ['r', 'theta'])
    >>> Polar(1, 2)
    Polar(r=1, theta=2)
    >>> Polar(1, 2) == (1, 2)
    True
    >>> Point(1, 2) == Polar(1, 2)
    True
    >>> hash(Point(1, 2)) == hash(Polar(1, 2)) == hash((1, 2))
    True
-- Jonathan

FYI, it does show in my version on gmail and on the mailman version. <https://mail.python.org/archives/list/python-ideas@python.org/message/WJKNLR...> BTW, I think strings do showcase some problems with this idea. .EQ. (as defined by Steven) is not recursive, which I think will be unworkable/unhelpful: ((0, 1), (1, 2)) and ([0, 1], [1, 2]) are not equal under the new operator (or new behaviour of ==, as per the OP), which I think goes completely against the idea in my book. If it were recursive (replace x==y with x == y || x .EQ. y with appropriate error handling), strings would not work as expected (I would say), e.g.: [["f"], "o", "o"] .EQ. "foo" because an element of a string is also a string. Worse though, I guess comparing any two equal-length strings that are not equal: "foo" .EQ. "bar" would crash as it would keep recursing (i.e. strings would have to be special cased). What I do sometimes use/want (more often for casual coding/debugging, not real coding) is something that compares two objects created from JSON/can be made into JSON whether they are the same, sometimes wanting to ignore certain fields or tell you what the difference is. I do not think that could ever be an operator, but having a function that can help with these kinds of recursive comparisons would be great (I guess pytest uses/has such a function because it pretty nicely displays differences in sets, dictionaries and lists which are compared to each other in asserts). On Fri, 8 May 2020 at 16:23, Alex Hall <alex.mojaki@gmail.com> wrote:

On Fri, May 8, 2020 at 5:51 PM Henk-Jaap Wagenaar < wagenaarhenkjaap@gmail.com> wrote:
FYI, it does show in my version on gmail and on the mailman version. <https://mail.python.org/archives/list/python-ideas@python.org/message/WJKNLR...>
Weird, did Ethan's client cut it out?
If we redefined == so that `(0, 1) == [0, 1]`, then it would follow that `((0, 1), (1, 2)) == ([0, 1], [1, 2])`. Similarly if `(0, 1) .EQ. [0, 1]`, then it would follow that `((0, 1), (1, 2)) .EQ. ([0, 1], [1, 2])`.
Yes, strings would have to be special cased. In my opinion this is another sign that strings shouldn't be iterable, see the recent heated discussion at https://mail.python.org/archives/list/python-ideas@python.org/thread/WKEFHT4...
Something like https://github.com/fzumstein/jsondiff or https://pypi.org/project/json-diff/?

On 05/08/2020 09:36 AM, Alex Hall wrote:
On Fri, May 8, 2020 at 5:51 PM Henk-Jaap Wagenaar wrote:
FYI, it does show in my version on gmail and on the mailman version. <https://mail.python.org/archives/list/python-ideas@python.org/message/WJKNLR...>
Weird, did Ethan's client cut it out?
Ah, no. I thought you were replying to the code quote above the .EQ. one. The .EQ. quote was not white-space separated from the text around it and I missed it. -- ~Ethan~

All the discussion following Steven's hypothetical .EQ. operator (yes, not a possible spelling) just seems to drive home to me that what everyone wants is simply a function. Many different notions of "equivalence for a particular purpose" have been mentioned. We're not going to get a dozen different equality operators (even Lisp or Javascript don't go that far). But function names are plentiful. So just write your own:
    has_same_elements(a, b)
    case_insensitive_eq(a, b)
    same_json_representation(a, b)
    allclose(a, b)  # A version of this is in NumPy
    recursively_equivalent(a, b)
    nan_ignoring_equality(a, b)
And whatever others you like. All of these seem straightforwardly relevant to their particular use case (as do many others not listed). But none of them have a special enough status to co-opt the '==' operator or deserve their own special operator.

On Fri, May 08, 2020 at 01:00:48PM -0400, David Mertz wrote:
All of which are red herrings that are completely off-topic for this proposal. This proposal has nothing to do with:
and the only reason I deleted the "recursively equivalent" one is because I don't know what it's supposed to mean. This proposal is a narrow one: it's the same as list or tuple equality, but duck-typed so that the container type doesn't matter. Do lists and tuples do case-insensitive comparisons? No. Then neither does this proposal. Do lists and tuples do JSON-repr comparisons? No. Then neither does this. Do lists and tuples do numeric "within some epsilon" isclose comparisons (e.g. APL fuzzy equality)? Or ignore NANs? No to both of those. Then neither does this proposal. -- Steven

On Fri, May 08, 2020 at 07:52:10PM +0200, Alex Hall wrote:
Would the proposal come with a new magic dunder method which can be overridden, or would it be like `is`?
An excellent question! I don't think there needs to be a dunder. Calling this "sequence-equal": Two sequences are "sequence-equal" if:
- they have the same length;
- for each pair of corresponding elements, the two elements are either equal, or sequence-equal.
The implementation may need to check for cycles (as ordinary equality does). It may also shortcut some equality tests by doing identity tests, as ordinary container equality does. -- Steven
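The definition above can be sketched as a recursive function. This is an illustration under assumptions, not a reference implementation: `sequence_equal` and `_is_nested_sequence` are hypothetical names, strings and bytes are treated as atoms (the special-casing discussed earlier in the thread, since a one-character string contains itself), and cycle detection is left out.

```python
from collections.abc import Sequence


def _is_nested_sequence(obj):
    # Assumption: str/bytes/bytearray are compared as atoms, otherwise
    # "a" vs "b" would recurse forever on their one-character elements.
    return isinstance(obj, Sequence) and not isinstance(
        obj, (str, bytes, bytearray))


def sequence_equal(a, b):
    """True if a and b have equal length and pairwise equal elements."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x is y or x == y:
            continue  # identity shortcut, then ordinary equality
        if (_is_nested_sequence(x) and _is_nested_sequence(y)
                and sequence_equal(x, y)):
            continue  # elements are themselves sequence-equal
        return False
    return True


print(sequence_equal([1, 2, 3], (1, 2, 3)))
print(sequence_equal(((0, 1), (1, 2)), ([0, 1], [1, 2])))
print(sequence_equal("foo", "bar"))
```

A self-referential list would make this sketch recurse without end, which is the cycle check the message says a real implementation may need.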

On Fri, May 8, 2020 at 8:38 PM Steven D'Aprano <steve@pearwood.info> wrote:
The problem with this to me (and I think it's part of what David and others are saying) is that you're proposing additional syntax (for which there's usually a high bar) for the marginal benefit of improving a very specific use case. For comparison, the recent `@` operator is also intended for a very specific use case (matrix multiplication) but it can at least be reused for other purposes by overriding its dunder method. On top of that, we can see very clearly how the arguments in Guido's essay on operators applied to this case, with clear examples in https://www.python.org/dev/peps/pep-0465/#why-should-matrix-multiplication-b.... That doesn't apply so well to .EQ. as using `==` twice in a single expression isn't that common, and any specific flavour like .EQ. is even less common. `list(a) == list(b)` or `sequence_equal(a, b)` is suboptimal for visual mental processing, but it's still fine in most cases. I would be more supportive of some kind of 'roughly equals' proposal (maybe spelt `~=`) which could be overridden and did sequence equality, case insensitive string comparison, maybe approximate float comparison, etc. But even that has marginal benefit and I agree with the objections against it, particularly having 3 operators with similar equalish meanings. Perhaps a better alternative would be the ability to temporarily patch `==` with different meanings. For example, it could be nice to write in a test: with sequence_equals(): assert f(x, y) == f(y, x) == expected instead of: assert list(f(x, y)) == list(f(y, x)) == list(expected) or similarly with equals_ignoring_order(), equals_ignoring_case(), equals_ignoring_duplicates(), equals_to_decimal_places(2), equals_to_significant_figures(3), etc. This could be especially nice if it replaced implicit uses of `==` deeper in code. 
For example, we were recently discussing this function:
```
def zip_equal(*iterables):
    sentinel = object()
    for combo in zip_longest(*iterables, fillvalue=sentinel):
        if sentinel in combo:
            raise ValueError('Iterables have different lengths')
        yield combo
```
`sentinel in combo` is worrying because it uses `==`. For maximum safety we'd like to use `is`, but that's more verbose. What if we could write:
```
def zip_equal(*iterables):
    sentinel = object()
    with is_as_equals():
        for combo in zip_longest(*iterables, fillvalue=sentinel):
            if sentinel in combo:
                raise ValueError('Iterables have different lengths')
            yield combo
```
and under the hood when `in` tries to use `==` that gets converted into `is` to make it safe? That's probably not the most compelling example, but I'm sure you can imagine ways in which `==` is used implicitly that could be useful to override. I'm not married to this idea, it's mostly just fun brainstorming.
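For reference, the "more verbose" identity-only version needs no new language feature. The sketch below swaps the `in` test for an explicit `is` check; the `AlwaysEqual` class is a made-up worst case, included only to show what the identity check protects against.

```python
from itertools import zip_longest


def zip_equal(*iterables):
    sentinel = object()
    for combo in zip_longest(*iterables, fillvalue=sentinel):
        # Identity only: no element's __eq__ is ever consulted.
        if any(item is sentinel for item in combo):
            raise ValueError('Iterables have different lengths')
        yield combo


class AlwaysEqual:
    """Hypothetical object that claims to equal everything."""

    def __eq__(self, other):
        return True


pairs = list(zip_equal([1, 2], "ab"))
print(pairs)

# With `sentinel in combo`, this element could be mistaken for the
# sentinel; with the identity check it passes through untouched.
odd = list(zip_equal([AlwaysEqual()], [1]))
print(len(odd))
```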

On Sat, 9 May 2020 03:39:53 +1000 Steven D'Aprano <steve@pearwood.info> wrote:
This proposal is a narrow one: it's the same as list or tuple equality, but duck-typed so that the container type doesn't matter.
Okay. Good. "Container," however, is a dangerous word in this context. According to https://docs.python.org/3/library/stdtypes.html, lots of things are "containers." Can they all be sequence-equal to each other? Of particular note might be sets, which don't have an inherent order. I am in no way proposing that sequence-equal be extended to cover sets, which by definition can't really be a sequence.

On Fri, May 08, 2020 at 03:12:10PM -0400, Dan Sommers wrote:
All(?) sequences are containers, but not all containers are sequences, so no.
This is a very good question, thank you. I think that this ought to exclude mappings and sets, at least initially. Better to err on the side of caution than to overshoot by adding too much and then being stuck with it. The primary use-case here is for sequences. Comparisons between sets and sequences are certainly possible, but one has to decide on a case-by-case basis what you mean. For example, are these equal? {1, 2} and (1, 1, 2) I don't know and I don't want to guess, so leave it out. -- Steven

On Fri, May 8, 2020 at 1:47 PM Steven D'Aprano <steve@pearwood.info> wrote:
I think you are trying very hard to miss the point. Yes... all of those functions that express a kind of equivalence are different from the OP proposal. But ALL OF THEM have just as much claim to being called equivalence as the proposal does. If we could only extend the '==' operator to include one other comparison, I would not choose the OP's suggestion over those others. Similarly, if '===' or '.EQ.' could only have one meaning, the OP proposal would not be what I would most want. Which is NOT, of course, to say that I think `containers_with_same_contents()` isn't a reasonable function. But it's just that, a function.

On Fri, May 08, 2020 at 03:16:48PM -0400, David Mertz wrote:
So what? Why is this relevant? This is not a proposal for a generalised equivalence relation. If you want one of those feel free to propose a competing idea. (To be pedantic: at least the allclose() one is not an equivalence relation, as it would be possible to have isclose(a, b) and isclose(b, c) but not isclose(a, c). But that's a by-the-by.) Duck-typed sequence-equality requires no specialised equivalence relation. It's exactly the same as existing notions of container equality, except without the type-check on the container. It is a generic operation, not a specialised one like checking for fuzzy numeric close-enoughness, or JSON representations. If you want case insensitive string equality, propose a new string method.
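The pedantic point about transitivity is easy to check with `math.isclose`. The values and the absolute tolerance below are picked purely for illustration.

```python
import math

a, b, c = 1.0, 1.5, 2.0
tolerance = dict(rel_tol=0.0, abs_tol=0.6)

a_close_b = math.isclose(a, b, **tolerance)  # |a - b| = 0.5, within 0.6
b_close_c = math.isclose(b, c, **tolerance)  # |b - c| = 0.5, within 0.6
a_close_c = math.isclose(a, c, **tolerance)  # |a - c| = 1.0, outside 0.6

print(a_close_b, b_close_c, a_close_c)
```

Two "close" steps add up to a gap that is no longer close, so closeness is reflexive and (for `math.isclose`) symmetric, but not transitive.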
Great! Start your own proposal in a new thread then and stop hijacking this one. -- Steven

On Fri, May 8, 2020, 6:39 PM Steven D'Aprano
The OP, with a certain degree of support from you, is asking for changing the meaning of an operator to enshrine one particular equivalence relation as syntactically blessed by the language. I note that that equivalence relation is not more important than numerous other equivalence relations, and hence should simply be a function returning a Boolean answer. I'm certain you understand this, I'm not sure why the facade otherwise. If you think that yes, that has_same_items() really is that much more important, present the case for that rather than irrelevant pedantics. (To be pedantic: at least the allclose() one is not an equivalence
Yes. And moreover, we can have: numpy.isclose(a, b) != numpy.isclose(b, a) The math module had a different approach that guarantees symmetry. Neither is a bug, they are just different. But '==' does not guarantee symmetry or transitivity either. Not even among objects that intend to mean it in more-or-less the ordinary sense. Practicality beats purity. If .isclose() calls things equivalent, then for most purposes the calculation will be fine if you substitute. A strict mathematical equivalence relation is more... Well, strict. But in terms of what programmers usually care about, this is fluff. Fwiw, my proposal is "just write a simple function." I've made that proposal several times in this thread... But I don't think it's exactly PEP in nature.
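To see the asymmetry without installing NumPy, the sketch below re-implements the comparison that the NumPy documentation gives for `isclose`, namely |a - b| <= atol + rtol * |b|, as a plain function. The name `asymmetric_isclose` is hypothetical and the tolerance is exaggerated so the effect is visible; it is a stand-in for the formula, not NumPy's actual code.

```python
import math


def asymmetric_isclose(a, b, rtol=1e-05, atol=1e-08):
    # The relative tolerance is scaled by the second operand only,
    # which is what makes the result depend on argument order.
    return abs(a - b) <= atol + rtol * abs(b)


forward = asymmetric_isclose(100.0, 120.0, rtol=0.18, atol=0.0)   # 20 <= 21.6
backward = asymmetric_isclose(120.0, 100.0, rtol=0.18, atol=0.0)  # 20 <= 18.0

# math.isclose scales by the larger operand, so order does not matter.
sym_forward = math.isclose(100.0, 120.0, rel_tol=0.18)
sym_backward = math.isclose(120.0, 100.0, rel_tol=0.18)

print(forward, backward, sym_forward, sym_backward)
```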

On Fri, May 08, 2020 at 07:24:43PM -0400, David Mertz wrote:
https://mail.python.org/archives/list/python-ideas@python.org/message/IRIOEX... Ahmed is no longer asking for any change to the `==` operator. That's multiple dozens of emails out of date.
I note that that equivalence relation is not more important than numerous other equivalence relations
"More important" according to whose needs? I would agree with you that a string method to do case insensitive comparisons would be very useful. I would certainly use that instead of a.casefold() == b.casefold() especially if there was an opportunity to make it more efficient and avoid copying of two potentially very large strings. But why is that relevant? There is no conflict or competition between a new string method and new operator. We could have both! "Case insensitive string comparisons would be useful" is an argument for case insensitive string comparisons, it's not an argument against an unrelated proposal.
and hence should simply be a function returning a Boolean answer.
Sure, we can always write a function. But for something as fundamental as a type of equality, there is much to be said for an operator. That's why we have operators in the first place, including `==` itself, rather than using the functions from the operator module.
I'm certain you understand this, I'm not sure why the facade otherwise.
Façade, "a showy misrepresentation intended to conceal something unpleasant" (WordNet). Synonyms include deception, fakery, false front, fraud, imposture, insincerity, simulacrum, subterfuge, and trick. I'm sorry to hear that you are *certain* of my state of mind, and even sorrier that you believe I am lying, but I assure you, I truly do believe that these other equivalence relations are not relevant. And here is why:
(1) They require a specialised equivalence relation apart from `==`. Such as math.isclose(), a case insensitive comparison, a JSON comparison.
(2) As such they ought to go into their specialist namespaces:
- case-insensitive string comparisons should be a string method, or at worst, a function in the string module;
- a JSON-comparison probably should go into the json module;
- fuzzy numeric equality should probably go into the math module (and that's precisely where isclose() currently exists).
And hence they are not in competition with this proposal.
(3) Whereas the proposed duck-typing sequence equality relies on the ordinary meaning of equality, applied element by element, ignoring the type of the containers. We can think of this as precisely the same as list equality, or tuple equality, minus the initial typecheck that both operands are lists. If you can understand list equality, you can understand this. You don't have to ask "what counts as close enough? what's JSON?" etc. It's just the regular sequence equality but with ducktyping on containers. It's completely general in a way that the other equivalences aren't.
If you think that these other proposals are worth having, and are more useful, then *make the proposal* and see if you get interest from other people. You said that you would prefer to have a JSON-comparing comparison operator. If you use a lot of JSON, I guess that might be useful. Okay, make the case for that to be an operator! I'm listening. I might be convinced. You might get that operator in 3.10, and Python will be a better language. Just start a new, competing, proposal for it. But if you're not prepared to make that case, then don't use the existence of something you have no intention of ever asking for as a reason to deny something which others do want. "Python doesn't have this hammer, therefore you shouldn't get this screwdriver" is a non sequitur and a lousy argument.
That's what I'm trying to do.
Is this intended as an argument for or against this proposal, or is it another of the "irrelevant pedantics" you just accused me of making? In any case, it is an exaggerated position to take. Among ints, or strings, or floats excluding NANs, `==` holds with all the usual properties we expect:
* x == x for all ints, strings and floats excluding NANs;
* if, and only if, x == y, then y == x;
* and if x == y and y == z, then x == z.
It's only for Python equality in the *general* sense, where the operands could be any arbitrary object, that those properties do not necessarily hold. Since this proposal is for a simple duck-typed sequence version of ordinary Python equality, the same generalisation will apply:
* If the sequences hold arbitrary objects, we cannot necessarily make any claims about the properties of sequence-equality;
* But if you can guarantee that all of the objects are such that the usual properties apply to the `==` operator, then you can say the same about sequence-equality.
In this regard, it is exactly the same as list or tuple equality, except it duck-types the container types.
-- Steven
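To make "list equality minus the container typecheck" concrete, here is a minimal sketch. The name same_items is hypothetical (in the spirit of the placeholder spellings floated in this thread), not an existing builtin, and the `x is y or x == y` test is an assumption intended to mirror the identity shortcut that list and tuple comparison already apply to their items. It only looks one level deep: nested containers are still compared with ordinary `==`.

```python
def same_items(a, b):
    """Hypothetical helper: element-by-element ==, ignoring container type."""
    # Same first step as list equality: different lengths can never be equal.
    if len(a) != len(b):
        return False
    # Ordinary equality on each pair of items; identical objects count as
    # equal, as they do inside list and tuple comparisons.
    return all(x is y or x == y for x, y in zip(a, b))


print((1, 2, 3) == [1, 2, 3])            # False: the container types differ
print(same_items((1, 2, 3), [1, 2, 3]))  # True: same items, same positions
print(same_items((1, 2), [1, 2, 3]))     # False: lengths differ
```

If every item obeys the usual reflexive, symmetric and transitive properties of `==`, this helper inherits them, which is the generalisation described above.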

On Fri, May 8, 2020 at 11:39 PM Steven D'Aprano <steve@pearwood.info> wrote:
"More important" according to whose needs?
I dunno. To mine? To "beginner programmers"? To numeric computation? I can weaken my 'note' to 'purport' if that helps.
(3) Whereas the proposed duck-typing sequence equality relies on the ordinary meaning of equality, applied element by element, ignoring the type of the containers.
I think this one is our main disagreement. I think a meaning for "equality" in which a tuple is equal (equivalent) to a list with the same items inside it is strikingly different from the ordinary meaning of equality. I don't deny that it is sometimes a useful question to ask, but it is a new and different question than the one answered by '==' currently. In my mind, this new kind of equality is MORE DIFFERENT from the current meaning than would be case-folded equivalence of strings, for example.
Actually, this could perfectly well live on the types rather than in the modules. I mean, I could do it today by defining .__eq__() on some subclasses of strings, floats, dicts, etc. if I wanted to. But hypothetically (I'm not proposing this), we could also define new operators .__eq2__(), .__eq3__(), etc. that would be called when Python programmers used the operators `===`, `====`, etc. With these new operators in hand, we might give meanings to these new kinds of equivalence:
    (1, 2, 3) === [1, 2, 3]     # has_same_items()
    "David" === "daviD"         # a.upper() == b.upper()
    "David" ==== "dabit"        # soundex(a, b)
    3.14159265 === 3.14159266   # math.isclose(a, b)
It's completely general in a way that the other equivalences aren't.
Umm... no, it's really not. It's a special kind of equivalence that I guess applies to the Sequence ABC. Or maybe the Collection ABC? But to be really useful, it probably needs to work with things that don't register those ABCs themselves. I would surely expect this to hold as well, if this were a thing:
    (1, 2, 3) === np.array([1, 2, 3])
But what about dicts, which are now ordered, and hence sequence-like? Or dict.keys() if not the dict itself? I'm sure reasonable answers could be decided for questions like that, but this is FAR from "completely general" or a transparent extension of current equality.
-- The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.

On Fri, May 08, 2020 at 04:51:04PM +0100, Henk-Jaap Wagenaar wrote:
The sample implementation I gave was explicitly described as "very roughly". In no way was it intended to be the reference implementation. It was intended to begin the process of deciding on the semantics, not end it.
(1) Ahmed has already accepted that changing `==` is not suitable, so please let us all stop beating that dead horse! `==` is not going to change.
(2) The intention is for
    ((0, 1), [1, 2], 'ab') .EQ. [[0, 1], (1, 2), ['a', 'b']]
to return True, as well as similar examples. On the other hand, this is not just a "flattening equality" operator; this would return False:
    ((0, 1), (1, 2)) .EQ. ((0,), (1, 2, 3))
since (0, 1) and (0,) have different lengths.
Why would that not work?
* ["f"] .EQ. "f" is true since they both have length 1 and their zeroth elements are equal;
* "o" .EQ. "o" is true;
* "o" .EQ. "o" is still true the second time :-)
* so the whole thing is true.
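The nested semantics described above can be sketched roughly as follows. This is only an illustration, not a reference implementation: the function name seq_eq is hypothetical, using collections.abc.Sequence as the test for "is a sequence" is an assumption, and so is the choice to compare two strings directly with `==` (a one-character string contains itself, so descending into it would never bottom out). Cyclic structures are not handled here.

```python
from collections.abc import Sequence


def seq_eq(a, b):
    """Hypothetical sketch of a recursive, container-agnostic .EQ."""
    # Two strings: use plain equality instead of descending into characters.
    if isinstance(a, str) and isinstance(b, str):
        return a == b
    # Two sequences of any type: same length, then pairwise recursive check.
    if isinstance(a, Sequence) and isinstance(b, Sequence):
        return len(a) == len(b) and all(seq_eq(x, y) for x, y in zip(a, b))
    # Anything else falls back to ordinary equality.
    return a == b


print(seq_eq(((0, 1), [1, 2], 'ab'), [[0, 1], (1, 2), ['a', 'b']]))  # True
print(seq_eq(((0, 1), (1, 2)), ((0,), (1, 2, 3))))                   # False
print(seq_eq(["f"], "f"))                                            # True
```

The second call returns False because (0, 1) and (0,) have different lengths, so this is not a flattening comparison.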
Yes. Is that a problem? As I already pointed out, it will also need to handle cycles. For example:
    a = [1, 2]
    a.append(a)
    b = (1, 2, [1, 2, a])
and I would expect that a .EQ. b should be True:
    len(a) == len(b)
    a[0] == b[0]  # 1 == 1
    a[1] == b[1]  # 2 == 2
    a[2] == b[2]  # a == a
so that's perfectly well-defined.
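One possible way to cope with cycles, sketched under stated assumptions: remember which pairs of containers are already being compared, and treat a repeat visit to the same pair as equal so the recursion terminates. The name seq_eq, the _seen bookkeeping, the identity shortcut and the use of collections.abc.Sequence are all illustrative choices, not something settled in this thread.

```python
from collections.abc import Sequence


def seq_eq(a, b, _seen=None):
    """Hypothetical cycle-tolerant sketch of a duck-typed .EQ."""
    if a is b:
        return True  # identical objects count as equal
    if isinstance(a, str) and isinstance(b, str):
        return a == b  # never descend into two strings
    if not (isinstance(a, Sequence) and isinstance(b, Sequence)):
        return a == b  # non-sequences use ordinary equality
    if len(a) != len(b):
        return False
    if _seen is None:
        _seen = {}
    key = (id(a), id(b))
    if key in _seen:
        return True  # this pair is already being compared further up
    _seen[key] = (a, b)  # keep references so the ids stay valid
    return all(seq_eq(x, y, _seen) for x, y in zip(a, b))


a = [1, 2]
a.append(a)
b = (1, 2, [1, 2, a])
print(seq_eq(a, b))  # True
```

Only atoms are compared with `==` here; containers are always walked item by item, because `==` on two distinct self-referential lists would itself recurse without end.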
Feel free to propose that as a separate issue. -- Steven

On 07/05/2020 10:11, Steven D'Aprano wrote:
The biggest argument against a second "equals" operator, however it is spelt, is confusion. Which of these two operators do I want to use for this subtly different question of equality? Even where we have quite distinct concepts like "==" and "is", people still get muddled. If we have "==" and "=OMG=" or whatever, that would just be an accident waiting to happen. Cheers, Rhodri -- Rhodri James *-* Kynesim Ltd

On Thu, May 07, 2020 at 04:42:22PM +0100, Rhodri James wrote:
On 07/05/2020 10:11, Steven D'Aprano wrote:
I don't think so. The confusion with `is` is particularly acute for at least two reasons:
- in regular English it can be a synonym for equals, as in "one and one is two, two and two is four";
- it seems to work sometimes: `1 + 1 is 2` will probably succeed.
If the operator was named differently, we probably wouldn't have many people writing `1 + 1 idem 2` or `1 + 1 dasselbe 2` when they wanted equality. I doubt many people would be confused whether they wanted, let's say, the `==` operator or the `same_items` operator, especially if `1 + 1 same_items 2` raised a TypeError.
-- Steven
participants (24)
- Ahmed Amr
- Alex Hall
- Andrew Barnert
- Antoine Rozo
- Chris Angelico
- Dan Sommers
- David Mertz
- Dominik Vilsmeier
- Eric V. Smith
- Ethan Furman
- Gerrit Holl
- Greg Ewing
- Guido van Rossum
- Henk-Jaap Wagenaar
- jdveiga@gmail.com
- Jonathan Fine
- Oscar Benjamin
- Raymond Hettinger
- Rhodri James
- Richard Damon
- Serhiy Storchaka
- Soni L.
- Steele Farnsworth
- Steven D'Aprano