Currently, the `issubset` and `issuperset` methods of set objects accept arbitrary iterables as arguments. An iterable that is both a subset and superset is, in a sense, "equal" to the set. It would be inappropriate for `==` to return `True` for such a comparison, however, since that would break the `Hashable` contract.
Should sets have an additional method, something like `like(other)`, `issimilar(other)`, or `isequivalent(other)`, that returns `True` for any iterable that contains all of the items in the set and no items that are not in the set? It would therefore be true in the same cases where `<set> == set(other)` or `<set>.issubset(other) and <set>.issuperset(other)` is true.
Hey Steve,
How about set.symmetric_difference()? Does it not do what you want?
Best regards, Bar Harel
On Sun, Mar 22, 2020, 10:03 PM Steve Jorgensen stevej@stevej.name wrote:
> Currently, the `issubset` and `issuperset` methods of set objects accept arbitrary iterables as arguments. [...]

_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/ULQQ7T...
Code of Conduct: http://python.org/psf/codeofconduct/
Bar Harel wrote:
> How about set.symmetric_difference()? Does it not do what you want?

Indirectly, it does, but that returns a set, not a `bool`. It would also, therefore, do more work than necessary to determine the result in many cases.
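A Python implementation of what I'm talking about would be something like the following.
```
def like(self, other):
    found = set()  # distinct items of `other` seen so far
    for item in other:
        if item not in self:
            return False  # `other` has an item that the set lacks
        found.add(item)
    # every item of `other` is in the set; equal counts mean nothing is missing
    return len(found) == len(self)
```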
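To make the intended behaviour concrete, here is a minimal usage sketch of the same logic written as a free function. The name `set_like` is hypothetical and chosen only for this illustration; no such function or method exists on Python's built-in `set`.

```python
def set_like(s, other):
    """Return True if `other` yields exactly the elements of `s`, ignoring repeats.

    Hypothetical helper: it mirrors the proposed method but is not a real set API.
    """
    found = set()
    for item in other:
        if item not in s:
            return False  # stop at the first item that `s` lacks
        found.add(item)
    return len(found) == len(s)


print(set_like({1, 2, 3}, [3, 2, 1, 1]))        # True: same elements, repeats ignored
print(set_like({1, 2, 3}, [1, 2]))              # False: 3 is missing from the list
print(set_like({1, 2, 3}, iter([1, 2, 3, 4])))  # False: 4 is not in the set
```

Unlike `s == set(other)`, this version returns as soon as it meets an item that is not in the set, so it does not always consume the whole iterable.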
On 22/03/2020 19:59, Steve Jorgensen wrote:
> Should sets have an additional method, something like `like(other)`, `issimilar(other)`, or `isequivalent(other)`, that returns `True` for any iterable that contains all of the items in the set and no items that are not in the set? [...]

You worded the above carefully, but it may not be universally obvious that `<set>.isequivalent(other) == True` does not imply `len(<set>) == len(other)` (assuming `other` supports `len()`), e.g.

    <set> = set("a")
    other = list("aa")

I think this point should be documented, lest someone interpret `isequivalent` as "equals".
A similar point holds with the existing `issuperset` method (a set can be shorter than something it is a superset of), but I think there would be more danger of confusion with this new method.
+0 on the proposal
Rob Cliffe
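The length caveat above can be checked directly. This is a small sketch that uses only built-in operations; the variable names follow the example in the message and are otherwise arbitrary.

```python
s = set("a")        # {'a'}
other = list("aa")  # ['a', 'a']

# As collections of distinct elements the two agree...
same_elements = s.issubset(other) and s.issuperset(other)
print(same_elements)    # True
print(s == set(other))  # True

# ...yet their lengths differ, because the list keeps the duplicate.
print(len(s), len(other))  # 1 2
```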
On Sun, 22 Mar 2020 at 20:01, Steve Jorgensen stevej@stevej.name wrote:
> Should sets have an additional method, something like `like(other)`, `issimilar(other)`, or `isequivalent(other)` [...]

What is the practical use case for this? It seems like it would be a pretty rare need, at best.
Paul
On Sun, Mar 22, 2020 at 07:59:59PM -0000, Steve Jorgensen wrote:
> Currently, the `issubset` and `issuperset` methods of set objects accept arbitrary iterables as arguments. An iterable that is both a subset and superset is, in a sense, "equal" to the set. It would be inappropriate for `==` to return `True` for such a comparison, however, since that would break the `Hashable` contract.

I think the "arbitrary iterables" part is a distraction. We are fundamentally talking about a comparison on sets, even if Python relaxes the requirements and also allows one operand to be an arbitrary iterable.
I don't believe that a set A can be both a superset and subset of another set B at the same time. On a Venn Diagram, that would require A to be both completely surrounded by B and B to be completely surrounded by A at the same time, which is impossible.
I think you might be talking about sets which partially overlap:

    A = {1, 2, 3, 4}
    B = {2, 3, 4, 5}

but neither the issubset nor issuperset methods return True in that case:
* A is not a subset of B because 1 is not in B;
* A is not a superset of B because it lacks 5;
* B is not a subset of A because 5 is not in A;
* and B is not a superset of A because it lacks 1.
You are right that it would be inappropriate to return equal, but that has nothing to do with `Hashable`, since sets aren't hashable. They are not equal because, well, they ain't equal :-)

> Should sets have an additional method, something like `like(other)`, `issimilar(other)`, or `isequivalent(other)`, that returns `True` for any iterable that contains all of the items in the set and no items that are not in the set?

That would be equality :-)

    A = {1, 2, 3, 4}
    B = {1, 2, 3, 4}

B contains all of the items in A and no items which are not in A; likewise A contains all the items in B and no items not in B. That makes them equal.
I might be missing something obvious, but I really don't think that `<set>.issubset(other) and <set>.issuperset(other)` can be true.
Steven D'Aprano wrote:
> I don't believe that a set A can be both a superset and subset of another set B at the same time. On a Venn Diagram, that would require A to be both completely surrounded by B and B to be completely surrounded by A at the same time, which is impossible.

Every set is a superset of itself and a subset of itself. A set may not be a "formal" subset or a "formal" superset of itself. `issubset` and `issuperset` refer to standard subsets and supersets, not formal subsets and supersets.
In Python, you can trivially check that…
```
In [1]: {1, 2, 3}.issubset({1, 2, 3})
Out[1]: True

In [2]: {1, 2, 3}.issuperset({1, 2, 3})
Out[2]: True

In [3]: {1, 2, 3}.issubset((1, 2, 3))
Out[3]: True

In [4]: {1, 2, 3}.issuperset((1, 2, 3))
Out[4]: True
```
Paul Moore wrote:
> What is the practical use case for this? It seems like it would be a pretty rare need, at best.

Basically, it is for a sense of completeness. It feels weird that there is a way to check whether an iterable is a subset of a set or a superset of a set but no way to directly ask whether it is equivalent to the set.
Even though the need for it might not be common, I think that the collection of methods makes more sense if a method like this is present.
On Mon, Mar 23, 2020 at 12:03:50AM -0000, Steve Jorgensen wrote:
> Every set is a superset of itself and a subset of itself. A set may not be a "formal" subset or a "formal" superset of itself. `issubset` and `issuperset` refer to standard subsets and supersets, not formal subsets and supersets.

Sorry, I don't understand your terminology "formal" and "standard". I think you might mean "proper" rather than formal? But I don't know what you mean by "standard".
We might have a terminology issue here, since according to Wikipedia there is some dispute over whether or not to include the equality case in subset/superset:
https://en.wikipedia.org/wiki/Subset
For what it is worth, I'm in the school that subset implies proper subset, i.e. A ⊂ B implies that A ≠ B and that sets are *not* subsets (or supersets) of themselves.
To be explicit, A ⊂ B means the same as A ⊊ B not A ⊆ B.
But I can see that people's usage on this varies, so I won't argue that one way or the other is "wrong". So long as we agree on which convention we are using.
So to be clear, you are referring to the improper subset, which in Python we write as the first comparison, not the second:

    py> {1} <= {1}  # like {1}.issubset({1})
    True
    py> {1} < {1}
    False

> In Python, you can trivially check that…
>
>     In [1]: {1, 2, 3}.issubset({1, 2, 3})
>     Out[1]: True

Okay, that's an *improper* subset, since the two sets are equal. That's documented as more-or-less equivalent to `A <= B`:
https://docs.python.org/3/library/stdtypes.html#frozenset.issubset
(The only difference is that the operator version requires actual sets, while the method version accepts any iterable.)
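The difference between the operator and the method can be seen directly. This is a minimal sketch relying only on built-in set behaviour; the variable names are illustrative.

```python
a = {1, 2, 3}
b = [1, 2, 3, 3]  # a list, not a set

# The methods accept any iterable.
print(a.issubset(b))    # True
print(a.issuperset(b))  # True

# The operator insists on a set (or frozenset) on both sides.
operator_error = None
try:
    a <= b
except TypeError as exc:
    operator_error = exc
    print("TypeError:", exc)

# Between real sets, <= is the improper subset test and < the proper one.
print(a <= set(b))  # True
print(a < set(b))   # False
```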
Do you have an example of `A <= B and B <= A` aside from the `A == B` case?
On Mon, Mar 23, 2020 at 12:08:23AM -0000, Steve Jorgensen wrote:
> Basically, it is for a sense of completeness. It feels weird that there is a way to check whether an iterable is a subset of a set or a superset of a set but no way to directly ask whether it is equivalent to the set.

I still don't see what you consider "equivalent" aside from the equality case.
Your request would be much more clear and easy to understand if you had started with concrete examples, especially since the terminology you are using is ambiguous.
> there is a way to check whether an iterable is a subset of a set or a superset of a set but no way to directly ask whether it is equivalent to the set.

A_set == set(an_iterable)
Seems straightforward to me :-)
I see that subset will accept an arbitrary iterable, whereas `__eq__` does not, but I think that’s more because there’s no reason for subset and friends NOT to work on an arbitrary iterable than because there’s a compelling reason they should.
The same is not true for `__eq__`.
-CHB
Steve Jorgensen wrote:
> Basically, it is for a sense of completeness. It feels weird that there is a way to check whether an iterable is a subset of a set or a superset of a set but no way to directly ask whether it is equivalent to the set.

I can't say this has never happened historically, but I've yet to ever see a change proposal accepted simply on the basis of "completeness". There typically has to be some form of concrete, practical use case for the feature. Otherwise, it's quite simply not going to be worth the review and long-term maintenance cost to the limited core CPython development resources.
> Even though the need for it might not be common, I think that the collection of methods makes more sense if a method like this is present.

It's okay if the need isn't very common, but it still typically requires some real use case to be clearly defined. Even if it might be fairly niche, are there any?
(Note that if it's too niche, it's likely more suitable as a 3rd party package.)
On Mar 22, 2020, at 18:54, Steven D'Aprano steve@pearwood.info wrote:
> Do you have an example of `A <= B and B <= A` aside from the `A == B` case?

For mathematical sets, this is either impossible by definition, or impossible by 2-line proof, depending on which definitions you like. For Python sets, presumably you could come up with something pathological involving some kind of “anti-NaN” value that’s equal to everything?
But I don’t think that’s relevant here. You claimed that 'the "arbitrary iterables" part is a distraction', but I think it’s actually the whole point of the proposal. The initial suggestion is that there are lots of iterables that are both a subset and a superset of some set, and only the ones that are sets are equal to the set, and not having a way to test for the ones that aren’t sets is the “missing functionality” that needs to be added for completeness.
And that’s what you’re missing at the end of your previous message: you don’t see “how `<set>.issubset(other) and <set>.issuperset(other)` can be true” if the set and other aren’t equal: it’s true whenever other isn’t a set. And if you need it more concrete:

    >>> {1,2}.issubset((1,2))
    True
    >>> {1,2}.issuperset((1,2))
    True
    >>> {1,2} == (1,2)
    False

So, asking for an example of ones that are sets but still can’t be tested for equality is a non sequitur. The OP didn’t expect there to be any, and isn’t trying to solve that problem even if there are.
As far as I can tell, they’re just trying to add a method isequivalent or iscoextensive or whatever that extends beyond == to handle non-set iterables in the exact same way issubset and issuperset extend beyond <= and >= to handle non-set iterables.
That’s a perfectly coherent request, and of course arbitrary iterables are relevant to it. The question is why anyone would ever need that method. If someone has a use case for it, they should be able to post it easily.
On Sun, Mar 22, 2020 at 6:51 PM Steven D'Aprano steve@pearwood.info wrote:
> We might have a terminology issue here, since according to Wikipedia there is some dispute over whether or not to include the equality case in subset/superset:
>
> https://en.wikipedia.org/wiki/Subset
>
> For what it is worth, I'm in the school that subset implies proper subset [...]

Wikipedia's pedantry notwithstanding, I don't think this is a useful position *when talking about Python sets*, since Python's set's .issubset() method returns True when the argument is the same set:

    >>> {1}.issubset({1})
    True

Or did I miss a wink?
On Sun, Mar 22, 2020 at 09:49:26PM -0700, Guido van Rossum wrote:
> Wikipedia's pedantry notwithstanding, I don't think this is a useful position *when talking about Python sets*, since Python's set's .issubset() method returns True when the argument is the same set:

But Python's subset *operator* returns False when the arguments are equal:

    py> {1} < {1}
    False

and until today I did not realise the operator and method used different definitions. I'm going to have to review some of my code using sets to see if my mistaken understanding that they were the same has introduced some hidden bugs :-(
I'm not going to get into an argument about which definition is correct. I think the differences between the operators and the methods should be better described in the docs, but at least the docs do explicitly note that set.issubset is equivalent to the `<=` operator.
(Not the docstring though. You have to read the docs on the website.)
Today I read Wikipedia's pedantry and learned for the first time that there are people who define "subset" to include equality. I had not come across that before, so I was confused by Steve Jorgensen's talk about sets that are simultaneously subsets and supersets of themselves. But now that I realise he is talking about *equal* sets, it makes sense.
I'm still confused by his position that there are sets (iterables?) which are simultaneously subsets and supersets of each other without being equal, but that's a separate issue for him to clarify.
On Sun, Mar 22, 2020 at 08:55:47PM -0700, Andrew Barnert wrote: [...]
> But I don’t think that’s relevant here. You claimed that 'the "arbitrary iterables" part is a distraction', but I think it’s actually the whole point of the proposal. The initial suggestion is that there are lots of iterables that are both a subset and a superset of some set, and only the ones that are sets are equal to the set, and not having a way to test for the ones that aren’t sets is the “missing functionality” that needs to be added for completeness.

Well I'm glad that you got that out of Steve's posts, because I didn't :-)
Assuming you are correct, isn't that easily done with a type conversion?
A == set(B)
We might argue about the inefficiency of having to build a set only to throw it away, but given that there's no real use-case for this (so far), only a sense of completeness, it might be good enough. Or one could do:
A.issubset(B) and A.issuperset(B)
assuming B isn't an iterator.
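The iterator caveat matters because the first method call uses up its argument. The following is a minimal sketch of that pitfall as observed in CPython; the variable names are illustrative.

```python
A = {1, 2, 3}
B = iter([1, 2, 3, 4])  # a one-shot iterator whose items do not match A

first = A.issubset(B)     # consumes B; every item of A was seen, so True
second = A.issuperset(B)  # B is now exhausted, so this is vacuously True
print(first and second)   # True, although A != {1, 2, 3, 4}

# Converting the iterable once gives the right answer.
print(A == set([1, 2, 3, 4]))  # False
```

With a reusable iterable such as a list or tuple, the two-call form gives the same answer as `A == set(B)`.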
> As far as I can tell, they’re just trying to add a method isequivalent or iscoextensive or whatever that extends beyond == to handle non-set iterables in the exact same way issubset and issuperset extend beyond <= and >= to handle non-set iterables.

If it were a method, set.equals() is the obvious name, since that's what it is actually testing for: set equality, without the conversion to a set.
On 3/23/2020 5:49 AM, Guido van Rossum wrote:
> Wikipedia's pedantry notwithstanding, I don't think this is a useful position *when talking about Python sets*, since Python's set's .issubset() method returns True when the argument is the same set:
>
>     >>> {1}.issubset({1})
>     True
>
> Or did I miss a wink?

The Python way is the mathematically correct way of interpreting the terms "subset" and "superset".
Note that the discussion Steven referred to targets the symbol to be used, ie. whether "⊂" (proper subset) should mean the same as "⊆" (subset or equal). As with everything in math, such details are always defined via definitions in the respective paper or book. This can be confusing for the casual reader, but it's not "right" or "wrong".
Personally, I find the analogy to "<" vs. "≤" very reasonable, but that still doesn't imply anything wrong with Python, since the "proper subset" property would naturally be called ".ispropersubset()" :-)
On Mon, Mar 23, 2020 at 8:00 PM Steven D'Aprano steve@pearwood.info wrote:
> But Python's subset *operator* returns False when the arguments are equal:
>
>     py> {1} < {1}
>     False

    >>> {1} <= {1}
    True

Python gives you two operators :)
ChrisA
On 23/03/2020 09:12, Steven D'Aprano wrote:
> If it were a method, set.equals() is the obvious name, since that's what it is actually testing for: set equality, without the conversion to a set.


    >>> s = set("a")
    >>> t = list("aa")
    >>> s.issubset(t)
    True
    >>> s.issuperset(t)
    True

but it would be misleading IMO to say that s and t are in some sense equal. Explicit conversion to a set is the right way to compare sets. (Perhaps issubset/issuperset should not accept non-set iterables, but that ship has sailed.)
Rob Cliffe
< is the strict subset operator. <= is subset. What more can I say?
On 23/03/20 11:10 pm, M.-A. Lemburg wrote:
> The Python way is the mathematically correct way of interpreting the terms "subset" and "superset".

In maths it's usually more convenient if the simplest terminology includes edge cases. For example, the definition of a powerset would be more wordy if the term "subset" didn't include equality.
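The powerset remark can be illustrated with a short sketch. The helper name `powerset` is an assumption made for this example (it is not a built-in), and it is built on the standard `itertools.combinations`.

```python
from itertools import chain, combinations


def powerset(s):
    """Return the set of all subsets of `s`, each as a frozenset (illustrative helper)."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(c) for c in subsets}


s = {1, 2, 3}
p = powerset(s)
print(len(p))                      # 8, i.e. 2 ** len(s)
print(frozenset(s) in p)           # True: s counts as a subset of itself
print(all(sub <= s for sub in p))  # True: every member passes the <= test
```

If "subset" meant "proper subset", the definition would need a separate clause to bring the set itself back in.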
On Mon, Mar 23, 2020 at 02:05:49PM +0000, Rob Cliffe via Python-ideas wrote:
>     >>> s = set("a")
>     >>> t = list("aa")
>     >>> s.issubset(t)
>     True
>     >>> s.issuperset(t)
>     True
>
> but it would be misleading IMO to say that s and t are in some sense equal.

In *some* sense they are equal:
- every element in s is also in t;
- every element in t is also in s;
- no element in s is not in t;
- no element in t is not in s;
- modulo uniqueness, both s and t have the same elements;
- converting t to a set gives {'a'} which is equal to s.
I don't know that this is an *important* sense, but the OP Steve J isn't wrong to notice it.
I shouldn't need to say this, but for the record I am not proposing and do not want set equality to support lists; nor do I see the need for a new method to perform "equivalent to equality" tests; but if the consensus is that sets should have that method, I would prefer it to be given the simpler name:

    set.superset  # not .equivalent_to_superset
    set.subset    # not .equivalent_to_subset
    set.equals    # not some variation of .equivalent_to_equals

On Mar 23, 2020, at 16:57, Steven D'Aprano steve@pearwood.info wrote:
> [...] if the consensus is that sets should have that method, I would prefer it to be given the simpler name:
>
>     set.superset  # not .equivalent_to_superset
>     set.subset    # not .equivalent_to_subset
>     set.equals    # not some variation of .equivalent_to_equals

The existing methods are named issubset and issuperset (and isdisjoint, which doesn’t have an operator near-equivalent). Given that, would you still want equals instead of isequal or something?
Personally, I don’t like the idea of an “equals” method. Maybe it’s just Scheme/Smalltalk/ObjC/Ruby/etc. flashbacks, but a builtin type having an equals method that’s different from the == operator makes me expect some horrible convention where all types have two or more ways to check different notions of equality, and generally == is stricter than eq is stricter than eql is stricter than equals (although IIRC it’s the other way round in Ruby?), but every time you read any comparison you have to go look at the type’s help to see exactly what “stricter” means for that type.
So, I’d rather have an uglier, more explicit, and more obviously specific-to-set name like iscoextensive. Sure, not everyone will know what “coextensive” means, but people who don’t know will not have any use for a method that means “compare these things for equality as if they were sets even though they may not be sets” in the first place.
Of course that’s assuming _anyone_ has a need for this method, and I suspect you’re right that it’s rare enough (and easy enough to work around) that we don’t need to add anything in the first place.
On Mon, Mar 23, 2020 at 06:03:06PM -0700, Andrew Barnert wrote:
> The existing methods are named issubset and issuperset (and isdisjoint, which doesn’t have an operator near-equivalent). Given that, would you still want equals instead of isequal or something?

Oops! I mean, aha, you passed my test to see if you were paying attention, well done!
*wink*
I would be satisfied by "isequal", if the method were needed.
> So, I’d rather have an uglier, more explicit, and more obviously specific-to-set name like iscoextensive. Sure, not everyone will know what “coextensive” means

That's an unnecessary use of jargon that doesn't add clarity or precision and can only add confusion. Outside of the tiny niche of people trying to prove the foundations of mathematics in first-order logic and set theory, when mathematicians want to say two sets are equal, they say they are equal, they don't use the term “coextensive”.
See, for example, Mathworld, which uses "=" in the usual way:
https://mathworld.wolfram.com/Set.html
and doesn't even have a definition for “coextensive”:
https://mathworld.wolfram.com/search/?query=coextensive&x=0&y=0
The foundations of mathematics is one of those things which are extremely over-represented on the Internet compared to the experience of real-life mathematicians. Gödel's Theorems are another one. My cousin is a professor of mathematics at Rutgers and when I asked her about Gödel her response was "Who?".
Mathematics is huge. Working on the foundations is a tiny niche, in the same way that the set of people working with electronics is huge while the set of people working on the foundations of how electrons travel through copper wires at a quantum mechanical level is tiny.
On Mar 23, 2020, at 19:52, Steven D'Aprano steve@pearwood.info wrote:
On Mon, Mar 23, 2020 at 06:03:06PM -0700, Andrew Barnert wrote:
The existing methods are named issubset and issuperset (and isdisjoint, which doesn’t have an operator near-equivalent). Given that, would you still want equals instead of isequal or something?
Oops! I mean, aha, you passed my test to see if you were paying attention, well done!
*wink*
I would be satisfied by "isequal", if the method were needed.
So, I’d rather have an uglier, more explicit, and more obviously specific-to-set name like iscoextensive. Sure, not everyone will know what “coextensive” means
That's an unnecessary use of jargon
As far as I’m aware (and your MathWorld search bears this out), “coextensive” isn’t mathematical jargon. I chose it because of its ordinary (if not super-common) English meaning. For example, the charter of the City of San Francisco says that the city is coextensive with the County of San Francisco, meaning that any bit of land in the city is in the county and vice-versa. I’m sure it’s not the only word that’s close enough to be pressed into service (historical Christian theology has to be full of words for similar ideas, right?); as I said, it’s just the first one that came to mind. Maybe something longer like has_same_elements would be better. But I would be a little surprised if any really common single word got the idea across.
that doesn't add clarity or precision and can only add confusion. Outside of the tiny niche of people trying to prove the foundations of mathematics in first-order logic and set theory, when mathematicians want to say two sets are equal, they say they are equal, they don't use the term “coextensive”.
Sure, but in math, just as in Python, a set is never equal to a list (with a tiny foundations asterisk that isn’t relevant, but I’ll mention it below).
So when mathematicians want to say a set is equal to a list… well, they just don’t say that, because it isn’t true, and usually isn’t even a meaningful question in the first place.
See, for example, Mathworld, which uses "=" in the usual way:
And in Python, it’s spelled “==” instead of “=”, but otherwise, it already works the same way.
and doesn't even have a definition for “coextensive”:
https://mathworld.wolfram.com/search/?query=coextensive&x=0&y=0
Sure. In fact, I suspect there’s no standard symbol or name for this operation. Just like there’s no standard name for “divisible by 3 but not by 2”, there’s no standard name for “not necessarily equal, but the sets of their respective elements are equal”. Those descriptions are way too long for method names even in Objective C, much less Python, but that doesn’t mean you can call a method for the former “isodd”, or the latter “isequal”, because those names more strongly imply a much more commonly useful operation than they do the one you’re intending.
Of course (both in math and in Python) probably you just write the 1-liner in-line, or maybe give it a “local” name and expect people to refer to the definition only 10 lines above. We only really need a good, meaningful, non-confusing name for this operation if it’s really important enough to add as a builtin method.
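As a rough illustration of the one-liner mentioned above, here is a minimal sketch. The helper name has_same_elements is hypothetical (it is only one of the names floated in this thread, not an existing set method), and the example data is made up. It also shows why the `issubset(...) and issuperset(...)` spelling is risky when the argument is a one-shot iterator: the first call consumes it, so the second call sees nothing.

```python
def has_same_elements(s, iterable):
    """Hypothetical helper: True if `iterable` yields exactly the elements of set `s`."""
    # Coerce once, then use ordinary set equality.
    return s == set(iterable)


s = {1, 2, 3}

ignores_duplicates = has_same_elements(s, [3, 2, 1, 1])   # order and repeats don't matter
detects_extra = has_same_elements(s, iter([4, 1, 2, 3]))  # 4 is not in s

# Pitfall: two separate passes over the same one-shot iterator.
it = iter([4, 1, 2, 3])
two_pass = s.issubset(it) and s.issuperset(it)  # second call sees an exhausted iterator

print(ignores_duplicates, detects_extra, two_pass)
```

Coercing with set() exactly once avoids the problem, at the cost of building a temporary set.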
The foundations of mathematics is one of those things which are extremely over-represented on the Internet
Well, you’re the only one who brought up foundations here, and in the same email where you’re railing against people talking about foundations on the internet, so I’m not sure what the point is.
My guess is that you saw Greg, Marc-Andre, etc. talking about sets in mathematics and naturally thought “oh no, here comes irrelevant mathematical logic”, but there isn’t any; the reason sets came up is that we’re talking about set operations on Python set objects, and it’s pretty hard to talk about what issubset means/should mean without talking about sets. (And the message you’re replying to here doesn’t even do that; it’s just about the practical issues of equals methods in programming languages.)
But I’ll give it something to be retroactively sequitur to, that tiny asterisk mentioned above: When you’re dealing with foundations, you do often define everything, including lists, as sets, so picking one common set of definitions arbitrarily, {1,{1,{2,{2,3}}}} = [1,2,3] is actually true. But that isn’t relevant, and wouldn’t be relevant even if all mathematicians were always worried about foundations and always used those particular definitions, because that obviously isn’t the operation the other Steve is looking for.
Apologies in advance... the following is going to contain mathematical jargon and pedantry. Run now while you still can *wink*
On Tue, Mar 24, 2020 at 12:56:55AM -0700, Andrew Barnert wrote:
On Mar 23, 2020, at 19:52, Steven D'Aprano steve@pearwood.info wrote:
On Mon, Mar 23, 2020 at 06:03:06PM -0700, Andrew Barnert wrote:
The existing methods are named issubset and issuperset (and isdisjoint, which doesn’t have an operator near-equivalent). Given that, would you still want equals instead of isequal or something?
Oops! I mean, aha, you passed my test to see if you were paying attention, well done!
*wink*
I would be satisfied by "isequal", if the method were needed.
So, I’d rather have an uglier, more explicit, and more obviously specific-to-set name like iscoextensive. Sure, not everyone will know what “coextensive” means
That's an unnecessary use of jargon
As far as I’m aware (and your MathWorld search bears this out), “coextensive” isn’t mathematical jargon.
No, it's definitely mathematical jargon, but *really* specialised. As far as I can tell, it's only used by people working on the foundations of logic and set theory. I haven't come across it elsewhere.
For example:
https://books.google.com.au/books?id=wEa9DwAAQBAJ&pg=PA284&lpg=PA284...
"Introduction To Discrete Mathematics Via Logic And Proof" by Calvin Jongsma, on pages 283-4, it says:
First-Order Logic already has a fixed, standard notation of identity governed by rules of inference. S = T logically implies (∀x)(x∈S ↔ x∈T) [...] Thus identical sets must contain exactly the same elements. However we can't turn this claim around and say that sets having the same elements are identical -- *set equality for coextensive sets doesn't follow from logic alone.* We'll therefore postulate this as an axiom, using the formal notation of FOL.
Axiom 5.3.1: Axiom of Extensionality ∀x(x∈S ↔ x∈T) → S = T
The criterion for equal sets (see Definition 4.1.1) follows immediately:
Proposition 5.3.1: Equal Sets S = T ↔ ∀x(x∈S ↔ x∈T)
Apologies to everyone who got lost reading that :-)
In the above quote, "identity" should be read as "equality". The emphasised clause "set equality for coextensive sets..." is in the original. The upside down A ∀ means "for all", the rounded E ∈ means "element of".
So translated into English, what the author is saying is that he has a concept of two sets being equal, S = T. He has another concept of two sets being coextensive, namely, that for each element in S, it is also in T, and vice versa. He then says that if two sets are identical (equal), then logically they must also be coextensive, but to go the other way (coextensive implies equality) we have to make it part of the definition.
That's a long-winded, pedantic and precise way of saying that two sets are equal if, and only if, they have precisely the same elements as each other, by definition.
I chose it because of its ordinary (if not super-common) English meaning.
Oh.
Well, you outsmarted me, or perhaps outdumbed me *smiles* because it definitely is a term from mathematics, whether you knew it or not.
I thought that you were referring to the mathematical usage.
As far as the non-mathematical meaning:
being of equal extent or scope or duration; having the same spatial limits or boundaries;
I think we would be justified in reading `A.iscoextensive(B)` in one of two ways:
len(A) == len(B)
(min(A), max(A)) == (min(B), max(B))
but not necessarily as equal.
Why not equal? Consider somebody who started a project on the day of the Olympics Opening Ceremony, and finished the project on the day of the Closing Ceremony. We can say that the project and the Olympics were coextensive, but we can't necessarily say that they were equal or the same thing, or that the project was the Olympics.
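To make the two alternative readings concrete, here is a small sketch with made-up sets A and B (the values are purely illustrative). Both "same extent" readings hold, yet the sets are not equal:

```python
A = {1, 2, 5}
B = {1, 3, 5}

same_size = len(A) == len(B)                          # reading 1: equal extent
same_bounds = (min(A), max(A)) == (min(B), max(B))    # reading 2: same boundaries
equal = A == B                                        # actual set equality

print(same_size, same_bounds, equal)
```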
[...]
Sure, but in math, just as in Python, a set is never equal to a list (with a tiny foundations asterisk that isn’t relevant, but I’ll mention it below).
Well, yes and no.
In Python, just as in mathematics, a set is never a superset or subset of a list either, and yet in Python we have a set method which quite happily says that a set can be a superset or subset (or equal to) a list:
{1, 2}.issubset([2, 1]) # Returns True
We understand that, unlike the operators `<` and `<=`, the issubset method is happy to (conceptually) coerce the list into a set.
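A short sketch of that difference, using the same values as the example above. The variable names are only for illustration:

```python
s = {1, 2}
other = [2, 1]

# The method accepts any iterable and treats it as a set.
method_result = s.issubset(other)

# The operator requires a set (or frozenset) on both sides.
try:
    s <= other
    operator_error = None
except TypeError as exc:
    operator_error = exc

# == does no coercion either: a set never compares equal to a list.
equal_result = (s == other)

print(method_result, type(operator_error).__name__, equal_result)
```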
So when mathematicians want to say a set is equal to a list… well, they just don’t say that, because it isn’t true, and usually isn’t even a meaningful question in the first place.
Don't you think that there is a sense in which the Natural numbers ℕ {0, 1, 2, 3, ...} and the sequence (0, 1, 2, 3, ...) are "the same thing"?
We even talk about ℕ having successor and predecessor functions, which implies a natural order to the set.
The foundations of mathematics is one of those things which are extremely over-represented on the Internet
Well, you’re the only one who brought up foundations here, and in the same email where you’re railing against people talking about foundations on the internet, so I’m not sure what the point is.
Because I genuinely thought you intended to refer to the mathematical jargon, I didn't imagine you just got lucky.
My guess is that you saw Greg, Marc-Andre, etc. talking about sets in mathematics and naturally thought “oh no, here comes irrelevant mathematical logic”,
*blink*
Um, are you confusing me with someone else? Is there something I've said that leads you to the conclusion that I think that being mathematical and logical is irrelevant?
but there isn’t any; the reason sets came up is that we’re talking about set operations on Python set objects, and it’s pretty hard to talk about what issubset means/should mean without talking about sets.
Oh well I'm glad you set me straight.
Hi Steven,
I think you are taking this a bit too far out of what we normally use Python for in real life :-)
The mathematical complication of not having
∀x(x∈S ↔ x∈T) → S = T
be a consequence of
S = T → (∀x)(x∈S ↔ x∈T)
which may sound weird to many people, originates in the fact that the above must also hold for an infinite number of elements in a set, including uncountably infinite sets and sets which have such sets as elements (including possibly uncountably infinite recursion of such inclusions). You enter a world full of wonders when you start considering such things.
In Python, however, we typically always operate on finite sets. In such a world, the above is not a complication anymore. In fact, the C implementation uses:
|S| = |T| ∧ S ⊆ T → S = T
or written using the above form:
|S| = |T| ∧ ∀x(x∈S → x∈T) → S = T
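The finite-set rule above can be sketched in Python as follows. This is only an illustration of the stated rule, not the actual C code; the function name finite_sets_equal is hypothetical.

```python
def finite_sets_equal(s, t):
    """Sketch of: |S| = |T| and S is a subset of T  implies  S = T (finite sets only)."""
    if len(s) != len(t):
        return False
    # Equal sizes, so one-directional containment is enough.
    return all(x in t for x in s)


cases = [
    ({1, 2, 3}, {3, 2, 1}),
    ({1, 2, 3}, {1, 2, 4}),
    ({1, 2}, {1, 2, 3}),
    (set(), set()),
]
agrees_with_eq = all(finite_sets_equal(s, t) == (s == t) for s, t in cases)
print(agrees_with_eq)
```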
Cheers.
On 3/24/2020 11:19 AM, Steven D'Aprano wrote:
[...]
On 23/03/2020 23:49, Steven D'Aprano wrote:
On Mon, Mar 23, 2020 at 02:05:49PM +0000, Rob Cliffe via Python-ideas wrote:
s = set("a")
t = list("aa")
s.issubset(t)
True
s.issuperset(t)
True
but it would be misleading IMO to say that s and t are in some sense equal.
In *some* sense they are equal:
- every element in s is also in t;
- every element in t is also in s;
- no element in s is not in t;
- no element in t is not in s;
- modulo uniqueness, both s and t have the same elements;
- converting t to a set gives {'a'} which is equal to s.
I don't know that this is an *important* sense, but the OP Steve J isn't wrong to notice it.
I shouldn't need to say this, but for the record I am not proposing and do not want set equality to support lists; nor do I see the need for a new method to perform "equivalent to equality" tests; but if the consensus is that sets should have that method, I would prefer it to be given the simpler name:
set.superset # not .equivalent_to_superset
set.subset # not .equivalent_to_subset
set.equals # not some variation of .equivalent_to_equals
I could have expressed myself more clearly. What I mean is that if the new method accepts an arbitrary iterable, it should not have a name such as 'equals' or 'isequal' because that is just plain misleading - it should be called something less definite like 'iscoextensive' or [your suggestion here]. Conversely, if the new method only accepts a set as argument, it's fine to call it 'isequal' or whatever, because when it returns True the two sets *will* be equal.
Rob Cliffe
Steven D'Aprano wrote:
On Mon, Mar 23, 2020 at 12:03:50AM -0000, Steve Jorgensen wrote:
Every set is a superset of itself and a subset of itself. A set may not be a "formal" subset or a "formal" superset of itself. issubset and issuperset refer to standard subsets and supersets, not formal subsets and supersets.
Sorry, I don't understand your terminology "formal" and "standard". I think you might mean "proper" rather than formal? But I don't know what you mean by "standard".
<snip>
Right. I meant "proper". Not "formal". By "standard", I simply mean without the "proper" qualifier.
Hi,
I can't believe we've gone this far:
s = set("a")
t = list("aa")
s.issubset(t)
True
s.issuperset(t)
True
In *some* sense they are equal:
[demonstration elided]
I don't know that this is an *important* sense, but the OP Steve J isn't wrong to notice it.
Not wrong to notice, but we already have
s <= set(t)
s >= set(t)
s == set(t)
as well as s < set(t) and s > set(t), and even
set(s) == set(t)
etc., where *both* s and t are arbitrary iterables.
There's no point to a special method to use for iterables, because you have to decide to use that method based on knowing something is an iterable but not a set. If you're forced to decide to use a special method[1], you can also decide to coerce to set, because set() already accepts an arbitrary iterable. "Explicit is better than implicit" rules in this case, I think.
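For reference, a small sketch of the explicit-coercion spellings listed above, using the s and t from the earlier example. The results dictionary and the second pair of iterables are only there for illustration:

```python
s = set("a")
t = list("aa")

# Coerce the non-set operand explicitly, then use the ordinary operators.
results = {
    "s <= set(t)": s <= set(t),
    "s >= set(t)": s >= set(t),
    "s == set(t)": s == set(t),
    "s < set(t)": s < set(t),
    "s > set(t)": s > set(t),
}

# Both sides may be arbitrary iterables if both are coerced.
both_iterables = set("abca") == set(["c", "b", "a"])

for expr, value in results.items():
    print(expr, value)
print(both_iterables)
```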
If you don't have special methods, you are generalizing existing set methods to arbitrary iterables, which gives such gems as (the actual Python code, not pseudo-code for "equivalent")
assert set("a") == list("aa")
but presumably
assert not (list("a") == set("aa"))
which is horrible on so many levels, mathematical and otherwise.
The only possible argument I can see is performance. But even this isn't completely obvious (for worst-case), since I suppose that the iteration to compare two sets is very fast, I would guess a memcmp, while the membership tests needed would be equivalent to constructing the set (i.e., deduplication). On average you get somewhat better performance by short-circuiting when the membership test(s) fail, I suppose.
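A minimal sketch of the short-circuiting idea, assuming a hypothetical helper named same_elements_short_circuit (nothing like it exists on set today). It still deduplicates the items it sees, as noted above, but it can stop at the first item that is not in the set. No claim is made here about how this compares in speed to `s == set(t)`.

```python
def same_elements_short_circuit(s, iterable):
    """Hypothetical helper: compare set `s` with an arbitrary iterable,
    stopping at the first item that is not in `s`."""
    seen = set()
    for item in iterable:
        if item not in s:
            return False      # early exit: the rest is never consumed
        seen.add(item)        # deduplicate the items seen so far
    # Every item was in s, so seen is a subset of s; equal sizes mean equal sets.
    return len(seen) == len(s)


s = {1, 2, 3}
stream = iter([1, 99, 2, 3])
early = same_elements_short_circuit(s, stream)
leftover = list(stream)       # items that were never examined
full = same_elements_short_circuit(s, [3, 1, 2, 2])

print(early, leftover, full)
```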
Footnotes: [1] Note that if "s.issubset(t)" is made to behave differently from "s <= t", that's a special method by my definition because you must decide which is appropriate in any given comparison.