Mailman 3 Support multiplication for sets - Python-ideas

Support multiplication for sets


Jakob Bowyer

Oct. 7, 2011

5:24 a.m.

Looking at this from a Math background, it seems that it would be nice for the set type to support multiplication. This would allow for the multiplication of sets to produce Cartesian products giving every single permutation. This would make set usage more intuitive for example; (assuming python3)

a = set(["amy", "martin"])
b = set(["smith", "jones", "john"])
c = a * b
print(c)

set([('john', 'jones'), ('john', 'martin'), ('jones', 'john'), ('martin', 'amy'), ....])

This could be really easily achieved by giving a __mul__ method for sets. Currently trying to multiply sets gives a TypeError. Anyone got any views on this? Or am I barking up the wrong tree and saying something stupid.


Paul Moore

October 2011

5:35 a.m.

On 7 October 2011 11:24, Jakob Bowyer <jkbbwr@gmail.com> wrote:

...

Looking at this from a Math background, it seems that it would be nice for the set type to support multiplication. This would allow for the multiplication of sets to produce Cartesian products giving every single permutation. This would make set usage more intuitive for example; (assuming python3)

itertools.product does what you want already.

>>> import itertools
>>> a = set((1,2,3))
>>> b = set((4,5,6))
>>> set(itertools.product(a,b))
{(2, 6), (1, 4), (1, 5), (1, 6), (3, 6), (2, 5), (3, 4), (2, 4), (3, 5)}

I don't think an operator form is sufficiently valuable when the functionality is available and clear enough already.

Paul.
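
A minimal sketch of the suggestion above, written as a script rather than an interactive session. The variable name pairs is illustrative and not from the thread; the only library call used is itertools.product.

```python
import itertools

a = {1, 2, 3}
b = {4, 5, 6}

# itertools.product returns a lazy iterator of 2-tuples;
# wrapping it in set() materialises the Cartesian product.
pairs = set(itertools.product(a, b))

print(len(pairs))        # 9, i.e. len(a) * len(b)
print((1, 4) in pairs)   # True
print((4, 1) in pairs)   # False: the order inside each pair matters
```

Leaving off the set() call keeps the result lazy, which may be preferable when the inputs are large.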

Jakob Bowyer

5:37 a.m.

There is that but from a math point of view the syntax a * b does make sense. It's slightly clearer and makes more sense to people from outside of a programming background.

Paul Moore

6:20 a.m.

On 7 October 2011 11:37, Jakob Bowyer <jkbbwr@gmail.com> wrote:

...

There is that but from a math point of view the syntax a * b does make sense. It's slightly clearer and makes more sense to people from outside of a programming background.

I'm not sure I'd agree, even though I come from a maths background. Explicit is better than implicit and all that...

Even if it is slightly clearer to some people, I bet there are others (not from a mathematical background) who would be confused by it. (And in that case, itertools.product is easier to google for than "*"...) And that's ignoring the cost of implementing, testing, documenting the change.

Actually, just to give a flavour of the sorts of design decisions that would need to be considered, consider this:

>>> a = set((1,2))
>>> b = set((3,4))
>>> c = set((5,6))
>>> from itertools import product
>>> def times(s1,s2):
...     return set(product(s1,s2))
...
>>> times(a,times(b,c))
{(1, (3, 6)), (2, (3, 5)), (2, (4, 5)), (1, (4, 6)), (1, (4, 5)), (2, (3, 6)), (2, (4, 6)), (1, (3, 5))}
>>> times(times(a,b),c)
{((2, 4), 6), ((1, 4), 5), ((1, 4), 6), ((2, 3), 6), ((1, 3), 6), ((2, 3), 5), ((2, 4), 5), ((1, 3), 5)}

So your multiplication isn't commutative (the types of the elements in the 2 expressions above are different). That doesn't seem intuitive - so maybe a*b*c should be a set of 3-tuples. But how would that work? The problem very quickly becomes a lot larger than you first assume.

Operator overloading is used much more sparingly in Python than in, say, C++. It's as much a language style issue as anything else.

Sorry, but I still don't see enough benefit to justify this.

Paul.
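
The difference between the two groupings can be checked mechanically. This is a small sketch that reuses the times helper from the message above; the names left, right and flat are illustrative only.

```python
from itertools import product

def times(s1, s2):
    """Pairwise Cartesian product, as in the session above."""
    return set(product(s1, s2))

a, b, c = {1, 2}, {3, 4}, {5, 6}

left = times(times(a, b), c)    # elements look like ((1, 3), 5)
right = times(a, times(b, c))   # elements look like (1, (3, 5))

print(left == right)            # False: the nesting differs

# A flat set of 3-tuples needs a single n-ary call instead.
flat = set(product(a, b, c))
print((1, 3, 5) in flat)        # True
print(len(left), len(right), len(flat))   # 8 8 8
```

All three results have the same size; only the shape of the elements differs, which is the design decision a binary operator would have to settle.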

Jakob Bowyer

6:41 a.m.

Considering that any multiplication on a set is currently illegal, I don't think it will confuse anyone who doesn't know what a set is mathematically.

Haoyi Li

10:28 a.m.

I don't think having itertools.product is a good reason for not overloading the operator. The same argument could be made against having listA + listB or listA * 10. After all, those can all be done with list comprehensions and itertools as well.

itertools.product(a, b) or a list comprehension works fine for 2 sets, but if you try doing that for any significant number (which I presume the OP is), maybe 5 set operations in one expression, it quickly becomes completely unreadable:

itertools.product(itertools.product(seta, setb.union(setc), setd.difference(sete))

vs

seta * (setb | setc) * (setd & sete)

We already have operators overloaded for union |, intersection &, difference -, symmetric difference ^. Having an operator for product would fit in perfectly if it could be done properly; ensuring that it is commutative looks non-trivial.

_______________________________________________ Python-ideas mailing list Python-ideas@python.org http://mail.python.org/mailman/listinfo/python-ideas

Nick Coghlan

4:35 p.m.

On Fri, Oct 7, 2011 at 11:28 AM, Haoyi Li <haoyi.sg@gmail.com> wrote:

...

I don't think having itertools.product is a good reason for not overloading the operator. The same argument could be said against having listA + listB or listA * 10. After all, those can all be done with list comprehensions and itertools aswell.

itertools.product(a, b) or a list comprehension work fine for 2 sets, but if you try doing that for any significant number (which i presume the OP is), maybe 5 set operations in one expression, it quickly becomes completely unreadable:

itertools.product(itertools.product(seta, setb.union(setc), setd.difference(sete))

vs

seta * (setb | setc) * (setd & sete)

I expect what you actually intended here was:

cp = product(seta, setb|setc, setd&sete)

If you actually did want two distinct cartesian products, then the two line version is significantly easier to read:

cp_a = product(setb|setc, setd&sete)
cp_b = product(seta, cp_a)

The ambiguity of chained multiplication is more than enough to kill the idea (although it was definitely worth asking the question).

Cheers, Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
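
To make the chaining ambiguity concrete, here is a sketch using a hypothetical ProductSet class. It is not part of Python or of the proposal; it only assumes that a naive binary __mul__ would delegate to itertools.product.

```python
from itertools import product

class ProductSet(frozenset):
    """Hypothetical set subclass used only for illustration."""

    def __mul__(self, other):
        # Naive binary Cartesian product.
        return ProductSet(product(self, other))

seta = ProductSet({1, 2})
setb = ProductSet({3})
setc = ProductSet({4})

chained = seta * setb * setc            # evaluated as (seta * setb) * setc
flat = set(product(seta, setb, setc))   # a single n-ary call

print(sorted(chained))  # [((1, 3), 4), ((2, 3), 4)]
print(sorted(flat))     # [(1, 3, 4), (2, 3, 4)]
```

Python evaluates the chained form left to right, so the operator silently produces nested pairs where a reader might expect 3-tuples.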

Greg Ewing

10:58 p.m.

Nick Coghlan wrote:

...

I expect what you actually intended here was:

cp = product(seta, setb|setc, setd&sete)

If you actually did want two distinct cartesian products, then the two line version is significantly easier to read:

cp_a = product(setb|setc, setd&sete)
cp_b = product(seta, cp_a)

This is another area where the comprehension syntax is very useful. It lets you specify exactly what structure you want for the results in a very natural and readable way. -- Greg
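
A sketch of what the comprehension form could look like. The set names and the filter condition are made up for illustration; the point is only that the shape of each element is written out explicitly.

```python
seta, setb, setc = {1, 2}, {3, 4}, {5, 6}

# Flat 3-tuples.
flat = {(x, y, z) for x in seta for y in setb for z in setc}

# Nested pairs, grouping the last two components.
nested = {(x, (y, z)) for x in seta for y in setb for z in setc}

# Any other shape or filter, e.g. only combinations whose sum is even.
even = {(x, y, z) for x in seta for y in setb for z in setc
        if (x + y + z) % 2 == 0}

print(len(flat), len(nested), len(even))   # 8 8 4
```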

Stephen J. Turnbull

12:15 p.m.

Paul Moore writes:

...

So your multiplication isn't commutative (the types of the elements in the 2 expressions above are different).

No, set multiplication is not going to be commutative: {'a'} X {1} is quite different from {1} X {'a'}. The problem you're pointing out is much worse: the obvious implementation of Cartesian product isn't even associative. -1 on an operator for this.

Paul Moore

1:13 p.m.

On 7 October 2011 18:15, Stephen J. Turnbull <turnbull@sk.tsukuba.ac.jp> wrote:

...

Paul Moore writes:

So your multiplication isn't commutative (the types of the elements in the 2 expressions above are different).

No, set multiplication is not going to be commutative: {'a'} X {1} is quite different from {1} X {'a'}. The problem you're pointing out is much worse: the obvious implementation of Cartesian product isn't even associative.

Bah. For all my claims of having a mathematical background (OK, so it was 30 years ago :-)) I can't even remember the difference between associativity and commutativity. I'm going for a lie down now... :-) Paul.

Jakob Bowyer

3:54 p.m.

Well, it was fun watching the process that is python-ideas and the shooting down in flames that happens here. Can someone give me some advice about what to think about/do for the next time that an idea comes to mind?

Ethan Furman

3:57 p.m.

Jakob Bowyer wrote:

...

Well it was fun watching the process that is python-ideas and the shooting down in flames that happens here. Can someone give me some advice about what to think about/do for the next time that an idea comes to mind?

Just what you did this time. As the tagline says, "In order for there to be good ideas, there must first be lots of ideas." (No, I don't remember who said it, or even whose tagline it is.)

~Ethan~

Mike Meyer

4:30 p.m.

On Fri, Oct 7, 2011 at 1:57 PM, Ethan Furman <ethan@stoneleaf.us> wrote:

...

Jakob Bowyer wrote:

...
Well it was fun watching the process that is python-ideas and the shooting down in flames that happens here. Can someone give me some advice about what to think about/do for the next time that an idea comes to mind?

Just what you did this time. As the tagline says, "In order for there to be good ideas, there must first be lots of ideas." (No, I don't remember who said it, or even whose tagline it is.)

Actually, that question makes me think that a meta-PEP on the process of vetting an idea before starting a real PEP might be worthwhile. In particular, that people should come up with real-world use cases and try and think of the worst abuses that might occur, etc. <mike

Nick Coghlan

4:37 p.m.

On Fri, Oct 7, 2011 at 5:30 PM, Mike Meyer <mwm@mired.org> wrote:

...

Actually, that question makes me think that a meta-PEP on the process of vetting an idea before starting a real PEP might be worthwhile. In particular, that people should come up with real-world use cases and try and think of the worst abuses that might occur, etc.

The process is basically 'ask on python-ideas if a web search doesn't show that it has already been proposed'. The next step will vary from "not going to happen" through "file a feature request on the tracker" to "write a PEP". Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Mike Meyer

4:54 p.m.

On Fri, Oct 7, 2011 at 2:37 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:

...

On Fri, Oct 7, 2011 at 5:30 PM, Mike Meyer <mwm@mired.org> wrote:

...
Actually, that question makes me think that a meta-PEP on the process of vetting an idea before starting a real PEP might be worthwhile. In particular, that people should come up with real-world use cases and try and think of the worst abuses that might occur, etc.

The process is basically 'ask on python-ideas if a web search doesn't show that it has already been proposed'. The next step will vary from "not going to happen" through "file a feature request on the tracker" to "write a PEP".

True, but not very useful. The idea would be to discuss what happens between posting and "the next step". People may well ask for use cases, look at abuses, do searches of the library for them (or ask you to do so), etc. <mike

Paul Moore

4:21 p.m.

On 7 October 2011 21:54, Jakob Bowyer <jkbbwr@gmail.com> wrote:

...

Well it was fun watching the process that is python-ideas and the shooting down in flames that happens here. Can someone give me some advice about what to think about/do for the next time that an idea comes to mind?

About the same, but be prepared for not everyone to be as enthusiastic about your idea as you are, and be open to the possibility that some of the objections are valid... (BTW, I thought I was offering you some insight into why ideas might not be as straightforward as the inventor thought. I apologise if it came across to you as me "shooting down in flames" your idea...) Paul.

Jakob Bowyer

4:23 p.m.

Actually I like this style :) Don't mistake my "shooting down in flames" for aggressive dislike. I have taken your specific advice on board; I was asking in a more general context. So I'm actually quite happy it worked out this way, as it gives me a chance to learn.

Stephen J. Turnbull

9:30 p.m.

Jakob Bowyer writes:

...

Well it was fun watching the process that is python-ideas and the shooting down in flames that happens here. Can someone give me some advice about what to think about/do for the next time that an idea comes to mind?

0. Be familiar with the Zen (both the official "python -m this" and Apocrypha, such as "not every 3-line function needs to be a builtin"). Try to see how they apply to discussions you read even when not explicitly mentioned.

1. Do check the archives, of this list and python-dev. There are some amazingly good teachers here.

2. If you're worried that the question might be stupid or "obvious to the Dutch", you might float your trial balloon on python-list (aka comp.lang.python) first.

3. Make sure you know what the earlier problems with similar ideas were. At least that way you can often manage a soft landing. :-)

4. Don't let the experience stop you from trying again. There are no stupid questions -- except the unasked ones. (But see (2); maybe there's a more appropriate venue to ask the first time.)

alex23

1:23 a.m.

On Oct 8, 6:54 am, Jakob Bowyer <jkb...@gmail.com> wrote:

...

Well it was fun watching the process that is python-ideas and the shooting down in flames that happens here. Can someone give me some advice about what to think about/do for the next time that an idea comes to mind?

An implementation is always helpful. It doesn't need to be complete but it will certainly help you iron out the idea and give everyone else something concrete to discuss. If it's possible/relevant to the suggestion, putting it up as a package on PyPI is also a good way to gauge usefulness.

Nick Coghlan

4:29 p.m.

On Fri, Oct 7, 2011 at 2:13 PM, Paul Moore <p.f.moore@gmail.com> wrote:

...

On 7 October 2011 18:15, Stephen J. Turnbull <turnbull@sk.tsukuba.ac.jp> wrote:

...
Paul Moore writes:

So your multiplication isn't commutative (the types of the elements in the 2 expressions above are different).

No, set multiplication is not going to be commutative: {'a'} X {1} is quite different from {1} X {'a'}. The problem you're pointing out is much worse: the obvious implementation of Cartesian product isn't even associative.

Bah. For all my claims of having a mathematical background (OK, so it was 30 years ago :-)) I can't even remember the difference between associativity and commutativity.

Associative: A * (B * C) == (A * B) * C
Commutative: A * B == B * A

Set multiplication as the Cartesian product is neither (as others have pointed out).

Cheers, Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Mike Meyer

4:51 p.m.

On Fri, Oct 7, 2011 at 2:29 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:

...

On Fri, Oct 7, 2011 at 2:13 PM, Paul Moore <p.f.moore@gmail.com> wrote:

...
On 7 October 2011 18:15, Stephen J. Turnbull <turnbull@sk.tsukuba.ac.jp> wrote:

...
Paul Moore writes:

...
So your multiplication isn't commutative (the types of the elements in the 2 expressions above are different).

No, set multiplication is not going to be commutative: {'a'} X {1} is quite different from {1} X {'a'}. The problem you're pointing out is much worse: the obvious implementation of Cartesian product isn't even associative.

Bah. For all my claims of having a mathematical background (OK, so it was 30 years ago :-)) I can't even remember the difference between associativity and commutativity.

Associative: A * (B * C) == (A * B) * C
Commutative: A * B == B * A

Associative is a problem. Especially because there are two reasonable interpretations of it: (a, (b, c)) vs. (a, b, c). Commutative, not so much. We already have a number of non-commutative operators (-, /, // and % on most types), some of which behave that way in the real world of mathematics. We also have operators that are commutative on integers but not on other things (+ on lists and strings). <mike
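
The two properties discussed above can be checked directly. In this sketch, times is a hypothetical stand-in for a set "*" operator built on itertools.product; the sample values are arbitrary.

```python
from itertools import product

def times(s1, s2):
    """Hypothetical stand-in for a set '*' operator."""
    return set(product(s1, s2))

# Existing operators that are already non-commutative.
print(5 - 3 == 3 - 5)              # False
print([1] + [2] == [2] + [1])      # False
print("ab" + "cd" == "cd" + "ab")  # False

# The pairwise product is non-commutative too...
A, B, C = {"a"}, {1}, {2.5}
print(times(A, B) == times(B, A))  # False: ('a', 1) vs (1, 'a')

# ...and, unlike '+' on lists, it is also non-associative.
print(([1] + [2]) + [3] == [1] + ([2] + [3]))           # True
print(times(times(A, B), C) == times(A, times(B, C)))   # False
```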

Terry Reedy

7:57 p.m.

On 10/7/2011 7:20 AM, Paul Moore wrote:

...

On 7 October 2011 11:37, Jakob Bowyer <jkbbwr@gmail.com> wrote:

...
There is that but from a math point of view the syntax a * b does make sense. It's slightly clearer and makes more sense to people from outside of a programming background.

Math used a different symbol, an 'X' without serifs, for cross-products. The result is a set of 'ordered pairs', which is different from a duple.

...

I'm not sure I'd agree, even though I come from a maths background. Explicit is better than implicit and all that...

Even if it is slightly clearer to some people, I bet there are others (not from a mathematical background) who would be confused by it. And in that case, itertools.product is easier to google for than "*"...) And that's ignoring the cost of implementing, testing, documenting the change.

Actually, just to give a flavour of the sorts of design decisions that would need to be considered, consider this:

>>> a = set((1,2))
>>> b = set((3,4))
>>> c = set((5,6))
>>> from itertools import product
>>> def times(s1,s2):
...     return set(product(s1,s2))
...
>>> times(a,times(b,c))
{(1, (3, 6)), (2, (3, 5)), (2, (4, 5)), (1, (4, 6)), (1, (4, 5)), (2, (3, 6)), (2, (4, 6)), (1, (3, 5))}
>>> times(times(a,b),c)
{((2, 4), 6), ((1, 4), 5), ((1, 4), 6), ((2, 3), 6), ((1, 3), 6), ((2, 3), 5), ((2, 4), 5), ((1, 3), 5)}

So your multiplication isn't commutative

It is not *associative* -- unless one special-cases tuples to add non-tuple elements to tuple elements.

...

(the types of the elements in the 2 expressions above are different).

If the elements of A*B are sequences, then A*B is also not commutative. But it would be if the elements were sets instead of pairs.

...

That doesn't seem intuitive - so maybe a*b*c should be a set of 3-tuples. But how would that work?

In math, *...* is a ternary operator with 3 args, like if...else in Python or ?...: in C, but it generalizes to an n-ary operator. From https://secure.wikimedia.org/wikipedia/en/wiki/Cartesian_product ''' The Cartesian product can be generalized to the n-ary Cartesian product over n sets X1, ..., Xn:

X_1\times\cdots\times X_n = \{(x_1, \ldots, x_n) : x_i \in X_i \}.

It is a set of n-tuples. If tuples are defined as nested ordered pairs, it can be identified to (X1 × ... × Xn-1) × Xn.'''

In other words, think of aXbXc as XX(a,b,c) and similarly for more Xs. In Python, better to define XX explicitly. One can even write the n-fold generalization by simulating n nested for loops.
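
One possible reading of "simulating n nested for loops" is sketched below. The function name cartesian is an assumption standing in for the XX mentioned above; the result is compared against itertools.product only as a sanity check.

```python
from itertools import product

def cartesian(*iterables):
    """n-ary Cartesian product built by simulating n nested for loops."""
    results = [()]   # one empty tuple: the product of zero sets
    for iterable in iterables:
        pool = tuple(iterable)
        # Extend every partial tuple with every element of the next pool.
        results = [partial + (item,) for partial in results for item in pool]
    return set(results)

a, b, c = {1, 2}, {3, 4}, {5, 6}

print(cartesian(a, b, c) == set(product(a, b, c)))   # True
print(len(cartesian(a, b, c)))                       # 8
```

Each pass of the outer loop plays the role of one more nested for statement, so the elements come out as flat n-tuples rather than nested pairs.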

...

The problem very quickly becomes a lot larger than you first assume.

Operator overloading is used much more sparingly in Python than in, say, C++. It's as much a language style issue as anything else.

Sorry, but I still don't see enough benefit to justify this.

In most situations, one really needs an iterator that produces the pairs one at a time rather than a complete collection. And when one does want a complete collection, one might often want it ordered as a list rather than unordered as a set. itertools.product covers all these use cases. And even that does not cover the n-ary case.

For many combinatorial algorithms, one needs a 'cross-concatenation' operator defined on collections of sequences which adds the pair of sequences in each cross-product pair. The same 'x' symbol, possibly circled, has been used for that. Both are in unicode, but Python is currently restricted to a sparse set of ascii symbols.

--
Terry Jan Reedy
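
A sketch of how the 'cross-concatenation' described above might be written, assuming it means concatenating the two sequences of every cross-product pair. The name cross_concat and the sample data are hypothetical.

```python
from itertools import product

def cross_concat(xs, ys):
    """Concatenate every sequence in xs with every sequence in ys."""
    return {x + y for x, y in product(xs, ys)}

prefixes = {("a",), ("b",)}
suffixes = {(1, 2), (3,)}
tails = {("z",)}

result = cross_concat(prefixes, suffixes)
print(("a", 1, 2) in result)   # True
print(len(result))             # 4

# Because tuple concatenation is associative, so is this operation.
print(cross_concat(cross_concat(prefixes, suffixes), tails)
      == cross_concat(prefixes, cross_concat(suffixes, tails)))   # True
```

Under this reading the elements stay flat, which is why the grouping no longer matters.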

Terry Reedy

8:04 p.m.

On 10/7/2011 8:57 PM, Terry Reedy wrote:

...

On 10/7/2011 7:20 AM, Paul Moore wrote:

...
On 7 October 2011 11:37, Jakob Bowyer<jkbbwr@gmail.com> wrote:

...
There is that but from a math point of view the syntax a * b does make sense. It's slightly clearer and makes more sense to people from outside of a programming background.

Math used a different symbol, an 'X' without serifs, for cross-products. The result is a set of 'ordered pairs', which is different from a duple.

...
I'm not sure I'd agree, even though I come from a maths background. Explicit is better than implicit and all that...

Even if it is slightly clearer to some people, I bet there are others (not from a mathematical background) who would be confused by it. And in that case, itertools.product is easier to google for than "*"...) And that's ignoring the cost of implementing, testing, documenting the change.

Actually, just to give a flavour of the sorts of design decisions that would need to be considered, consider this:

>>> a = set((1,2))
>>> b = set((3,4))
>>> c = set((5,6))
>>> from itertools import product
>>> def times(s1,s2):
...     return set(product(s1,s2))
...
>>> times(a,times(b,c))
{(1, (3, 6)), (2, (3, 5)), (2, (4, 5)), (1, (4, 6)), (1, (4, 5)), (2, (3, 6)), (2, (4, 6)), (1, (3, 5))}
>>> times(times(a,b),c)
{((2, 4), 6), ((1, 4), 5), ((1, 4), 6), ((2, 3), 6), ((1, 3), 6), ((2, 3), 5), ((2, 4), 5), ((1, 3), 5)}

So your multiplication isn't commutative

It is not *associative* -- unless one special-cases tuples to add non-tuple elements to tuple elements.

...
(the types of the elements in the 2 expressions above are different).

If the elements of A*B are sequences, then A*B is also not commutative. But it would be if the elements were sets instead of pairs.

...
That doesn't seem intuitive - so maybe a*b*c should be a set of 3-tuples. But how would that work?

In math, *...* is a ternary operator with 3 args, like if...else in Python or ?...: in C, but it generalizes to an n-ary operator. From https://secure.wikimedia.org/wikipedia/en/wiki/Cartesian_product ''' The Cartesian product can be generalized to the n-ary Cartesian product over n sets X1, ..., Xn:

X_1\times\cdots\times X_n = \{(x_1, \ldots, x_n) : x_i \in X_i \}.

It is a set of n-tuples. If tuples are defined as nested ordered pairs, it can be identified to (X1 × ... × Xn-1) × Xn.'''

In other words, think of aXbXc as XX(a,b,c) and similarly for more Xs. In Python, better to define XX explicitly. One can even write the n-fold generalization by simulating n nested for loops.

And itertools.product already does this.

...

...
The problem very quickly becomes a lot larger than you first assume.

Operator overloading is used much more sparingly in Python than in, say, C++. It's as much a language style issue as anything else.

Sorry, but I still don't see enough benefit to justify this.

In most situations, one really needs an iterator that produces the pairs one at a time rather than a complete collection. And when one does want a complete collection, one might often want it ordered as a list rather than unordered as a set. Itertools.product covers all these use cases. And even that does not cover the n-ary case.

And itertools.product *does* cover the n-ary case. Sorry for the apparent error.

...

For many combinatorial algorithms, one needs a 'cross-concatenation' operator defined on collections of sequences which adds the pair of sequences in each cross-product pair. The same 'x' symbol, possibly circled, has been used for that. Both are in unicode, but Python is currently restricted to a sparse set of ascii symbols.

-- Terry Jan Reedy

Stephen J. Turnbull

9:58 p.m.

Terry Reedy writes:

...

If the elements of A*B are sequences, then A*B is also not commutative. But it would be if the elements were sets instead of pairs.

But that's not a Cartesian product. By definition in a Cartesian product order of element components matters. I don't think I've ever seen a set product like that, and have trouble imagining applications for it unmodified (typically when squaring a set the diagonal would cause problems).

...

In math, *...* is a ternary operator with 3 args, like if...else in Python or ?...: in C, but it generalizes to an n-ary operator.

A better analogy is to the comma or string concatenation. I don't know if that would lead to an associative implementation, though.

...

In most situations, one really needs an iterator that produces the pairs one at a time rather than a complete collection.

I don't see why you couldn't have an operator on two iterables that produces an iterator. But of course comprehension notation is hard to beat for that.
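
A sketch of the comprehension-style alternative mentioned above: a generator expression over two iterables produces the pairs lazily. The inputs here are arbitrary, and the inner iterable must be something that can be iterated repeatedly (such as a range or a set).

```python
from itertools import islice

letters = "ab"
numbers = range(1_000_000)

# Generator expression: pairs are produced one at a time, on demand,
# so the two million combinations are never held in memory at once.
pairs = ((x, y) for x in letters for y in numbers)

print(list(islice(pairs, 3)))   # [('a', 0), ('a', 1), ('a', 2)]
```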

...

And when one does want a complete collection, one might often want it ordered as a list rather than unordered as a set.

I don't understand this. Sets are unordered; any order you impose on the product would be arbitrary. So iterate the product as a set, what else might be (commonly) wanted?

Georg Brandl

2:42 a.m.

On 08.10.2011 02:57, Terry Reedy wrote:

...

On 10/7/2011 7:20 AM, Paul Moore wrote:

...
On 7 October 2011 11:37, Jakob Bowyer<jkbbwr@gmail.com> wrote:

...
There is that but from a math point of view the syntax a * b does make sense. It's slightly clearer and makes more sense to people from outside of a programming background.

Math used a different symbol, an 'X' without serifs, for cross-products. The result is a set of 'ordered pairs', which is different from a duple.

While I understand the rest of your post, this made me wonder: what is the difference between an ordered pair and a 2-tuple? Georg

Terry Reedy

12:20 a.m.

On 10/8/2011 3:42 AM, Georg Brandl wrote:

...

On 08.10.2011 02:57, Terry Reedy wrote:

...
On 10/7/2011 7:20 AM, Paul Moore wrote:

...
On 7 October 2011 11:37, Jakob Bowyer<jkbbwr@gmail.com> wrote:

...
There is that but from a math point of view the syntax a * b does make sense. It's slightly clearer and makes more sense to people from outside of a programming background.

Math used a different symbol, an 'X' without serifs, for cross-products. The result is a set of 'ordered pairs', which is different from a duple.

While I understand the rest of your post, this made me wonder: what is the difference between an ordered pair and a 2-tuple?

I could just say that they are different but nearly synonymous words used in different fields. A more precise answer depends on the meaning chosen for 'ordered pair' and '2-tuple'.

If one takes 'ordered pair' as an undefined primitive (and generic) concept, then '2-tuple' is one specialization of the concept. https://secure.wikimedia.org/wikipedia/en/wiki/Ordered_pair says "Ordered pairs are also called 2-tuples, 2-dimensional vectors, or sequences of length 2." I take that as meaning that the latter three are specializations, as 'tuple' is definitely not the same as 'vector'.

If one takes 'ordered pair' as a specialized set, then they are different for a different reason. Tuple is not a subclass of set, at least not in Python.

In practice, the two classes often have different interfaces. The two members of ordered pairs are the first and second. They are extracted by two different functions. Lisp cons cells with car (first) and cdr (rest) functions are an example. The two members of 2-tuples are also the 0-1 or 1-2 members and are usually extracted by indexing, which is one function taking two parameters. Python duples with 0-1 indexing are an example.

--
Terry Jan Reedy
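
The interface difference described above can be illustrated with a toy cons cell. The functions cons, car and cdr below are hypothetical Python stand-ins for the Lisp primitives, written with a closure so that the representation stays hidden from callers.

```python
def cons(first, rest):
    """Build a Lisp-style cell; callers cannot index into it."""
    return lambda pick_first: first if pick_first else rest

def car(cell):
    """Return the first member of the cell."""
    return cell(True)

def cdr(cell):
    """Return the second ('rest') member of the cell."""
    return cell(False)

pair = cons(1, 2)
duple = (1, 2)

print(car(pair), cdr(pair))   # 1 2  (two separate accessor functions)
print(duple[0], duple[1])     # 1 2  (one indexing operation, two indices)
```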

Calvin Spealman

10:55 a.m.

On Fri, Oct 7, 2011 at 6:37 AM, Jakob Bowyer <jkbbwr@gmail.com> wrote:

...

There is that but from a math point of view the syntax a * b does make sense. It's slightly clearer and makes more sense to **people from outside of a programming background**.

(emphasis added) These are not the only people writing and reading code, and decisions about syntax should favor improving the readability for coders across the board, not simply a single subset of them.

...

On Fri, Oct 7, 2011 at 11:35 AM, Paul Moore <p.f.moore@gmail.com> wrote:

...
On 7 October 2011 11:24, Jakob Bowyer <jkbbwr@gmail.com> wrote:

...
Looking at this from a Math background, it seems that it would be nice for the set type to support multiplication. This would allow for the multiplication of sets to produce Cartesian products giving every single permutation. This would make set usage more intuitive for example; (assuming python3)

itertools.product does what you want already.

...
...
...
>>> a = set((1,2,3))
>>> b = set((4,5,6))
>>> set(itertools.product(a,b))
{(2, 6), (1, 4), (1, 5), (1, 6), (3, 6), (2, 5), (3, 4), (2, 4), (3, 5)}

I don't think an operator form is sufficiently valuable when the functionality is available and clear enough already.

Paul.


-- Read my blog! I depend on your acceptance of my opinion! I am interesting! http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy

Mike Meyer

11:08 a.m.

On Fri, Oct 7, 2011 at 8:55 AM, Calvin Spealman <ironfroggy@gmail.com> wrote:

...

On Fri, Oct 7, 2011 at 6:37 AM, Jakob Bowyer <jkbbwr@gmail.com> wrote:

...
There is that but from a math point of view the syntax a * b does make sense. It's slightly clearer and makes more sense to **people from outside of a programming background**.

(emphasis added)

These are not the only people writing and reading code, and decisions about syntax should favor improving the readability for coders across the board, not simply a single subset of them.

True. But unless there's another common meaning for multiplying sets, there are only two groups of people to consider: those who know it as the cross product, and those who have no idea what it might mean. The former will be surprised by the current situation when it doesn't work; the latter will have to look it up when they run into it. It's not really any worse than using + for string concatenation. Except for the associativity issue.

<mike

Greg Ewing

4:53 p.m.

Mike Meyer wrote:

...

But unless there's another common meaning for multiplying sets,

While there might not be another common meaning for multiplying sets, it's not necessarily obvious that '*' applied to sets in a programming language means 'multiplication'. For example, Pascal uses '+' and '*' for set union and intersection, IIRC. -- Greg
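For comparison, a short sketch of how Python spells the two operations that Greg recalls Pascal writing as '+' and '*'. The variable names are illustrative only.

```python
a = {1, 2, 3}
b = {3, 4}

union = a | b          # the operation Greg recalls Pascal spelling a + b
intersection = a & b   # the operation Greg recalls Pascal spelling a * b

# The built-in set type defines no * operator at all.
try:
    a * b
except TypeError:
    multiplication_supported = False
else:
    multiplication_supported = True
```

A reader coming from Pascal could therefore reasonably guess that `a * b` means intersection rather than a Cartesian product.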

Greg Ewing

4:33 p.m.

Jakob Bowyer wrote:

...

There is that but from a math point of view the syntax a * b does make sense.

A problem with this is that it doesn't generalise smoothly to products of more than two sets. A mathematician would think of A x B x C as a set of 3-tuples, but in Python, A * B * C implemented the straightforward way would give you a set of 2-tuples, one element of which is another 2-tuple. Keeping it as a function allows products of arbitrarily many sets to be expressed naturally. -- Greg
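The nesting Greg describes can be reproduced today with itertools.product, which is a reasonable stand-in for what a binary operator would do. This is a sketch with made-up sets A, B, and C, not code from the thread.

```python
from itertools import product

A = {1, 2}
B = {"x", "y"}
C = {"p", "q"}

# What a binary operator would give for (A * B) * C: 2-tuples whose first
# member is itself a 2-tuple.
left_nested = set(product(product(A, B), C))

# Grouping the other way, A * (B * C), nests on the right instead, so the
# operator would not be associative.
right_nested = set(product(A, product(B, C)))

# What a mathematician expects from A x B x C: flat 3-tuples.
flat = set(product(A, B, C))
```

All three results have eight elements, but the elements have different shapes: `((1, 'x'), 'p')`, `(1, ('x', 'p'))`, and `(1, 'x', 'p')` respectively.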

Steven D'Aprano

7:23 p.m.

Jakob Bowyer wrote:

...

There is that but from a math point of view the syntax a * b does make sense. It's slightly clearer and makes more sense to people from outside of a programming background.

I realise that the consensus is that the lack of associativity is a fatal problem with a Cartesian product operator, but there are at least two other issues I haven't seen.

(1) "Using * for set product makes sense to mathematicians" -- maybe so, but those mathematicians already have to learn to use | instead of ∪ (union) and & instead of ∩ (intersection), so learning to use itertools.product() for Cartesian product is not a major burden for them.

(2) Cartesian product is potentially very expensive. The Cartesian product of a moderate-sized set and another moderate-sized set could turn out to be a HUGE set.

This is not a fatal objection, since other operations in Python are potentially expensive:

alist*10000000

but at least it looks expensive. You're multiplying by a big number, of course you're going to require a lot of memory. But set multiplication can very easily creep up on you:

aset*bset

will have size len(aset)*len(bset) which may be huge even if neither set on its own is. Better to keep it as a lazy iterator rather than try to generate a potentially huge set in one go.

-- Steven
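A small sketch of the size argument and of the lazy alternative. The set sizes are arbitrary values picked for illustration; `aset` and `bset` reuse the names from the message above.

```python
from itertools import islice, product

aset = set(range(3_000))
bset = set(range(3_000))

# The size of the full product is known up front without building it.
expected_size = len(aset) * len(bset)   # 9,000,000 pairs from two 3,000-element sets

# product() returns an iterator, so the output pairs are created only as
# they are consumed; here just the first five are ever built.
pairs = product(aset, bset)
first_five = list(islice(pairs, 5))
```

Wrapping `pairs` in `set(...)` or `list(...)` instead would materialize all nine million tuples at once, which is the situation the message warns about.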

MRAB

7:39 p.m.

On 08/10/2011 01:23, Steven D'Aprano wrote:

...

Jakob Bowyer wrote:

...
There is that but from a math point of view the syntax a * b does make sense. It's slightly clearer and makes more sense to people from outside of a programming background.

I realise that the consensus is that the lack of associativity is a fatal problem with a Cartesian product operator, but there are at least two other issues I haven't seen.

(1) "Using * for set product makes sense to mathematicians" -- maybe so, but those mathematicians already have to learn to use | instead of ∪ (union) and & instead of ∩ (intersection), so learning to use itertools.product() for Cartesian product is not a major burden for them.

[snip] Not to mention = and ==.

Guido van Rossum

8:02 p.m.

On Fri, Oct 7, 2011 at 5:23 PM, Steven D'Aprano <steve@pearwood.info> wrote:

...

(2) Cartesian product is potentially very expensive. The Cartesian product of a moderate-sized set and another moderate-sized set could turn out to be a HUGE set.

This is not a fatal objection, since other operations in Python are potentially expensive:

alist*10000000

but at least it looks expensive. You're multiplying by a big number, of course you're going to require a lot of memory. But set multiplication can very easily creep up on you:

aset*bset

will have size len(aset)*len(bset) which may be huge even if neither set on their own is. Better to keep it as a lazy iterator rather than try to generate a potentially huge set in one go.

I'm not defending the Cartesian product proposal, but this argument is just silly. What if the first example was written

alist * n

? Does that look expensive?

-- --Guido van Rossum (python.org/~guido)

Steven D'Aprano

9:13 p.m.

Guido van Rossum wrote:

...

On Fri, Oct 7, 2011 at 5:23 PM, Steven D'Aprano <steve@pearwood.info> wrote:

...
(2) Cartesian product is potentially very expensive. The Cartesian product of a moderate-sized set and another moderate-sized set could turn out to be a HUGE set.

This is not a fatal objection, since other operations in Python are potentially expensive:

alist*10000000

but at least it looks expensive. You're multiplying by a big number, of course you're going to require a lot of memory. But set multiplication can very easily creep up on you:

aset*bset

will have size len(aset)*len(bset) which may be huge even if neither set on their own is. Better to keep it as a lazy iterator rather than try to generate a potentially huge set in one go.

I'm not defending the Cartesian product proposal, but this argument is just silly. What if the first example was written

alist * n

? Does that look expensive?

I didn't say it was a *good* argument <wink>

I already acknowledged that there are expensive operations in Python, and some of them are done by operators. Perhaps I'm just over-sensitive to the risk of large Cartesian products, having locked up my desktop with a foolish

list(product(seta, setb))

in exactly the circumstances I described above: both sets were moderate sizes, and it never dawned on me until my PC ground to a halt that the product would be so much bigger. (I blame myself for this error: I should know better than to carelessly pass an iterator to list without thinking, which is exactly what I did.)

In my experience, most uses of list multiplication look something like this:

[0]*len(arg)

which is not huge except in the extreme case that arg is already huge. But the typical use of set multiplication is surely going to be something like:

arg1*arg2

which may be huge even if neither of the args is. I don't think Cartesian product is important enough, or fundamental enough, to justify making it easier to inadvertently generate a huge set by mistake. That was all I tried to say.

-- Steven

Dirkjan Ochtman

5:43 a.m.

On Fri, Oct 7, 2011 at 12:35, Paul Moore <p.f.moore@gmail.com> wrote:

...

I don't think an operator form is sufficiently valuable when the functionality is available and clear enough already.

I'm not sure; if set multiplication is highly unambiguous (i.e. the Cartesian product is the only logical outcome, and there is not some other common multiplication-like operation on sets), then it seems to me that supporting the multiplication operator for the Cartesian product of sets would be sensible.

Cheers,

Dirkjan

Jakob Bowyer

5:46 a.m.

As far as I know and from asking my lecturer, multiplication only produces Cartesian products. On Fri, Oct 7, 2011 at 11:43 AM, Dirkjan Ochtman <dirkjan@ochtman.nl> wrote:

...

On Fri, Oct 7, 2011 at 12:35, Paul Moore <p.f.moore@gmail.com> wrote:

...
I don't think an operator form is sufficiently valuable when the functionality is available and clear enough already.

I'm not sure; if set multiplication is highly unambiguous (i.e. the Cartesian product is the only logical outcome, and there is not some other common multiplication-like operation on sets), then it seems to me that supporting the multiplication operator for the Cartesian product of sets would be sensible.

Cheers,

Dirkjan

Antoine Pitrou

6:07 a.m.

On Fri, 7 Oct 2011 11:46:34 +0100 Jakob Bowyer <jkbbwr@gmail.com> wrote:

...

As far as I know and from asking my lecturer, multiplication only produces Cartesian products.

Given that multiplying a list or tuple repeats the sequence, there may be a certain amount of confusion. Also, I don't think itertools.product is common enough to warrant an operator. There's a very readable alternative:

...

...
...
>>> a = {"amy", "martin"}
>>> b = {"smith", "jones", "john"}
>>> {(u, v) for u in a for v in b}
{('amy', 'john'), ('amy', 'jones'), ('martin', 'jones'), ('martin', 'smith'), ('martin', 'john'), ('amy', 'smith')}

Or, in the case where you only want to iterate, two nested loops will suffice and avoid building the container. Regards Antoine.
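Both of Antoine's points can be shown in a few lines. This is an illustrative sketch; `names`, `repeated`, and `visited` are made-up names.

```python
names = ["amy", "martin"]

# On lists and tuples, * already means repetition, not a product.
repeated = names * 2

a = {"amy", "martin"}
b = {"smith", "jones", "john"}

# Two nested loops visit every (u, v) pair without building a container
# that holds all of them.
visited = 0
for u in a:
    for v in b:
        visited += 1
```

Here `repeated` is the original list twice over, while the loops touch all six pairs one at a time.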

Amaury Forgeot d'Arc

10:36 a.m.

2011/10/7 Jakob Bowyer <jkbbwr@gmail.com>:

...

Looking at this from a Math background, it seems that it would be nice for the set type to support multiplication. This would allow for the multiplication of sets to produce Cartesian products giving every single permutation. This would make set usage more intuitive for example; (assuming python3)

a = set(["amy", "martin"])
b = set(["smith", "jones", "john"])
c = a * b
print(c)
set([('john', 'jones'), ('john', 'martin'), ('jones', 'john'), ('martin', 'amy'), ....])

Is your example correct? It does not look like a Cartesian product to me. And what about writing it this way:

{(x,y) for x in a for y in b}

-- Amaury Forgeot d'Arc
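A quick check of Amaury's doubt, using the same two sets. In a Cartesian product of a and b, every pair takes its first member from a and its second from b, so pairs such as ('john', 'jones') or ('martin', 'amy') from the quoted output cannot occur. The names `c` and `well_formed` are illustrative.

```python
from itertools import product

a = {"amy", "martin"}
b = {"smith", "jones", "john"}

# The Cartesian product a x b, built with the existing library function.
c = set(product(a, b))

# Every pair has its first member in a and its second member in b.
well_formed = all(x in a and y in b for x, y in c)
```

The result has len(a) * len(b) = 6 pairs and equals the set comprehension suggested above.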

Mike Graham

12:08 p.m.

On Fri, Oct 7, 2011 at 6:24 AM, Jakob Bowyer <jkbbwr@gmail.com> wrote:

...

Looking at this from a Math background, it seems that it would be nice for the set type to support multiplication. This would allow for the multiplication of sets to produce Cartesian products giving every single permutation. This would make set usage more intuitive for example; (assuming python3)

a = set(["amy", "martin"])
b = set(["smith", "jones", "john"])
c = a * b
print(c)

set([('john', 'jones'), ('john', 'martin'), ('jones', 'john'), ('martin', 'amy'), ....])

This could be really easily achieved by giving a __mul__ method for sets. Currently trying to multiply sets gives a TypeError. Anyone got any views on this? Or am I barking up the wrong tree and saying something stupid.

This idea might be aesthetically pleasing from a mathematical viewpoint, but it does not help people write better programs. It does not provide anything better than the status quo. In fact, it adds an obscure behavior that needs to be maintained, taught, and understood, making Python ever-so-slightly worse. -1 Mike

Raymond Hettinger

12:59 p.m.

On Oct 7, 2011, at 6:24 AM, Jakob Bowyer wrote:

...

Looking at this from a Math background, it seems that it would be nice for the set type to support multiplication. This would allow for the multiplication of sets to produce Cartesian products giving every single permutation.

-1 We already have multiple ways to do it (set comprehensions, itertools.product, ...). Also, it's much nicer to have an iterator than to fill memory with lots of little sets. Also, it is unclear what s*s*s should do. Probably, the user would expect {(a,a,a), (a,a,b), ..} but the way you've proposed it, they would get {((a,a),a), ((a,a),b), ...} and have an unpleasant surprise. Raymond
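Raymond's s*s*s concern can be tried out with a toy subclass. `ProductSet` is a hypothetical class written only for this sketch; it is not the proposed implementation and nothing like it exists in the standard library.

```python
from itertools import product


class ProductSet(set):
    """Hypothetical set type whose * is a binary Cartesian product."""

    def __mul__(self, other):
        # Pair every element of self with every element of other.
        return ProductSet(product(self, other))


s = ProductSet({"a", "b"})

# Python evaluates this as (s * s) * s, so the pairs nest.
cubed = s * s * s

# The flat triples a user would probably expect instead.
expected = set(product(s, repeat=3))
```

`cubed` contains elements like `(('a', 'a'), 'b')`, while `expected` contains `('a', 'a', 'b')`. Getting flat triples out of an operator would need special-casing of tuple elements, which the function form avoids.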


