Support multiplication for sets

Looking at this from a Math background, it seems that it would be nice for the set type to support multiplication. This would allow for the multiplication of sets to produce Cartesian products giving every possible pairing of elements. This would make set usage more intuitive. For example (assuming Python 3):
    a = set(["amy", "martin"])
    b = set(["smith", "jones", "john"])
    c = a * b
    print(c)
    {('amy', 'smith'), ('amy', 'jones'), ('martin', 'john'), ....}
This could be really easily achieved by giving sets a __mul__ method. Currently, trying to multiply sets gives a TypeError. Anyone got any views on this? Or am I barking up the wrong tree and saying something stupid?
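A minimal sketch of what the proposal amounts to, assuming a hypothetical subclass named ProductSet (the name and the class are illustrative only; the built-in set has no __mul__):

```python
from itertools import product


class ProductSet(set):
    """Hypothetical set subclass for illustration; not in the standard library."""

    def __mul__(self, other):
        # Only define the product for set-like operands.
        if not isinstance(other, (set, frozenset)):
            return NotImplemented
        # Pair every element of self with every element of other.
        return ProductSet(product(self, other))


a = ProductSet(["amy", "martin"])
b = ProductSet(["smith", "jones", "john"])
c = a * b

print(len(c))                 # 6 pairs: 2 first names x 3 surnames
print(("amy", "smith") in c)  # True
print(("smith", "amy") in c)  # False: order within each pair matters
```

Every pair takes its first component from a and its second from b, which is why the result for a * b differs from b * a.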

On 7 October 2011 11:37, Jakob Bowyer <jkbbwr@gmail.com> wrote:
I'm not sure I'd agree, even though I come from a maths background. Explicit is better than implicit and all that... Even if it is slightly clearer to some people, I bet there are others (not from a mathematical background) who would be confused by it. And in that case, itertools.product is easier to google for than "*"... And that's ignoring the cost of implementing, testing, documenting the change. Actually, just to give a flavour of the sorts of design decisions that would need to be considered, consider this:
So your multiplication isn't commutative (the types of the elements in the 2 expressions above are different). That doesn't seem intuitive - so maybe a*b*c should be a set of 3-tuples. But how would that work? The problem very quickly becomes a lot larger than you first assume. Operator overloading is used much more sparingly in Python than in, say, C++. It's as much a language style issue as anything else. Sorry, but I still don't see enough benefit to justify this. Paul.

I don't think having itertools.product is a good reason for not overloading the operator. The same argument could be made against having listA + listB or listA * 10. After all, those can all be done with list comprehensions and itertools as well. itertools.product(a, b) or a list comprehension works fine for 2 sets, but if you try doing that for any significant number (which I presume the OP is), maybe 5 set operations in one expression, it quickly becomes completely unreadable:
    itertools.product(itertools.product(seta, setb.union(setc)), setd.intersection(sete))
vs
    seta * (setb | setc) * (setd & sete)
We already have operators overloaded for union |, intersection &, difference -, and symmetric difference ^. Having an operator for product would fit in perfectly if it could be done properly; ensuring that it is commutative looks non-trivial.
On Fri, Oct 7, 2011 at 7:41 AM, Jakob Bowyer <jkbbwr@gmail.com> wrote:

On Fri, Oct 7, 2011 at 11:28 AM, Haoyi Li <haoyi.sg@gmail.com> wrote:
I expect what you actually intended here was:
    cp = product(seta, setb|setc, setd&sete)
If you actually did want two distinct cartesian products, then the two line version is significantly easier to read:
    cp_a = product(setb|setc, setd&sete)
    cp_b = product(seta, cp_a)
The ambiguity of chained multiplication is more than enough to kill the idea (although it was definitely worth asking the question).
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
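To make the difference between the two readings concrete, here is a small sketch using itertools.product. The sample set contents are made up for illustration; only the variable names come from the message above.

```python
from itertools import product

# Sample data, chosen only for illustration.
seta = {1, 2}
setb = {"x"}
setc = {"y"}
setd = {"p", "q"}
sete = {"q", "r"}

# One three-way product: every element is a flat 3-tuple.
flat = set(product(seta, setb | setc, setd & sete))

# Two distinct two-way products: every element is a pair
# whose second component is itself a pair.
cp_a = set(product(setb | setc, setd & sete))
nested = set(product(seta, cp_a))

print((1, "x", "q") in flat)      # True
print((1, ("x", "q")) in nested)  # True
print(flat == nested)             # False
```

Both results have the same number of elements, but the shapes of those elements differ, which is the ambiguity a chained operator would have to resolve.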

Paul Moore writes:
So your multiplication isn't commutative (the types of the elements in the 2 expressions above are different).
No, set multiplication is not going to be commutative: {'a'} X {1} is quite different from {1} X {'a'}. The problem you're pointing out is much worse: the obvious implementation of Cartesian product isn't even associative. -1 on an operator for this.

On 7 October 2011 18:15, Stephen J. Turnbull <turnbull@sk.tsukuba.ac.jp> wrote:
Bah. For all my claims of having a mathematical background (OK, so it was 30 years ago :-)) I can't even remember the difference between associativity and commutativity. I'm going for a lie down now... :-) Paul.

Well it was fun watching the process that is python-ideas and the shooting down in flames that happens here. Can someone give me some advice about what to think about/do for the next time that an idea comes to mind? On Fri, Oct 7, 2011 at 7:13 PM, Paul Moore <p.f.moore@gmail.com> wrote:

On Fri, Oct 7, 2011 at 1:57 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
Actually, that question makes me think that a meta-PEP on the process of vetting an idea before starting a real PEP might be worthwhile. In particular, that people should come up with real-world use cases and try and think of the worst abuses that might occur, etc. <mike

On Fri, Oct 7, 2011 at 5:30 PM, Mike Meyer <mwm@mired.org> wrote:
The process is basically 'ask on python-ideas if a web search doesn't show that it has already been proposed'. The next step will vary from "not going to happen" through "file a feature request on the tracker" to "write a PEP". Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Fri, Oct 7, 2011 at 2:37 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
True, but not very useful. The idea would be to discuss what happens between posting and "the next step". People may well ask for use cases, look at abuses, do searches of the library for them (or ask you to do so), etc. <mike

On 7 October 2011 21:54, Jakob Bowyer <jkbbwr@gmail.com> wrote:
About the same, but be prepared for not everyone to be as enthusiastic about your idea as you are, and be open to the possibility that some of the objections are valid... (BTW, I thought I was offering you some insight into why ideas might not be as straightforward as the inventor thought. I apologise if it came across to you as me "shooting down in flames" your idea...) Paul.

Actually I like this style :) don't mistake my "shooting down in flames" to be aggressive dislike. I have taken your specific advice on board, I was asking in a more general context. So I'm actually quite happy this worked out this way gives me a chance to learn On Fri, Oct 7, 2011 at 10:21 PM, Paul Moore <p.f.moore@gmail.com> wrote:

Jakob Bowyer writes:
0. Be familiar with the Zen (both the official "python -m this" and Apocrypha, such as "not every 3-line function needs to be a builtin"). Try to see how they apply to discussions you read even when not explicitly mentioned.
1. Do check the archives, of this list and python-dev. There are some amazingly good teachers here.
2. If you're worried that the question might be stupid or "obvious to the Dutch", you might float your trial balloon on python-list (aka comp.lang.python) first.
3. Make sure you know what the earlier problems with similar ideas were. At least that way you can often manage a soft landing. :-)
4. Don't let the experience stop you from trying again. There are no stupid questions -- except the unasked ones. (But see (2); maybe there's a more appropriate venue to ask the first time.)

On Oct 8, 6:54 am, Jakob Bowyer <jkb...@gmail.com> wrote:
An implementation is always helpful. It doesn't need to be complete but it will certainly help you iron out the idea and give everyone else something concrete to discuss. If it's possible/relevant to the suggestion, putting it up as a package on PyPI is also a good way to gauge usefulness.

On Fri, Oct 7, 2011 at 2:13 PM, Paul Moore <p.f.moore@gmail.com> wrote:
Associative: A * (B * C) == (A * B) * C
Commutative: A * B == B * A
Set multiplication as the Cartesian product is neither (as others have pointed out).
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
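Both properties can be checked directly. The sketch below uses a hypothetical helper, cart, that builds the pairwise product with itertools.product, which is the "obvious implementation" discussed in this thread; the single-element sets are arbitrary.

```python
from itertools import product


def cart(x, y):
    """Hypothetical helper: pairwise Cartesian product as a set of 2-tuples."""
    return set(product(x, y))


A, B, C = {1}, {"a"}, {2.5}

# Not commutative: the components of each pair swap places.
print(cart(A, B))  # {(1, 'a')}
print(cart(B, A))  # {('a', 1)}

# Not associative: the nesting ends up on different sides.
print(cart(cart(A, B), C))  # {((1, 'a'), 2.5)}
print(cart(A, cart(B, C)))  # {(1, ('a', 2.5))}
```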

On Fri, Oct 7, 2011 at 2:29 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Associativity is a problem. Especially because there are two reasonable interpretations of it ( (a, (b, c)) vs. (a, b, c)). Commutativity, not so much. We already have a number of non-commutative operators (-, /, // and % on most types), some of which behave that way in the real world of mathematics. We also have operators that are commutative on integers but not on other things (+ on lists and strings). <mike

On 10/7/2011 7:20 AM, Paul Moore wrote:
Math uses a different symbol, an 'X' without serifs, for cross-products. The result is a set of 'ordered pairs', which is different from a duple.
It is not *associative* -- unless one special-cases tuples to add non-tuple elements to tuple elements.
(the types of the elements in the 2 expressions above are different).
If the elements of A*B are sequences, then A*B is also not commutative. But it would be if the elements were sets instead of pairs.
That doesn't seem intuitive - so maybe a*b*c should be a set of 3-tuples. But how would that work?
In math, *...* is a ternary operator with 3 args, like if...else in Python or ?...: in C, but it generalizes to an n-ary operator. From https://secure.wikimedia.org/wikipedia/en/wiki/Cartesian_product ''' The Cartesian product can be generalized to the n-ary Cartesian product over n sets X1, ..., Xn: X_1\times\cdots\times X_n = \{(x_1, \ldots, x_n) : x_i \in X_i \}. It is a set of n-tuples. If tuples are defined as nested ordered pairs, it can be identified to (X1 × ... × Xn-1) × Xn.''' In other words, think of aXbXc as XX(a,b,c) and similarly for more Xs. In Python, better to define XX explicitly. One can even write the n-fold generalization by simulating n nested for loops.
In most situations, one really needs an iterator that produces the pairs one at a time rather than a complete collection. And when one does want a complete collection, one might often want it ordered as a list rather than unordered as a set. itertools.product covers all these use cases. And even that does not cover the n-ary case. For many combinatorial algorithms, one needs a 'cross-concatenation' operator defined on collections of sequences which adds the pair of sequences in each cross-product pair. The same 'x' symbol, possibly circled, has been used for that. Both are in Unicode, but Python is currently restricted to a sparse set of ASCII symbols. -- Terry Jan Reedy
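As a rough illustration of the "n nested for loops" idea, here is one way an explicit n-ary function could be sketched. The name xx mirrors the XX(a,b,c) notation above and is hypothetical; the sketch assumes every argument after the first can be iterated more than once (sets, lists, strings), and it yields flat n-tuples lazily.

```python
from itertools import product


def xx(*collections):
    """Hypothetical n-ary Cartesian product, yielding flat n-tuples lazily."""
    if not collections:
        # The product of zero collections has exactly one element: ().
        yield ()
        return
    first, *rest = collections
    for item in first:          # one "for loop" per collection...
        for tail in xx(*rest):  # ...with the remaining loops nested inside
            yield (item,) + tail


print(list(xx([1, 2], "ab")))
# [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]

# For re-iterable inputs this matches itertools.product.
print(list(xx([1, 2], "ab", [True])) == list(product([1, 2], "ab", [True])))
# True
```

Because the function takes all the collections at once, there is no question of how to nest the tuples, which is the problem a binary operator runs into.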

Terry Reedy writes:
If the elements of A*B are sequences, then A*B is also not commutative. But it would be if the elements were sets instead of pairs.
But that's not a Cartesian product. By definition in a Cartesian product order of element components matters. I don't think I've ever seen a set product like that, and have trouble imagining applications for it unmodified (typically when squaring a set the diagonal would cause problems).
In math, *...* is a ternary operator with 3 args, like if...else in Python or ?...: in C, but it generalizes to an n-ary operator.
A better analogy is to the comma or string concatenation. I don't know if that would lead to an associative implementation, though.
In most situations, one really needs an iterator that produces the pairs one at a time rather than a complete collection.
I don't see why you couldn't have an operator on two iterables that produces an iterator. But of course comprehension notation is hard to beat for that.
And when one does want a complete collection, one might often want it ordered as a list rather than unordered as a set.
I don't understand this. Sets are unordered; any order you impose on the product would be arbitrary. So iterate the product as a set, what else might be (commonly) wanted?

On 10/8/2011 3:42 AM, Georg Brandl wrote:
I could just say that they are different but nearly synonymous words used in different fields. A more precise answer depends on the meaning chosen for 'ordered pair' and '2-tuple'. If one takes 'ordered pair' as an undefined primitive (and generic) concept, then '2-tuple' is one specialization of the concept. https://secure.wikimedia.org/wikipedia/en/wiki/Ordered_pair says "Ordered pairs are also called 2-tuples, 2-dimensional vectors, or sequences of length 2." I take that as meaning that the latter three are specializations, as 'tuple' is definitely not the same as 'vector'. If one takes 'ordered pair' as a specialized set, then they are different for a different reason. Tuple is not a subclass of set, at least not in Python. In practice, the two classes often have different interfaces. The two members of ordered pairs are the first and second. They are extracted by two different functions. Lisp cons cells with car (first) and cdr (rest) functions are an example. The two members of 2-tuples are also the 0-1 or 1-2 members and are usually extracted by indexing, which is one function taking two parameters. Python duples with 0-1 indexing are an example. -- Terry Jan Reedy

On Fri, Oct 7, 2011 at 6:37 AM, Jakob Bowyer <jkbbwr@gmail.com> wrote:
There is that, but from a math point of view the syntax a * b does make sense. It's slightly clearer and makes more sense to **people from outside of a programming background**.
(emphasis added) These are not the only people writing and reading code, and decisions about syntax should favor improving the readability for coders across the board, not simply a single subset of them.
-- Read my blog! I depend on your acceptance of my opinion! I am interesting! http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy

On Fri, Oct 7, 2011 at 8:55 AM, Calvin Spealman <ironfroggy@gmail.com> wrote:
True. But unless there's another common meaning for multiplying sets, there are only two groups of people to consider: those who know it as the cross product, and those who have no idea what it might mean. The former will be surprised by the current situation when it doesn't work; the latter will have to look it up when they run into it. It's not really any worse than using + for string concatenation. Except for the associativity issue. <mike

Mike Meyer wrote:
But unless there's another common meaning for multiplying sets,
While there might not be another common meaning for multiplying sets, it's not necessarily obvious that '*' applied to sets in a programming language means 'multiplication'. For example, Pascal uses '+' and '*' for set union and intersection, IIRC. -- Greg

Jakob Bowyer wrote:
There is that, but from a math point of view the syntax a * b does make sense.
A problem with this is that it doesn't generalise smoothly to products of more than two sets. A mathematician would think of A x B x C as a set of 3-tuples, but in Python, A * B * C implemented the straightforward way would give you a set of 2-tuples, one element of which is another 2-tuple. Keeping it as a function allows products of arbitrarily many sets to be expressed naturally. -- Greg

Jakob Bowyer wrote:
I realise that the consensus is that the lack of associativity is a fatal problem with a Cartesian product operator, but there are at least two other issues I haven't seen. (1) "Using * for set product makes sense to mathematicians" -- maybe so, but those mathematicians already have to learn to use | instead of ∪ (union) and & instead of ∩ (intersection), so learning to use itertools.product() for Cartesian product is not a major burden for them. (2) Cartesian product is potentially very expensive. The Cartesian product of a moderate-sized set and another moderate-sized set could turn out to be a HUGE set. This is not a fatal objection, since other operations in Python are potentially expensive: alist*10000000 but at least it looks expensive. You're multiplying by a big number, of course you're going to require a lot of memory. But set multiplication can very easily creep up on you: aset*bset will have size len(aset)*len(bset) which may be huge even if neither set on their own is. Better to keep it as a lazy iterator rather than try to generate a potentially huge set in one go. -- Steven
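The size argument can be illustrated with a short sketch. The set sizes below are arbitrary; the point is that itertools.product yields pairs one at a time (it stores copies of its inputs, but not the full result), so a few pairs can be inspected without building millions of tuples.

```python
from itertools import product

# Two "moderate-sized" sets; the sizes are arbitrary for this example.
aset = set(range(2000))
bset = set(range(3000))

# The full product would hold len(aset) * len(bset) pairs.
expected_size = len(aset) * len(bset)
print(expected_size)  # 6000000

# product() is lazy: only the pairs actually requested are created.
pairs = product(aset, bset)
first_three = [next(pairs) for _ in range(3)]
print(len(first_three))  # 3
```

Calling set(pairs) or list(pairs) at this point would materialise the remaining pairs in memory, which is the trap described above.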

Guido van Rossum wrote:
I didn't say it was a *good* argument <wink> I already acknowledged that there are expensive operations in Python, and some of them are done by operators. Perhaps I'm just over-sensitive to the risk of large Cartesian products, having locked up my desktop with a foolish list(product(seta, setb)) in exactly the circumstances I described above: both sets were moderate sizes, and it never dawned on me until my PC ground to a halt that the product would be so much bigger. (I blame myself for this error: I should know better than to carelessly pass an iterator to list without thinking, which is exactly what I did.) In my experience, most uses of list multiplication look something like this: [0]*len(arg) which is not huge except in the extreme case that arg is already huge. But the typical use of set multiplication is surely going to be something like: arg1*arg2 which may be huge even if neither of the args are. I don't think Cartesian product is important enough, or fundamental enough, to justify making it easier to inadvertently generate a huge set by mistake. That was all I tried to say. -- Steven

On Fri, Oct 7, 2011 at 12:35, Paul Moore <p.f.moore@gmail.com> wrote:
I don't think an operator form is sufficiently valuable when the functionality is available and clear enough already.
I'm not sure; if set multiplication is highly unambiguous (i.e. the Cartesian product is the only logical outcome, and there is not some other common multiplication-like operation on sets), then it seems to me that supporting the multiplication operator for the Cartesian product of sets would be sensible. Cheers, Dirkjan

On Fri, 7 Oct 2011 11:46:34 +0100 Jakob Bowyer <jkbbwr@gmail.com> wrote:
As far as I know and from asking my lecturer, multiplication only produces Cartesian products.
Given that multiplying a list or tuple repeats the sequence, there may be a certain amount of confusion. Also, I don't think itertools.product is common enough to warrant an operator. There's a very readable alternative:
Or, in the case where you only want to iterate, two nested loops will suffice and avoid building the container. Regards Antoine.
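The snippet that originally followed "a very readable alternative" is not preserved in this archive. As an illustration only (not necessarily what was posted), the two approaches mentioned in this thread, a set comprehension and two nested loops, might look like this; the names x and y and their contents are made up.

```python
x = {"amy", "martin"}
y = {"smith", "jones"}

# Set comprehension: builds the whole product as a set of pairs.
pairs = {(a, b) for a in x for b in y}
print(len(pairs))  # 4

# Nested loops: visit each pair without building any container.
total_chars = 0
for a in x:
    for b in y:
        total_chars += len(a) + len(b)
print(total_chars)  # 38
```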

On Fri, Oct 7, 2011 at 6:24 AM, Jakob Bowyer <jkbbwr@gmail.com> wrote:
This idea might be aesthetically pleasing from a mathematical viewpoint, but it does not help people write better programs. It does not provide anything better than the status quo. In fact, it adds an obscure behavior that needs to be maintained, taught, and understood, making Python ever-so-slightly worse. -1 Mike

On Oct 7, 2011, at 6:24 AM, Jakob Bowyer wrote:
Looking at this from a Math background, it seems that it would be nice for the set type to support multiplication. This would allow for the multiplication of sets to produce Cartesian products giving every possible pairing of elements.
-1 We already have multiple ways to do it (set comprehensions, itertools.product, ...). Also, it's much nicer to have an iterator than to fill memory with lots of little sets. Also, it is unclear what s*s*s should do. Probably, the user would expect {(a,a,a), (a,a,b), ..} but the way you've proposed it, they would get {((a,a),a), ((a,a),b), ...} and have an unpleasant surprise. Raymond

On 7 October 2011 11:37, Jakob Bowyer <jkbbwr@gmail.com> wrote:
I'm not sure I'd agree, even though I come from a maths background. Explicit is better than implicit and all that... Even if it is slightly clearer to some people, I bet there are others (not from a mathematical background) who would be confused by it. And in that case, itertools.product is easier to google for than "*"...) And that's ignoring the cost of implementing, testing, documenting the change. Actually, just to give a flavour of the sorts of design decisions that would need to be considered, consider this:
So your multiplication isn't commutative (the types of the elements in the 2 expressions above are different). That doesn't seem intuitive - so maybe a*b*c should be a set of 3-tuples. But how would that work? The problem very quickly becomes a lot larger than you first assume. Operator overloading is used much more sparingly in Python than in, say, C++. It's as much a language style issue as anything else. Sorry, but I still don't see enough benefit to justify this. Paul.

I don't think having itertools.product is a good reason for not overloading the operator. The same argument could be said against having listA + listB or listA * 10. After all, those can all be done with list comprehensions and itertools aswell. itertools.product(a, b) or a list comprehension work fine for 2 sets, but if you try doing that for any significant number (which i presume the OP is), maybe 5 set operations in one expression, it quickly becomes completely unreadable: itertools.product(itertools.product(seta, setb.union(setc), setd.difference(sete)) vs seta * (setb | setc) * (setd & sete) We already have operators overloaded for union |, intersect &, difference -, symmetric difference ^. Having an operator for product would fit in perfectly if it could be done properly; ensuring that it is commutative looks non-trivial. On Fri, Oct 7, 2011 at 7:41 AM, Jakob Bowyer <jkbbwr@gmail.com> wrote:

On Fri, Oct 7, 2011 at 11:28 AM, Haoyi Li <haoyi.sg@gmail.com> wrote:
I expect what you actually intended here was: cp = product(seta, setb|setc, setd&sete) If you actually did want two distinct cartesian products, then the two line version is significantly easier to read: cp_a = product(setb|setc, setd&sete) cp_b = product(seta, cp_a) The ambiguity of chained multiplication is more than enough to kill the idea (although it was definitely worth asking the question). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Paul Moore writes:
So your multiplication isn't commutative (the types of the elements in the 2 expressions above are different).
No, set multiplication is not going to be commutative: {'a'} X {1} is quite different from {1} X {'a'}. The problem you're pointing out is much worse: the obvious implementation of Cartesian product isn't even associative. -1 on an operator for this.

On 7 October 2011 18:15, Stephen J. Turnbull <turnbull@sk.tsukuba.ac.jp> wrote:
Bah. For all my claims of having a mathematical background (OK, so it was 30 years ago :-)) I can't even remember the difference between associativity and commutativity. I'm going for a lie down now... :-) Paul.

Well it was fun watching the process that is python-ideas and the shooting down in flames that happens here. Can someone give me some advice about what to think about/do for the next time that an idea comes to mind? On Fri, Oct 7, 2011 at 7:13 PM, Paul Moore <p.f.moore@gmail.com> wrote:

On Fri, Oct 7, 2011 at 1:57 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
Actually, that question makes me think that a meta-PEP on the process of vetting an idea before starting a real PEP might be worthwhile. In particular, that people should come up with real-world use cases and try and think of the worst abuses that might occur, etc. <mike

On Fri, Oct 7, 2011 at 5:30 PM, Mike Meyer <mwm@mired.org> wrote:
The process is basically 'ask on python-ideas if a web search doesn't show that it has already been proposed'. The next step will vary from "not going to happen" through "file a feature request on the tracker" to "write a PEP". Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Fri, Oct 7, 2011 at 2:37 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
True, but not very useful. The idea would be to discuss what happens between posting and "the next step". People may well ask for use cases, look at abuses, do searches of the library for them (or ask you to do so), etc. <mike

On 7 October 2011 21:54, Jakob Bowyer <jkbbwr@gmail.com> wrote:
About the same, but be prepared for not everyone to be as enthusiastic about your idea as you are, and be open to the possibility that some of the objections are valid... (BTW, I thought I was offering you some insight into why ideas might not be as straightforward as the inventor thought. I apologise if it came across to you as me "shooting down in flames" your idea...) Paul.

Actually I like this style :) don't mistake my "shooting down in flames" to be aggressive dislike. I have taken your specific advice on board, I was asking in a more general context. So I'm actually quite happy this worked out this way gives me a chance to learn On Fri, Oct 7, 2011 at 10:21 PM, Paul Moore <p.f.moore@gmail.com> wrote:

Jakob Bowyer writes:
0. Be familiar with the Zen (both the official "python -m this" and Apocrypha, such as "not every 3-line function needs to be a builtin"). Try to see how they apply to discussions you read even when not explicitly mentioned. 1. Do check the archives, of this list and python-dev. There are some amazingly good teachers here. 2. If you're worried that the question might stupid or "obvious to the Dutch", you might float your trial balloon on python-list (aka comp.lang.python) first. 3. Make sure you know what the earlier problems with similar ideas were. At least that way you can often manage a soft landing. :-) 4. Don't let the experience stop you from trying again. There are no stupid questions -- except the unasked ones. (But see (2); maybe there's a more appropriate venue to ask the first time.)

On Oct 8, 6:54 am, Jakob Bowyer <jkb...@gmail.com> wrote:
An implementation is always helpful. It doesn't need to be complete but it will certainly help you iron out the idea and give everyone else something concrete to discuss. If it's possible/relevant to the suggestion, putting it up as a package on PyPI is also a good way to gauge usefulness.

On Fri, Oct 7, 2011 at 2:13 PM, Paul Moore <p.f.moore@gmail.com> wrote:
Associative: A * (B * C) == (A * B) * C Commutative: A * B == B * A Set multiplication as the Cartesian product is neither (as others have pointed out). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Fri, Oct 7, 2011 at 2:29 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Associative is a problem. Especially because there are two reasonable interpretations of it ( (a, (b, c)) vs. (a, b, c)). Commutative, not so much. We already have a number of non-commutative operators (-, /, // and % on most types), some of which behave that way in the real world of mathematics. We also have operators that are commutative on integers but not on other things (+ on lists and string). <mike

On 10/7/2011 7:20 AM, Paul Moore wrote:
Math used a different symbol, an 'X' without serifs, for cross-products. The result is a set of 'ordered pairs', which is different from a duple.
It is not *associative* -- unless one special-cases tuples to add non-tuple elements to tuple elements.
(the types of the elements in the 2 expressions above are different).
If the elements of A*B are sequences, then A*B is also not commutative. But it would be if the elements were sets instead of pairs.
That doesn't seem intuitive - so maybe a*b*c should be a set of 3-tuples. But how would that work?
In math, *...* is a ternary operator with 3 args, like if...else in Python or ;...: in C, but it generalizes to an n-ary operator. From https://secure.wikimedia.org/wikipedia/en/wiki/Cartesian_product ''' The Cartesian product can be generalized to the n-ary Cartesian product over n sets X1, ..., Xn: X_1\times\cdots\times X_n = \{(x_1, \ldots, x_n) : x_i \in X_i \}. It is a set of n-tuples. If tuples are defined as nested ordered pairs, it can be identified to (X1 × ... × Xn-1) × Xn.''' In other words, think of aXbXc as XX(a,b,c) and similarly for more Xs. In Python, better to define XX explicitly. One can even write the n-fold generalization by simulating n nested for loops.
In most situations, one really needs an iterator that produces the pairs one at a time rather than a complete collection. And when one does want a complete collection, one might often want it ordered as a list rather than unordered as a set. Itertools.product covers all these use cases. And even that does not cover the n-ary case. For many combinatorial algorithms, one need a 'cross-concatenation' operator defined on collections of sequences which adds the pair of sequences in each cross-product pair. The same 'x' symbol, possibly circled, has been used for that. Both are in unicode, but Python is currently restricted to a sparse set of ascii symbols. -- Terry Jan Reedy

Terry Reedy writes:
If the elements of A*B are sequences, then A*B is also not commutative. But it would be if the elements were sets instead of pairs.
But that's not a Cartesian product. By definition in a Cartesian product order of element components matters. I don't think I've ever seen a set product like that, and have trouble imagining applications for it unmodified (typically when squaring a set the diagonal would cause problems).
In math, *...* is a ternary operator with 3 args, like if...else in Python or ;...: in C, but it generalizes to an n-ary operator.
A better analogy is to the comma or string concatenation. I don't know if that would lead to an associative implementation, though.
In most situations, one really needs an iterator that produces the pairs one at a time rather than a complete collection.
I don't see why you couldn't have an operator on two iterables that produces an iterator. But of course comprehension notation is hard to beat for that.
And when one does want a complete collection, one might often want it ordered as a list rather than unordered as a set.
I don't understand this. Sets are unordered; any order you impose on the product would be arbitrary. So iterate the product as a set, what else might be (commonly) wanted?

On 10/8/2011 3:42 AM, Georg Brandl wrote:
I could just say that they are different but nearly synonymous words used in different fields. A more precise answer depends on the meaning chosen for 'ordered pair' and '2-tuple'. If one takes 'ordered pair' as an undefined primitive (and generic) concept, then '2-tuple' is one specialization of the concept. https://secure.wikimedia.org/wikipedia/en/wiki/Ordered_pair says "Ordered pairs are also called 2-tuples, 2-dimensional vectors, or sequences of length 2." I take that as meaning that the latter three are specializations, as 'tuple' is definitely not the same as 'vector'. If one takes 'ordered pair' as a specialized set, then they different for a different reason. Tuple is not a subclass of set, at least not in Python. In practice, the two classes often have different interfaces. The two members of ordered pairs are the first and second. They are extracted by two different functions. Lisp cons cells with car (first) and cdr (rest) functions are an example. The two members of 2-tuples are also the 0-1 or 1-2 members and are usually extracted by indexing, which is one function taking two parameters. Python duples with 0-1 indexing are an example. -- Terry Jan Reedy

On Fri, Oct 7, 2011 at 6:37 AM, Jakob Bowyer <jkbbwr@gmail.com> wrote:
There is that but from a math point of view the syntax a * b does make sence. Its slightly clearer and makes more sense to **people from outside of a programming background**.
(emphasis added) These are not the only people writing and reading code, and decisions about syntax should favor improving the readability for coders across the board, not simply a single subset of them.
-- Read my blog! I depend on your acceptance of my opinion! I am interesting! http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy

On Fri, Oct 7, 2011 at 8:55 AM, Calvin Spealman <ironfroggy@gmail.com>wrote:
True. But unless there's another common meaning for multiplying sets, there are only two groups of people to consider: Those who know it as the cross product, and those who have no idea what it might mean. They former will be surprised by the current situation when it doesn't work, the latter will have to look it up when they run into it. it's not really any worse than using + for string concatenation. Except for the associativity issue. <mike

Mike Meyer wrote:
But unless there's another common meaning for multiplying sets,
While there might not be another common meaning for multiplying sets, it's not necessarily obvious that '*' applied to sets in a programming language means 'multiplication'. For example, Pascal uses '+' and '*' for set union and intersection, IIRC. -- Greg

Jakob Bowyer wrote:
There is that, but from a math point of view the syntax a * b does make sense.
A problem with this is that it doesn't generalise smoothly to products of more than two sets. A mathematician would think of A x B x C as a set of 3-tuples, but in Python, A * B * C implemented the straightforward way would give you a set of 2-tuples, one element of which is another 2-tuple. Keeping it as a function allows products of arbitrarily many sets to be expressed naturally. -- Greg
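To make the nesting concrete, here is a minimal sketch. `ProductSet` is a hypothetical subclass invented for illustration (nothing like it exists in Python), and its `__mul__` is only one guess at what "the straightforward way" would look like; it is compared with the existing `itertools.product`.

```python
from itertools import product


class ProductSet(set):
    """Hypothetical set subclass: `*` is a naive pairwise Cartesian product."""

    def __mul__(self, other):
        # Only 2-tuples are ever built: one slot per operand.
        return ProductSet((x, y) for x in self for y in other)


A = ProductSet({1})
B = ProductSet({2})
C = ProductSet({3})

left = A * B * C               # evaluated as (A * B) * C
right = A * (B * C)            # same sets, different grouping
flat = set(product(A, B, C))   # the function form takes any number of sets

print(set(left))   # {((1, 2), 3)}
print(set(right))  # {(1, (2, 3))}
print(flat)        # {(1, 2, 3)}
```

The two groupings give different results, and neither matches the flat 3-tuples a mathematician would expect.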

Jakob Bowyer wrote:
I realise that the consensus is that the lack of associativity is a fatal problem with a Cartesian product operator, but there are at least two other issues I haven't seen. (1) "Using * for set product makes sense to mathematicians" -- maybe so, but those mathematicians already have to learn to use | instead of ∪ (union) and & instead of ∩ (intersection), so learning to use itertools.product() for Cartesian product is not a major burden for them. (2) Cartesian product is potentially very expensive. The Cartesian product of a moderate-sized set and another moderate-sized set could turn out to be a HUGE set. This is not a fatal objection, since other operations in Python are potentially expensive: alist*10000000 but at least it looks expensive. You're multiplying by a big number, of course you're going to require a lot of memory. But set multiplication can very easily creep up on you: aset*bset will have size len(aset)*len(bset) which may be huge even if neither set on their own is. Better to keep it as a lazy iterator rather than try to generate a potentially huge set in one go. -- Steven
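A rough sketch of point (2), with set sizes chosen arbitrarily for illustration: two sets of a few thousand elements each already imply millions of pairs, while the iterator returned by `itertools.product` lets you consume only what you need.

```python
from itertools import islice, product

aset = set(range(3000))
bset = set(range(3000))

# The full product would contain len(aset) * len(bset) pairs.
print(len(aset) * len(bset))  # 9000000

# product() copies its inputs into tuples up front, but yields the pairs lazily.
pairs = product(aset, bset)

# Consume only the first few pairs; the remaining ones are never created.
first_five = list(islice(pairs, 5))
print(len(first_five))  # 5
```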

Guido van Rossum wrote:
I didn't say it was a *good* argument <wink> I already acknowledged that there are expensive operations in Python, and some of them are done by operators. Perhaps I'm just over-sensitive to the risk of large Cartesian products, having locked up my desktop with a foolish list(product(seta, setb)) in exactly the circumstances I described above: both sets were moderate sizes, and it never dawned on me until my PC ground to a halt that the product would be so much bigger. (I blame myself for this error: I should know better than to carelessly pass an iterator to list without thinking, which is exactly what I did.) In my experience, most uses of list multiplication look something like this: [0]*len(arg) which is not huge except in the extreme case that arg is already huge. But the typical use of set multiplication is surely going to be something like: arg1*arg2 which may be huge even if neither of the args are. I don't think Cartesian product is important enough, or fundamental enough, to justify making it easier to inadvertently generate a huge set by mistake. That was all I tried to say. -- Steven

On Fri, Oct 7, 2011 at 12:35, Paul Moore <p.f.moore@gmail.com> wrote:
I don't think an operator form is sufficiently valuable when the functionality is available and clear enough already.
I'm not sure; if set multiplication is highly unambiguous (i.e. the Cartesian product is the only logical outcome, and there is not some other common multiplication-like operation on sets), then it seems to me that supporting the multiplication operator for the Cartesian product of sets would be sensible. Cheers, Dirkjan

On Fri, 7 Oct 2011 11:46:34 +0100 Jakob Bowyer <jkbbwr@gmail.com> wrote:
As far as I know and from asking my lecturer, multiplication only produces Cartesian products.
Given that multiplying a list or tuple repeats the sequence, there may be a certain amount of confusion. Also, I don't think itertools.product is common enough to warrant an operator. There's a very readable alternative:
Or, in the case where you only want to iterate, two nested loops will suffice and avoid building the container. Regards Antoine.
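A sketch of what the two approaches mentioned above might look like. This is an assumption about their general shape, not necessarily the code Antoine had in mind; the example data is borrowed from the opening message.

```python
a = {"amy", "martin"}
b = {"smith", "jones", "john"}

# Set comprehension: builds the whole Cartesian product as a set of 2-tuples.
c = {(x, y) for x in a for y in b}
print(len(c))  # 6

# Nested loops: visit each pair without building a container.
count = 0
for x in a:
    for y in b:
        count += 1
print(count)  # 6
```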

On Fri, Oct 7, 2011 at 6:24 AM, Jakob Bowyer <jkbbwr@gmail.com> wrote:
This idea might be aesthetically pleasing from a mathematical viewpoint, but it does not help people write better programs. It does not provide anything better than the status quo. In fact, it adds an obscure behavior that needs to be maintained, taught, and understood, making Python ever-so-slightly worse. -1 Mike

On Oct 7, 2011, at 6:24 AM, Jakob Bowyer wrote:
Looking at this from a Math background, it seems that it would be nice for the set type to support multiplication. This would allow for the multiplication of sets to produce Cartesian products giving every single permutation.
-1 We already have multiple ways to do it (set comprehensions, itertools.product, ...). Also, it's much nicer to have an iterator than to fill memory with lots of little sets. Also, it is unclear what s*s*s should do. Probably, the user would expect {(a,a,a), (a,a,b), ..} but the way you've proposed it, they would get {((a,a),a), ((a,a),b), ...} and have an unpleasant surprise. Raymond
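For comparison, a sketch of how the flat triples can already be obtained with `itertools.product` and its `repeat` argument, next to the nested shape that applying a pairwise product twice would produce. The set `s` is an arbitrary example.

```python
from itertools import product

s = {"a", "b"}

# Flat 3-tuples: what a user would probably expect from s*s*s.
flat = set(product(s, repeat=3))

# Nested pairs: what a pairwise product applied twice would produce.
pairs = {(x, y) for x in s for y in s}
nested = {(p, z) for p in pairs for z in s}

print(len(flat), len(nested))        # 8 8
print(("a", "a", "a") in flat)       # True
print((("a", "a"), "a") in nested)   # True
print(flat == nested)                # False
```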
participants (21)
- alex23
- Amaury Forgeot d'Arc
- Antoine Pitrou
- Calvin Spealman
- Dirkjan Ochtman
- Ethan Furman
- Georg Brandl
- Greg Ewing
- Guido van Rossum
- Haoyi Li
- Jakob Bowyer
- Mike Graham
- Mike Meyer
- MRAB
- Nick Coghlan
- Paul Moore
- Raymond Hettinger
- Stephen J. Turnbull
- Stephen J. Turnbull
- Steven D'Aprano
- Terry Reedy