
The other day a colleague of mine submitted this challenge taken from some website to us coworkers: Have the function ArrayAddition(arr) take the array of numbers stored in arr and print true if any combination of numbers in the array can be added up to equal the largest number in the array, otherwise print false. For example: if arr contains [4, 6, 23, 10, 1, 3] the output should print true because 4 + 6 + 10 + 3 = 23. The array will not be empty, will not contain all the same elements, and may contain negative numbers. Examples: 5,7,16,1,2 3,5,-1,8,12 I proposed this solution: from itertools import combinations, chain def array_test(arr): biggest = max(arr) arr.remove(biggest) my_comb = chain(*(combinations(arr,i) for i in range(1,len(arr)+1))) for one_comb in my_comb: if sum(one_comb) == biggest: return True, one_comb return False But then somebody else submitted this Haskell solution: import Data.List f :: (Ord a,Num a) => [a] -> Bool f lst = (\y -> elem y $ map sum $ subsequences $ [ x | x <- lst, x /= y ]) $ maximum lst I wonder if we should add a subsequences function to itertools or make the "r" parameter of combinations optional to return all combinations up to len(iterable).

On Mon, Apr 23, 2012 at 11:07 AM, Nestor <nestornissen@gmail.com> wrote:
Why? It just makes itertools using code that much harder to read, since you'd have yet another variant to learn. If you need it, then just define a separate "all_combinations()" that makes it clear what is going on (replace yield from usage with itertools.chain() for Python < 3.3): from itertools import combinations def all_combinations(data): for num_items in range(1, len(data)+1): yield from combinations(data, num_items) def array_test(arr): biggest = max(arr) data = [x for x in arr if x != biggest] for combo in all_combinations(data): if sum(combo) == biggest: return True, combo return False, None When a gain in brevity increases the necessary level of assumed knowledge for future maintainers, it isn't a clear win from a language design point of view. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 4/22/2012 9:07 PM, Nestor wrote:
The other day a colleague of mine submitted this challenge taken from some website to us coworkers:
This is more of a python-list post than a python-ideas post. So is my response.
Since the order of the numbers is arbitrary and irrelevant to the problem, it should be formulated in term of a set of numbers. Non-empty means that max(set) exists. No duplicates means that max(set) is unique and there is no question of whether to remove one or all copies.
Examples: 5,7,16,1,2
False (5+7+1+2 == 15)
3,5,-1,8,12
True (8+5-1 == 12)
I believe the above works for sets as well as lists.
The above is must clearer to me than the Haskell (or my equivalent) below, and I suspect that would be true even if I really knew Haskell. If you want something more like the Haskell version, try return any(filter(x == biggest, map(sum, my_comb))) To make it even more like a one-liner, replace my_comb with its expression.
If you really want one logical line that I believe matches the above ;-) def sum_to_max(numbers): return ( (lambda num_max: (lambda nums: any(filter(lambda n: n == num_max, map(sum, chain(*(combinations(nums,i) for i in range(1,len(nums)+1)))))) ) (numbers - {num_max}) # because numbers.remove(num_max)is None ) (max(numbers)) ) print(sum_to_max({5,7,16,1,2}) == False, sum_to_max({3,5,-1,8,12}) == True) # sets because of numbers - {num_max} True, True Comment 1: debugging nested expressions is harder than debugging a sequence of statements because cannot insert print statements. Comment 2: the function should really return an iterator that yields the sets that add to the max. Then the user can decide whether or not to collapse the information to True/False and stop after the first.
I think not. The goal of itertools is to provide basic iterators that can be combined to produce specific iterators, just as you did. -- Terry Jan Reedy

On 4/22/2012 11:18 PM, Chris Rebert wrote:
Er, yes. Given the examples, I (too quickly) misread 'will not contain all the same elements' as 'no duplicates'. In any case, a set was needed for the functional version as there is no 'list1 - list2' expression that returns the list1 minus the items in list2. (Well, I could have defined an auxiliary list sub function, but that is beside the point of the example.) -- Terry Jan Reedy

Nestor wrote: [...]
You really don't do your idea any favours by painting it as "Haskell envy". I have mixed ideas on this. On the one hand, subsequences is a natural function to include along with permutations and combinations in any combinatorics tool set. As you point out, Haskell has it. So does Mathematica, under the name "subsets". But on the other hand, itertools is arguably not the right place for it. If we add subsequences, will people then ask for partitions and derangements next? Where will it end? At some point the line needs to be drawn, with some functions declared "too specialised" for the general itertools module. (Perhaps there should be a separate combinatorics module.) And on the third hand, what you want is a simple two-liner, easy enough to write in place: from itertools import chain, combinations allcombs = chain(*(combinations(data, i) for i in range(len(data)+1))) which is close to what your code used. However, the discoverability of this solution is essentially zero (if you don't know how to do this, it's hard to find out), and it is hardly as self-documenting as from itertools import subsequences allcombs = subsequences(data) Overall, I would vote +0.5 on a subsequences function. -- Steven

On 23 April 2012 17:52, Steven D'Aprano <steve@pearwood.info> wrote:
It seems to me that you're precisely right here - it's *not* right to keep adding this sort of thing to itertools, or as you say it will never end. On the other hand, it's potentially useful, and not necessarily immediately obvious. So I'd say it's exactly right for a 3rd party "combinatorics" module on PyPI. Or if no-one thinks it's sufficiently useful to write and maintain one, then as a simple named function in your application. Once your application has a combinatorics module within it, that's the point where you should think about releasing that module separately... Of course, I'm arguing theoretically here. I've never even used itertools.combinations, so I have no real need for *any* of this. If people who do use this sort of thing on a regular basis have other opinions, then that's a much stronger argument :-) Paul.

On Tue, Apr 24, 2012 at 4:50 AM, Paul Moore <p.f.moore@gmail.com> wrote:
In practice, you'll be doing prefiltering and other conditioning on your combinations to weed out nonsensical variants cheaply, or your combinations will be coming from a database query or map reduce result. So I personally think it's the kind of thing that comes up in toy programming exercises and various academic explorations rather than something that solves a significant practical need. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Why not repeat the same practice as is currently in place for range, enumerate, etc. Take 3 arguments, the iterables, the repeat minimum size and the repeat maximum size. If the third argument is None, just treat the minimum as the maximum (yielding the current behavior). Most things in python have gobs of keyword arguments and most of itertools has relatively few. I don't think it is going to harm anyone to give them a few more toys to play with.

On Apr 23, 2012, at 9:52 AM, Steven D'Aprano wrote:
However, the discoverability of this solution is essentially zero
That exact code has been in the documentation for years: def powerset(iterable): "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)" s = list(iterable) return chain.from_iterable(combinations(s, r) for r in range(len(s)+1)) The whole purpose of the itertools recipes are to teach how the itertools can be readily combined to build new tools. http://docs.python.org/library/itertools.html#module-itertools Raymond

On 4/23/2012 3:55 PM, Raymond Hettinger wrote:
Raymond's "that code has been in the docs for years," and Steven's "the discoverability of this solution is essentially zero" are not contradictions. It sounds like we need a better way to find the information in the itertools docs. For example, there is no index entry for "powerset", and I don't know what term Steven tried looking it up with. Sounds like you two could work together to make people more aware of the tools we've already got. --Ned.

Nestor wrote:
By the way, your solution is wrong. Consider this sample data: -2, -5, 0, -1, -3 In this case, the Haskell solution should correctly print true, while yours will print false, because you skip the empty subset. The empty sum equals the maximum value of the set, 0. The ease at which people can get this wrong is an argument in favour of a standard solution. -- Steven

On Mon, Apr 23, 2012 at 11:07 AM, Nestor <nestornissen@gmail.com> wrote:
Why? It just makes itertools using code that much harder to read, since you'd have yet another variant to learn. If you need it, then just define a separate "all_combinations()" that makes it clear what is going on (replace yield from usage with itertools.chain() for Python < 3.3): from itertools import combinations def all_combinations(data): for num_items in range(1, len(data)+1): yield from combinations(data, num_items) def array_test(arr): biggest = max(arr) data = [x for x in arr if x != biggest] for combo in all_combinations(data): if sum(combo) == biggest: return True, combo return False, None When a gain in brevity increases the necessary level of assumed knowledge for future maintainers, it isn't a clear win from a language design point of view. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 4/22/2012 9:07 PM, Nestor wrote:
The other day a colleague of mine submitted this challenge taken from some website to us coworkers:
This is more of a python-list post than a python-ideas post. So is my response.
Since the order of the numbers is arbitrary and irrelevant to the problem, it should be formulated in term of a set of numbers. Non-empty means that max(set) exists. No duplicates means that max(set) is unique and there is no question of whether to remove one or all copies.
Examples: 5,7,16,1,2
False (5+7+1+2 == 15)
3,5,-1,8,12
True (8+5-1 == 12)
I believe the above works for sets as well as lists.
The above is must clearer to me than the Haskell (or my equivalent) below, and I suspect that would be true even if I really knew Haskell. If you want something more like the Haskell version, try return any(filter(x == biggest, map(sum, my_comb))) To make it even more like a one-liner, replace my_comb with its expression.
If you really want one logical line that I believe matches the above ;-) def sum_to_max(numbers): return ( (lambda num_max: (lambda nums: any(filter(lambda n: n == num_max, map(sum, chain(*(combinations(nums,i) for i in range(1,len(nums)+1)))))) ) (numbers - {num_max}) # because numbers.remove(num_max)is None ) (max(numbers)) ) print(sum_to_max({5,7,16,1,2}) == False, sum_to_max({3,5,-1,8,12}) == True) # sets because of numbers - {num_max} True, True Comment 1: debugging nested expressions is harder than debugging a sequence of statements because cannot insert print statements. Comment 2: the function should really return an iterator that yields the sets that add to the max. Then the user can decide whether or not to collapse the information to True/False and stop after the first.
I think not. The goal of itertools is to provide basic iterators that can be combined to produce specific iterators, just as you did. -- Terry Jan Reedy

On 4/22/2012 11:18 PM, Chris Rebert wrote:
Er, yes. Given the examples, I (too quickly) misread 'will not contain all the same elements' as 'no duplicates'. In any case, a set was needed for the functional version as there is no 'list1 - list2' expression that returns the list1 minus the items in list2. (Well, I could have defined an auxiliary list sub function, but that is beside the point of the example.) -- Terry Jan Reedy

Nestor wrote: [...]
You really don't do your idea any favours by painting it as "Haskell envy". I have mixed ideas on this. On the one hand, subsequences is a natural function to include along with permutations and combinations in any combinatorics tool set. As you point out, Haskell has it. So does Mathematica, under the name "subsets". But on the other hand, itertools is arguably not the right place for it. If we add subsequences, will people then ask for partitions and derangements next? Where will it end? At some point the line needs to be drawn, with some functions declared "too specialised" for the general itertools module. (Perhaps there should be a separate combinatorics module.) And on the third hand, what you want is a simple two-liner, easy enough to write in place: from itertools import chain, combinations allcombs = chain(*(combinations(data, i) for i in range(len(data)+1))) which is close to what your code used. However, the discoverability of this solution is essentially zero (if you don't know how to do this, it's hard to find out), and it is hardly as self-documenting as from itertools import subsequences allcombs = subsequences(data) Overall, I would vote +0.5 on a subsequences function. -- Steven

On 23 April 2012 17:52, Steven D'Aprano <steve@pearwood.info> wrote:
It seems to me that you're precisely right here - it's *not* right to keep adding this sort of thing to itertools, or as you say it will never end. On the other hand, it's potentially useful, and not necessarily immediately obvious. So I'd say it's exactly right for a 3rd party "combinatorics" module on PyPI. Or if no-one thinks it's sufficiently useful to write and maintain one, then as a simple named function in your application. Once your application has a combinatorics module within it, that's the point where you should think about releasing that module separately... Of course, I'm arguing theoretically here. I've never even used itertools.combinations, so I have no real need for *any* of this. If people who do use this sort of thing on a regular basis have other opinions, then that's a much stronger argument :-) Paul.

On Tue, Apr 24, 2012 at 4:50 AM, Paul Moore <p.f.moore@gmail.com> wrote:
In practice, you'll be doing prefiltering and other conditioning on your combinations to weed out nonsensical variants cheaply, or your combinations will be coming from a database query or map reduce result. So I personally think it's the kind of thing that comes up in toy programming exercises and various academic explorations rather than something that solves a significant practical need. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Why not repeat the same practice as is currently in place for range, enumerate, etc. Take 3 arguments, the iterables, the repeat minimum size and the repeat maximum size. If the third argument is None, just treat the minimum as the maximum (yielding the current behavior). Most things in python have gobs of keyword arguments and most of itertools has relatively few. I don't think it is going to harm anyone to give them a few more toys to play with.

On Apr 23, 2012, at 9:52 AM, Steven D'Aprano wrote:
However, the discoverability of this solution is essentially zero
That exact code has been in the documentation for years: def powerset(iterable): "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)" s = list(iterable) return chain.from_iterable(combinations(s, r) for r in range(len(s)+1)) The whole purpose of the itertools recipes are to teach how the itertools can be readily combined to build new tools. http://docs.python.org/library/itertools.html#module-itertools Raymond

On 4/23/2012 3:55 PM, Raymond Hettinger wrote:
Raymond's "that code has been in the docs for years," and Steven's "the discoverability of this solution is essentially zero" are not contradictions. It sounds like we need a better way to find the information in the itertools docs. For example, there is no index entry for "powerset", and I don't know what term Steven tried looking it up with. Sounds like you two could work together to make people more aware of the tools we've already got. --Ned.

Nestor wrote:
By the way, your solution is wrong. Consider this sample data: -2, -5, 0, -1, -3 In this case, the Haskell solution should correctly print true, while yours will print false, because you skip the empty subset. The empty sum equals the maximum value of the set, 0. The ease at which people can get this wrong is an argument in favour of a standard solution. -- Steven
participants (10)
-
Chris Rebert
-
Devin Jeanpierre
-
Nathan Rice
-
Ned Batchelder
-
Nestor
-
Nick Coghlan
-
Paul Moore
-
Raymond Hettinger
-
Steven D'Aprano
-
Terry Reedy