Smarter algo, was Re: 03 digression by brute force
Avi Gross
avigross at verizon.net
Fri Dec 14 19:13:18 EST 2018
REAL SUBJECT: Analysis of alternate algorithms.
Peter & Jach and anyone interested,
As Peter said in his altered subject line, Jack changed directions from tweaking an algorithm to trying something quite different.
Reminder of the problem.
Horizontal View:
SEND + MORE = MONEY
Vertical View:
SEND
+MORE
...........
MONEY
Hard to be precise as I am sticking to plain text in my message. The three words above are meant to be right adjusted.
When solving it horizontally, Jach and I used variants of a brute force approach. Substitute all possible combinations. He did it in-line and used eval. I did it by making lists of items on both sides and summing the int() of each component and comparing. We tried both our algorithms and his was slower and he profiled that the cause was that eval() was much more expensive as were his use of regular expression functions. For comparison, mine used int() and string manipulation functions and sets and dictionaries.
But the real puzzle is meant to be solved in a more vertical way by humans using logic. I won't do that here but note we have 4 equations going down but 8 unknowns. And the equations are not unique.
The rightmost column (I will call it the first as our arithmetic proceeds from right to left) is meant to represent ONES and provides the equation:
(D+E== Y) or (D+E == Y + 10)
Worse, for the next column, you either get a "1" carried from the previous addition or not and either pass a "1" along to the next column or not. 4 Possibilities.
(N+R==E) or (N+R+1==E) or (N+R==E+10) or (N+R+1==E+10)
Getting a headache yet?
For a human, they need a way to come up with additional info in terms of constraints.
There is a guiding inequality that looks like this:
S is not the same as any of the others. Anytime you solve for another, the list of possible values for S shrinks.
Ditto for each other variable.
Or, since anything plus 0 is itself, then D and E adding up to Y (with no possible carry) cannot be 0.
But getting a computer to do this might be a bit harder than blunt-force searches. So let's see why Jach's new algorithm was faster.
The code I am analyzing can be viewed in the archives and will not be entered again:
https://mail.python.org/pipermail/python-list/2018-December/738454.html
So what did Jach try in his newer version? It is what I call a vertical approach but one a computer can do easier than a human can or would. I see it is a very specific algorithm that hard codes in these variables as global entities that are altered by a set of nested functions. S, E, N, D, M, O, R, Y. There are approaches that might be better such as passing a dictionary partially filled out from one function to the next as the only one that prints the solution is the final function call.
So his is not a general solution.
What interests me as a probable reason this is faster is the number of situations it tries. The earlier versions asked itertools.permutations() to provide all unique combinations of ten tokens in eight positions. So there were 10 choices for the first and 9 for the second and so on adding up to 10!/2! or 1,814,400 different combinations tried. That approaches 2 million.
Jack broke the problem down into evaluating the units column with a loop like this:
itertools.permutations(range(10), 3)
That is 720 possibilities. He then doubled that to 1,440 to consider a carry. Only if the selected values for the three variables in contention (plus a carry) does he go on to call to evaluate the tens column.
It then shrinks a bit more as he no longer gets the permutations of all 10 digits. He subtracts the three values that might work for the units, so he is asking for permutations of 7 digits, two at a time. That is 42, doubled again to 84 for carry purposes. And this function is not called 1,440 times, but quite a bit fewer.
So, similarly, of those 84 loops for tens, he only sometimes calls to evaluate hundreds. As mentioned, the set of 10 digits shrinks some more and this continues upward to functions that evaluate hundreds and thousands and finally the one evaluating ten thousands pretty much prints out an answer it finds.
So overall iterations can be shown to drop. We could add code to measure how many times each function is called and come up with an exact value for this built-in problem. I did and the functions were called this many times:
>>> counting
{'unit': 1, 'ten': 72, 'hundred': 290, 'thou': 183, '10thou': 196}
>>> sum(counting.values())
742
But I also see the permutation function was called 542 times. Most of those runs though were fairly trivial as the combined number of items issued back were only 7,390 as compared to nearly two million in the earlier version. Overall, this is much faster.
Range() is probably a highly efficient built-in written in C, but it can totally be eliminated in this code as it is a constant. You can write {1,2,2,3,4,5,6,7,8,9} or [0,1] instead of calculating ranges.
Naturally, this code does not scale up to finding the two solutions for:
HAWAII + IDAHO +IOWA + OHIO = STATES
The horizontal versions solved that easily.
The carry issues get more complex here even if a general solution is written. One approach might be to make a matrix as wide as the widest term and place all entries in it as characters, right justified. You can then work on any number of columns by extracting columns backwards from the right and applying whatever logic, perhaps recursively. The possible carry amounts need to be estimated as no more than about the number of items being added minus one.
But, I am ready to do other things so I leave it as an exercise for the reader 😉
-----Original Message-----
From: Python-list <python-list-bounces+avigross=verizon.net at python.org> On Behalf Of Peter Otten
Sent: Friday, December 14, 2018 3:21 AM
To: python-list at python.org
Subject: Smarter algo, was Re: 03 digression by brute force
jfong at ms4.hinet.net wrote:
> Just for fun:-) On my ooold PC, it takes 0.047 seconds to run the
> following algorithm on the problem 'SNED + MORE == MONEY".
> def tenThousand(u, Cin): # Cin == M
> global n
> if Cin == M:
> print(S, E, N, D, '+', M, O, R, E, '==', M, O, N, E, Y)
> n += 1
And it probably took under two minutes with the brute force approach.
So if you were only interested in the result the code below were efficient only if it took at most two more minutes to write it;)
But seriously, the next step is to generalize this to work with arbitrary additions...
More fun!
--
https://mail.python.org/mailman/listinfo/python-list
More information about the Python-list
mailing list