[Tutor] Use functions re avoid Re: Can the following algorithmbe improved?
Brian van den Broek
bvande at po-box.mcgill.ca
Tue Aug 9 23:23:39 CEST 2005
Nathan Pinno said unto the world upon 2005-08-09 15:35:
> Hey all,
>
> I just fixed it so that it prints. The problem was that it returned the
> cards instead of printing them.
>
> Nathan
>
>> Hey Brian and all,
>>
>> Just coded in and tried your program Brian. Doesn't print a thing.
>>
Hi Nathan,
yes, I should have pointed out that I made it return the cards dealt
rather than printing them and explained why I made it do that. But, I
just did it out of force of habit, so I didn't notice I'd changed the
functionality. Sorry about that.
So, why do that? Well, it might not matter too much in this case, but
the idea is to make each function responsible for one thing. That has
several advantages. The two main ones IMO: 1) it keeps the functions
small which aids comprehension and debugging, and 2) it makes the
functions easier to string together to do other things.
For instance, if what you want right now is for the cards to be
printed, it is easy enough to use something like my code[*] and then
have another function which calls the dealing function and prints the
results. If later you should want to use the results of a deal in a
more elaborate manner than just printing them (like if you make a card
game which needs to know what was dealt) you've got the deal returned
and you can do with it what you want to. If you don't return and just
print from the sort of code I gave, new uses will probably require
changing the code rather than just adding new function to work with it.
One approach would say don't add it now; instead do the minimal thing
that meets your needs because just thinking you will need it later
doesn't mean you will. But, in the case at hand, the future need is
pretty likely, and the modularity of functions (one chunk doing the
generation, the other doing the processing of the results) also speaks
for doing it the return way.
[*] But don't use my code. I was doing by hand what Roel pointed out
can be done by the random module itself. (Thanks, Roel! I didn't know
about random.sample.) Since whoever added sample to random is a better
programmer than you or I, use sample :-)
>> I thought the odds were more like 6/52! or 1/(1.3443029195*10^84).
<snip>
>>> Good question. But notice that your code has the same problem. Nothing
>>> stops the (dramatically unlikely) even that a-u all get assigned the
>>> same
>>> card. Over 6 cards, it will certainly turn up that you have given the
>>> same
>>> card twice with this code, too. (The way I figure it:
>>>
>>> >>> ( (52**6) - (52*51*50*49*48*47.0) ) / (52**6)
>>> 0.25858966166881681
>>>
>>> 25% of the time.)
I don't think I got the math wrong. My thinking is there are 52**6
ways to choose 6 items from 52 if you don't care about duplicates, and
52*51*50*49*48*47 ways to choose if you do. (The first card can be any
one of 52, the second any one of the 51 left, etc. 52! by contrast
gives you all the ways 52 items can be ordered in a non-repeating
sequence.)
Here is a very quick (and I am sure not too efficient) bit of code to
give empirical support to my analysis:
<code>
import random
def get_choice(n, source):
'''Returns a sequence of n independent random choices from source'''
choice = []
for i in range(n):
choice.append(random.choice(source))
return choice
def has_dups(sequence):
'''Returns True if sequence contains duplicates, False otherwise'''
for i in sequence[:-1]:
# No point checking the last member
if sequence.count(i) > 1:
# We can bail, as duplicate found
return True
return False
def get_dup_percentage(runs, choice_length, source_length):
'''Returns percentage of duplicates in a series of choices over a
sequence.
The function make runs many choices of length choice_length from
a sequence
of length source_length and returns percentage of those choices that
contain duplication.'''
count = 0.0 # 0.0 for type coercion
dups = 0
source_sequence = range(source_length)
while count < runs:
choice_sequence = get_choice(choice_length, source_sequence)
if has_dups(choice_sequence):
dups += 1
count += 1
return dups / count * 100
print get_dup_percentage(100000, 6, 52)
</code>
Run that and I think you will find the values cluster around 25.85%.
(The code was written to be general. Play around with the values in
the final [print] line. I get that 9 choices are enough over a 52
termed sequence to give you more than 50% odds of duplicates.)
Best,
Brian vdB
More information about the Tutor
mailing list