[Tutor] Use functions re avoid Re: Can the following algorithmbe improved?

Brian van den Broek bvande at po-box.mcgill.ca
Tue Aug 9 23:23:39 CEST 2005

Nathan Pinno said unto the world upon 2005-08-09 15:35:
> Hey all,
> I just fixed it so that it prints. The problem was that it returned the 
> cards instead of printing them.
> Nathan

 >> Hey Brian and all,
 >> Just coded in and tried your program Brian. Doesn't print a thing.

Hi Nathan,

yes, I should have pointed out that I made it return the cards dealt 
rather than printing them and explained why I made it do that. But, I 
just did it out of force of habit, so I didn't notice I'd changed the 
functionality. Sorry about that.

So, why do that? Well, it might not matter too much in this case, but 
the idea is to make each function responsible for one thing. That has 
several advantages. The two main ones IMO: 1) it keeps the functions 
small which aids comprehension and debugging, and 2) it makes the 
functions easier to string together to do other things.

For instance, if what you want right now is for the cards to be 
printed, it is easy enough to use something like my code[*] and then 
have another function which calls the dealing function and prints the 
results. If later you should want to use the results of a deal in a 
more elaborate manner than just printing them (like if you make a card 
game which needs to know what was dealt) you've got the deal returned 
and you can do with it what you want to. If you don't return and just 
print from the sort of code I gave, new uses will probably require 
changing the code rather than just adding new function to work with it.

One approach would say don't add it now; instead do the minimal thing 
that meets your needs because just thinking you will need it later 
doesn't mean you will. But, in the case at hand, the future need is 
pretty likely, and the modularity of functions (one chunk doing the 
generation, the other doing the processing of the results) also speaks 
for doing it the return way.

[*] But don't use my code. I was doing by hand what Roel pointed out 
can be done by the random module itself. (Thanks, Roel! I didn't know 
about random.sample.) Since whoever added sample to random is a better 
programmer than you or I, use sample :-)

>> I thought the odds were more like 6/52! or 1/(1.3443029195*10^84).


>>> Good question. But notice that your code has the same problem. Nothing
>>> stops the (dramatically unlikely) even that a-u all get assigned the 
>>> same
>>> card. Over 6 cards, it will certainly turn up that you have given the 
>>> same
>>> card twice with this code, too. (The way I figure it:
>>> >>> ( (52**6) - (52*51*50*49*48*47.0) ) / (52**6)
>>> 0.25858966166881681
>>> 25% of the time.)

I don't think I got the math wrong. My thinking is there are 52**6 
ways to choose 6 items from 52 if you don't care about duplicates, and 
52*51*50*49*48*47 ways to choose if you do. (The first card can be any 
one of 52, the second any one of the 51 left, etc. 52! by contrast 
gives you all the ways 52 items can be ordered in a non-repeating 

Here is a very quick (and I am sure not too efficient) bit of code to 
give empirical support to my analysis:

import random

def get_choice(n, source):
     '''Returns a sequence of n independent random choices from source'''
     choice = []
     for i in range(n):
     return choice

def has_dups(sequence):
     '''Returns True if sequence contains duplicates, False otherwise'''
     for i in sequence[:-1]:
         # No point checking the last member
         if sequence.count(i) > 1:
             # We can bail, as duplicate found
             return True
     return False

def get_dup_percentage(runs, choice_length, source_length):
     '''Returns percentage of duplicates in a series of choices over a 

     The function make runs many choices of length choice_length from 
a sequence
     of length source_length and returns percentage of those choices that
     contain duplication.'''
     count = 0.0  # 0.0 for type coercion
     dups = 0
     source_sequence = range(source_length)
     while count < runs:
         choice_sequence = get_choice(choice_length, source_sequence)
         if has_dups(choice_sequence):
             dups += 1
         count += 1
     return dups / count * 100

print get_dup_percentage(100000, 6, 52)

Run that and I think you will find the values cluster around 25.85%.

(The code was written to be general. Play around with the values in 
the final [print] line. I get that 9 choices are enough over a 52 
termed sequence to give you more than 50% odds of duplicates.)


Brian vdB

More information about the Tutor mailing list