[Python-Dev] Re: A `cogen' module [was: Re: PEP 218 (sets); moving set.py to Lib]
Tim Peters
tim.one@comcast.net
Wed, 28 Aug 2002 15:36:57 -0400
[Guido]
> I'm not saying that it looks good overall -- I'd like to defer to Tim,
> who has used and written this kind of utilities for real, and who
> probably has a lot of useful feedback. Right now, he's felled by some
> kind of illness though.
I think that's over! I'm very tired, though (couldn't get to sleep until
11, and woke up at 2 with the last, umm, episode <wink>).
This is a Big Project if done right. I volunteered time for it a few years
ago, but there wasn't enough interest then to keep it going. I'll attach
the last publicly-distributed module I had then, solely devoted to
combinations. It was meant to be the first in a series, all following some
basic design decisions:
+ A Basic class that doesn't compromise on speed, typically by working
on canonical representatives in Python list-of-int form.
+ A more general class that deals with arbitrary sequences, perhaps
at great loss of efficiency.
+ Multiple iterators are important: lex order is needed sometimes;
Gray code order is an enormous help sometimes; random generation is
vital sometimes.
+ State-of-the-art algorithms. That's a burden for anything that
goes into the core -- if it's a toy algorithm, users can
do just as well on their own, and then people submit patch after
patch that the original author isn't really qualified to judge
(else they would have done a state-of-the-art thing to begin with).
+ The ability to override the random number generator. Python's
default WH generator is showing its age as machines get faster;
it's simply not adequate anymore for long-running programs making
heavy use of it on a fast box. Combinatorial algorithms in
particular do tend to make heavy use of it. (Speaking of which,
"someone" should look into grabbing one of the Mersenne Twister
extensions for Python -- that's the current state of *that* art).
Ideas not worth taking:
+ Leave the chi-square algorithm out of it. A better implementation
would be nice to have in a statistics package, but it doesn't
belong here regardless.
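To make the intended shape concrete, here is a minimal sketch of the one-class-several-orders idea in today's Python (itertools.combinations and random.sample postdate this mail, and the class name is invented for illustration):

```python
import itertools
import random

class Combinations:
    """Toy sketch: one object, multiple generation orders over k-subsets of s."""

    def __init__(self, s, k):
        self.s, self.k = s, k

    def iterlex(self):
        # Lexicographic index order.
        return itertools.combinations(self.s, self.k)

    def getrand(self, rng=random):
        # One k-combination chosen uniformly at random; rng is overridable,
        # per the "ability to override the random number generator" point.
        idx = sorted(rng.sample(range(len(self.s)), self.k))
        return tuple(self.s[i] for i in idx)
```

Passing `rng=random.Random(seed)` would swap in an independent generator without touching the class.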
me-i'm-going-back-to-sleep-ly y'rs - tim
[Attachment: combgen.py]
# Module combgen version 0.9.1
# Released to the public domain 18-Dec-1999,
# by Tim Peters (tim_one@email.msn.com).
# Provided as-is; use at your own risk; no warranty; no promises; enjoy!
"""\
CombGen(s, k) supplies methods for generating k-combinations from s.
CombGenBasic(n, k) acts like CombGen(range(n), k) but is more
efficient.
s is of any sequence type such that s supports catenation (s1 + s2)
and slicing (s[i:j]). For example, s can be a list, tuple or string.
k is an integer such that 0 <= k <= len(s).
A k-combination of s is a subsequence C of s where len(C) = k, and for
some k integers i_0, i_1, ..., i_km1 (km1 = k-1) with
0 <= i_0 < i_1 < ... < i_km1 < len(s),
C[0] is s[i_0]
C[1] is s[i_1]
...
C[k-1] is s[i_km1]
Note that each k-combination is a sequence of the same type as s.
Different methods generate k-combinations in lexicographic index
order, a particular "Gray code" order, or at random.
The .reset() method can be used to start over.
The .set_start(ivector) method can be used to force generation to
begin at a particular combination.
Module function comb(n, k) returns the number of combinations of n
things taken k at a time; n >= k >= 0 required.
CAUTIONS
+ The CombGen constructor saves a reference to (not a copy of) s, so
don't mutate s after calling CombGen.
+ For efficiency, CombGenBasic getlex and getgray return the *same*
list each time, mutating it in place. You must not mutate this
list; and, if you want to save a combination's value across calls,
copy the list. For example,
>>> g = CombGenBasic(2, 1)
>>> x = g.getlex(); y = g.getlex()
>>> x is y # the same!
1
>>> x, y # so these print the same thing
([1], [1])
>>> g.reset()
>>> x = g.getlex()[:]; y = g.getlex()[:]
>>> x, y # copies work as expected
([0], [1])
>>>
In contrast, CombGen methods return a new sequence each time -- but
they're slower.
GETLEX -- LEXICOGRAPHIC GENERATION
Each invocation of .getlex() returns a new k-combination of s. The
combinations are generated in lexicographic index order (for
CombGenBasic, the k-combinations themselves are in lexicographic
order). That is, the first k-combination consists of
s[0], s[1], ..., s[k-1]
in that order; the next of
s[0], s[1], ..., s[k]
and so on until reaching
s[len(s)-k], s[len(s)-k+1], ..., s[len(s)-1]
After all k-combinations have been generated, .getlex() returns None.
Examples:
>>> g = CombGen("abc", 0).getlex
>>> g(), g()
('', None)
>>> g = CombGen("abc", 1).getlex
>>> g(), g(), g(), g()
('a', 'b', 'c', None)
>>> g = CombGenBasic(3, 2).getlex
>>> print g(), g(), g(), g()
[0, 1] [0, 2] [1, 2] None
>>> g = CombGen((0, 1, 2), 3).getlex
>>> print g(), g(), g()
(0, 1, 2) None None
>>> p = CombGenBasic(4, 2)
>>> g = p.getlex
>>> print g(), g(), g(), g(), g(), g(), g(), g()
[0, 1] [0, 2] [0, 3] [1, 2] [1, 3] [2, 3] None None
>>> p.reset()
>>> print g(), g(), g(), g(), g(), g(), g(), g()
[0, 1] [0, 2] [0, 3] [1, 2] [1, 3] [2, 3] None None
>>>
GETGRAY -- GRAY CODE GENERATION
Each invocation of .getgray() returns a triple
C, tossed, added
where
C is the next k-combination of s
tossed is the element of s removed from the last k-combination
added is the element of s added to the last k-combination
tossed and added are None for the first call.
Consecutive combinations returned by .getgray() differ by two elements
(one removed, one added). If you invoke getgray() more than comb(n,k)
times, it "wraps around" and generates the same sequence again. Note
that the last combination in the return sequence also differs by two
elements from the first combination in the return sequence.
Gray code ordering can be very useful when you're computing an
expensive function on each combination: that exactly one element is
added and exactly one removed can often be exploited to save
recomputation for the k-1 common elements.
>>> o = CombGen("abcd", 2)
>>> for i in range(7): # note that this wraps around
... print o.getgray()
('ab', None, None)
('bd', 'a', 'd')
('bc', 'd', 'c')
('cd', 'b', 'd')
('ad', 'c', 'a')
('ac', 'd', 'c')
('ab', 'c', 'b')
>>>
GETRAND -- RANDOM GENERATION
Each invocation of .getrand() returns a random k-combination.
>>> o = CombGenBasic(1000, 6)
>>> import random
>>> random.seed(87654)
>>> o.getrand()
[69, 223, 437, 573, 722, 778]
>>> o.getrand()
[409, 542, 666, 703, 732, 847]
>>> CombGenBasic(1000000, 4).getrand()
[199449, 439831, 606885, 874530]
>>>
"""
# 0,0,1 09-Dec-1999
# initial version
# 0,0,2 10-Dec-1999
# Sped CombGenBasic.{getlex, getgray} substantially by no longer
# making copies of the indices; getgray is truly O(1) now.
# A bad aspect is that they return the same list object each time
# now, which can be confusing; e.g., had to change some examples.
# Use CombGen instead if this bothers you -- CombGenBasic's
# purpose in life is to be lean & mean.
# Removed the restriction on mixing calls to CombGenBasic's
# getlex and getgray; not sure it's useful, but it was irksome.
# Changed __findj to return a simpler result. This is less useful
# for getgray, but now getlex can exploit it too (there are no
# longer any Python-level loops in CombGenBasic's getlex; there's
# an implied C-level loop (via "range"), and it's in the nature of
# lex order that this can't be removed).
# Added some exhaustive tests for getlex, and finger verification.
# 0,9,1 18-Dec-1999
# Changed _testrand to compute and print chi-square statistics,
# and probabilities, because one of _testrand's outputs didn't
# "look random" to me. Indeed, it's got a poor chi-square value!
# But sometimes that *should* happen, and it does not appear to
# be happening more often than expected.
__version__ = 0, 9, 1
def _chop(n):
"""n -> int if it fits, else long."""
try:
return int(n)
except OverflowError:
return n
def comb(n, k):
"""n, k -> number of combinations of n items, k at a time.
n >= k >= 0 required.
>>> for i in range(7):
... print "comb(6, %d) ==" % i, comb(6, i)
comb(6, 0) == 1
comb(6, 1) == 6
comb(6, 2) == 15
comb(6, 3) == 20
comb(6, 4) == 15
comb(6, 5) == 6
comb(6, 6) == 1
>>> comb(52, 5) # number of poker hands
2598960
>>> comb(52, 13) # number of bridge hands
635013559600L
"""
if not n >= k >= 0:
raise ValueError("n >= k >= 0 required: " + `n, k`)
if k > (n >> 1):
k = n-k
if k == 0:
return 1
result = long(n)
i = 2
n, k = n-1, k-1
while k:
# assert (result * n) % i == 0
result = result * n / i
i = i+1
k = k-1
n = n-1
return _chop(result)
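# In modern Python, the same running-product trick reads more directly.
# The following hedged sketch (the name ncomb is made up, and math.comb,
# Python 3.8+, is imported only as a cross-check) shows why the division
# is always exact, which is the invariant the assert comment above relies on:

```python
from math import comb as math_comb  # cross-check only; Python 3.8+

def ncomb(n, k):
    """Number of combinations of n items, k at a time, via a running product.

    After i loop iterations, result == C(n, i), itself an integer, so the
    floor division below is always exact.
    """
    if not n >= k >= 0:
        raise ValueError("n >= k >= 0 required: %r" % ((n, k),))
    k = min(k, n - k)  # symmetry: C(n, k) == C(n, n-k)
    result = 1
    for i in range(1, k + 1):
        result = result * (n - i + 1) // i
    return result
```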
import random
class CombGenBasic:
def __init__(self, n, k):
self.n, self.k = n, k
if not n >= k >= 0:
raise ValueError("n >= k >= 0 required:" + `n, k`)
self.reset()
def reset(self):
"""Restore state to that immediately after construction."""
# The first result is the same for either lexicographic or
# Gray code generation.
self.set_start(range(self.k))
# __findj is used only to initialize self.j for getlex and
# getgray. It returns the largest j such that slot j has
# "breathing room"; that is, such that slot j isn't at its largest
# possible value (n-k+j). j is -1 if no such index exists.
# After initialization, getlex and getgray incrementally update
# this more efficiently.
def __findj(self, v):
n, k = self.n, self.k
assert len(v) == k
j = k-1
while j >= 0 and v[j] == n-k+j:
# v[j] is at its largest possible value
j = j-1
return j
def getlex(self):
"""Return next (in lexicographic order) k-combination.
Return None if all possibilities have been generated.
Caution: getlex returns the *same* list each time, mutated
in place. Don't mutate it yourself, or save a reference to it
(the next call will mutate its contents; make a copy if you
need to save the value across calls).
"""
indices, n, k, j = self.indices, self.n, self.k, self.j
if self.firstcall:
self.firstcall = 0
return indices
if j < 0:
return None
new = indices[j] = indices[j] + 1
if j+1 == k:
if new + 1 == n:
j = j-1
else:
if new + 1 < indices[j+1]:
indices[j:] = range(new, new + k - j)
j = k-1
else:
j = j-1
self.j = j
# assert j == self.__findj(indices)
return indices
def getgray(self):
"""Return next (c, tossed, added) triple.
c is the next k-combination in a particular Gray code order.
tossed is the element of range(n) removed from the last
combination.
added is the element of range(n) added to the last
combination.
tossed and added are None if this is the first call, or on
every call if there is only one k-combination. Else tossed !=
added, and neither is None.
Caution: getgray wraps around if you invoke it more than
comb(n, k) times.
Caution: getgray returns the *same* list each time, mutated
in place. Don't mutate it yourself, or save a reference to it
(the next call will mutate its contents; make a copy if you
need to save the value across calls).
"""
# The popular routine in Nijenhuis & Wilf's "Combinatorial
# Algorithms" is exceedingly complicated (although trivial
# to program with recursive generators!).
#
# Instead I'm using a variation of Algorithm A3 from the paper
# "Loopless Gray Code Algorithms", by T.A. Jenkyns (Brock
# University, Ontario). The code is much simpler, and,
# because it's loop-free, takes O(1) time on each call (not
# just amortized over the whole sequence).
#
# Because the paper doesn't yet seem to be well known, here's
# the idea: Modify the definition of lexicographic ordering
# in a funky way: in the element comparisons, replace "<" by
# ">" in every other element position starting at the 2nd.
# IOW, and skipping end cases, sequence s is "less than"
# sequence t iff their elements are equal up until index
# position i, and then s[i] < t[i] if i is even, or s[i] >
# t[i] if i is odd. Jenkyns calls this "alternating
# lexicographic" order. It's clear that this defines a total
# ordering. What isn't obvious is that it's also a Gray code
# ordering! Very pretty.
#
# Modifications made here to A3 are minor, and include
# switching from 1-based to 0-based; allowing for trivial
# sequences; allowing for wrap-around; returning the "tossed"
# and "added" elements; starting the generation at an
# arbitrary k-combination; and sharing a finger (self.j) with
# the getlex method.
indices, n, k, j = self.indices, self.n, self.k, self.j
if self.firstcall:
self.firstcall = 0
return indices, None, None
# Slide over to first slot that *may* be able to move down.
# Note that this leaves odd j alone (including -1!), and may
# make j equal to k.
j = j | 1
if j == k:
# k is odd and so indices[-1] "wants to move up", and
# indices[-1] < n-1 so it *can* move up.
tossed = indices[-1]
added = indices[-1] = tossed + 1
j = j-1
if added == n-1:
j = j-1
elif j < 0:
# indices has the last value in alt-lex order, e.g.
# [4, 5, 6, 7]; wrap around to the first value, e.g.
# [0, 5, 6, 7].
assert indices == range(n-k, n)
if k and indices[0]:
tossed = indices[0]
added = indices[0] = 0
j = 0
else:
# comb(n, k) is 1 -- this is a trivial sequence.
tossed = added = None
else:
# 0 < j < k (note that 0 < j because j is odd).
# Want to move this slot down (again because j is odd).
atj = indices[j]
if indices[j-1] + 1 == atj:
# can't move it down; move preceding up
tossed = atj - 1 # the value in indices[j-1]
indices[j-1] = atj
added = indices[j] = n-k+j
j = j-1
if atj + 1 == added:
j = j-1
else:
# can move it down
tossed = atj
added = indices[j] = atj - 1
if j+1 < k:
tossed = indices[j+1]
indices[j+1] = atj
j = j+1
self.j = j
# assert j == self.__findj(indices)
return indices, tossed, added
def set_start(self, start):
"""Force .getlex() or .getgray() to start at given value.
start is a vector of k unique integers in range(n), where
k and n were passed to the CombGenBasic constructor.
The vector is sorted in increasing order, and is used as the
next k-combination to be returned by .getlex() or
.getgray().
>>> gen = CombGenBasic(3, 2)
>>> for i in range(4):
... print gen.getgray()
([0, 1], None, None)
([1, 2], 0, 2)
([0, 2], 1, 0)
([0, 1], 2, 1)
>>> gen.set_start([0, 2])
>>> for i in range(4):
... print gen.getgray()
([0, 2], None, None)
([0, 1], 2, 1)
([1, 2], 0, 2)
([0, 2], 1, 0)
"""
if len(start) != self.k:
raise ValueError("start vector not of length " + `self.k`)
indices = start[:]
indices.sort()
seen = {}
# Verify the vector makes sense.
for i in indices:
if not 0 <= i < self.n:
raise ValueError("start vector contains element "
"not in 0.." + `self.n-1` +
": " + `i`)
if seen.has_key(i):
raise ValueError("start vector contains duplicate "
"element: " + `i`)
seen[i] = 1
self.indices = indices
self.j = self.__findj(indices)
self.firstcall = 1
def getrand(self, random=random.random):
"""Return a k-combination at random.
Optional arg random specifies a no-argument function that
returns a random float in [0., 1.). By default, random.random
is used.
"""
# The trap to avoid is doing O(n) work when k is much less
# than n. Letting m = min(k, n-k), we actually do Python work
# of O(m), and C-level work of O(m log m) for a sort. In
# addition, O(k) work is required to build the final result,
# but at worst O(m) of that work is done at Python speed.
n, k = self.n, self.k
complement = 0
if k > n/2:
# Generate the values *not* in the combination.
complement = 1
k = n-k
# Generate k distinct random values.
result = {}
for i in xrange(k):
# The expected # of times thru the next loop is n/(n-i).
# Since i < k <= n/2, n-i > n/2, so n/(n-i) < 2 and is
# usually closer to 1: on average, this succeeds very
# quickly!
while 1:
candidate = int(random() * n)
if not result.has_key(candidate):
result[candidate] = 1
break
result = result.keys()
result.sort()
if complement:
# We want everything in range(n) that's *not* in result.
avoid = result
avoid.append(n)
result = []
start = 0
for limit in avoid:
result.extend(range(start, limit))
start = limit + 1
return result
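# The complement-and-reject strategy getrand uses can be sketched standalone
# in modern Python (the name rand_comb is hypothetical; today random.sample
# handles this kind of selection for you):

```python
import random

def rand_comb(n, k, rng=random.random):
    """Random sorted k-subset of range(n), drawing only min(k, n-k) values.

    Mirrors getrand's strategy: when k > n/2, draw the n-k values to
    *exclude* instead, so the rejection loop stays fast (fewer than half
    the values are ever taken, giving an expected < 2 tries per draw).
    """
    complement = k > n // 2
    m = n - k if complement else k
    chosen = set()
    while len(chosen) < m:
        chosen.add(int(rng() * n))  # rejection sampling: retry on duplicates
    if complement:
        chosen = set(range(n)) - chosen
    return sorted(chosen)
```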
class CombGen:
def __init__(self, seq, k):
n = len(seq)
if not 0 <= k <= n:
raise ValueError("k must be in 0.." + `n` + ": " + `k`)
self.seq = seq
self.base = CombGenBasic(n, k)
def reset(self):
"""Restore state to that immediately after construction."""
self.base.reset()
def getlex(self):
"""Return next (in lexicographic index order) k-combination.
Return None if all possibilities have been generated.
"""
indices = self.base.getlex()
if indices is None:
return None
else:
return self.__indices2seq(indices)
def getgray(self):
"""Return next (c, tossed, added) triple.
c is the next k-combination in a particular Gray code order.
tossed is the element of s removed from the last combination.
added is the element of s added to the last combination.
Caution: getgray wraps around if you invoke it more than
comb(len(s), k) times.
"""
indices, tossed, added = self.base.getgray()
if tossed is None:
return (self.__indices2seq(indices), None, None)
else:
return (self.__indices2seq(indices),
self.seq[tossed],
self.seq[added])
def set_start(self, start):
"""Force .getlex() or .getgray() to start at given value.
start is a vector of k unique integers in range(len(s)), where
k and s were passed to the CombGen constructor.
The vector is sorted in increasing order, and is used as a
vector of indices (into s) for the next k-combination to be
returned by .getlex() or .getgray().
>>> gen = CombGen("abc", 2)
>>> for i in range(4):
... print gen.getgray()
('ab', None, None)
('bc', 'a', 'c')
('ac', 'b', 'a')
('ab', 'c', 'b')
>>> gen.set_start([0, 2]) # start with "ac"
>>> for i in range(4):
... print gen.getgray()
('ac', None, None)
('ab', 'c', 'b')
('bc', 'a', 'c')
('ac', 'b', 'a')
>>> gen.set_start([0, 2]) # ditto
>>> print gen.getlex(), gen.getlex(), gen.getlex()
ac bc None
"""
self.base.set_start(start)
def getrand(self, random=random.random):
"""Return a k-combination at random.
Optional arg random specifies a no-argument function that
returns a random float in [0., 1.). By default, random.random
is used.
"""
return self.__indices2seq(self.base.getrand(random))
def __indices2seq(self, ivec):
assert len(ivec) == self.base.k, "else internal error"
seq = self.seq
result = seq[0:0] # an empty sequence of the proper type
for i in ivec:
result = result + seq[i:i+1]
return result
del random
#####################################################################
# Testing.
#####################################################################
def _verifycomb(n, k, comb, inbase, baseobj=None):
if len(comb) != k:
print "OUCH!", comb, "should have length", k
# verify it's an increasing sequence of baseseq elements
lastelt = None
for elt in comb:
if not inbase(elt):
print "OUCH!", elt, "not in base sequence", n, k, comb
if not lastelt < elt:
print "OUCH!", elt, ">=", lastelt, n, k, comb
lastelt = elt
if baseobj:
# verify search finger is correct
cachedj = baseobj.j
truej = baseobj._CombGenBasic__findj(baseobj.indices)
if cachedj != truej:
print "OUCH! cached j", cachedj, "!= true j", truej, \
n, k, comb
def _testnk_gray(n, k):
start = "abcdefghijklmnopqrstuvwxyz"[:n]
def inbase(elt, start=start):
return elt in start
o = CombGen(start, k)
c = comb(n, k)
seen = {}
last, lastlist = None, None
for i in xrange(c+1):
this, tossed, added = o.getgray()
_verifycomb(n, k, this, inbase, o.base)
if seen.has_key(this) and i < c:
print "OUCH!", this, "seen before at", seen[this], n, k
seen[this] = i
thislist = list(this)
if (tossed is None) != (added is None):
print "OUCH! tossed and added None clash", tossed, \
added, n, k, last, this
if last is None:
last, lastlist = this, thislist
continue
if tossed is not None:
if tossed == added:
print "OUCH! tossed == added", tossed, added, \
n, k, last, this
lastlist.remove(tossed)
lastlist.append(added)
lastlist.sort()
elif c != 1:
print "OUCH! tossed None but comb(n, k) not 1", \
c, tossed, added, n, k, last, this
if lastlist != thislist:
print "OUCH! does not compute", n, k, tossed, added, \
last, this
last, lastlist = this, thislist
if last != start[:k]:
print "OUCH! didn't wrap around", n, k, last, this
# getgray is especially delicate, so hammer on it.
def _testgray():
"""
>>> _testgray()
testing getgray 0
testing getgray 1
testing getgray 2
testing getgray 3
testing getgray 4
testing getgray 5
testing getgray 6
testing getgray 7
testing getgray 8
testing getgray 9
testing getgray 10
testing getgray 11
testing getgray 12
"""
for n in range(13):
print "testing getgray", n
for k in range(n+1):
_testnk_gray(n, k)
# getlex is easier.
def _testnk_lex(n, k):
start = "abcdefghijklmnopqrstuvwxyz"[:n]
def inbase(elt, start=start):
return elt in start
o = CombGen(start, k)
c = comb(n, k)
last = None
for i in xrange(c):
this = o.getlex()
_verifycomb(n, k, this, inbase, o.base)
if not last < this:
print "OUCH! not lexicographic", last, this, n, k
last = this
this = o.getlex()
if this is not None:
print "OUCH! should have returned None", n, k, this
def _testlex():
"""
>>> _testlex()
testing getlex 0
testing getlex 1
testing getlex 2
testing getlex 3
testing getlex 4
testing getlex 5
testing getlex 6
testing getlex 7
testing getlex 8
"""
for n in range(9):
print "testing getlex", n
for k in range(n+1):
_testnk_lex(n, k)
import math
_math = math
del math
# This is a half-assed implementation, prone to overflow and/or
# underflow given "large" x or v. If they're both <= a few hundred,
# though, it's quite accurate. The main advantage is that it's
# self-contained.
def _chi_square_distrib(x, v):
"""x, v -> return probability that chi-square statistic <= x.
v is the number of degrees of freedom, an integer >= 1.
x is a non-negative float or int.
"""
if x < 0:
raise ValueError("x must be >= 0: " + `x`)
if v < 1:
raise ValueError("v must be >= 1: " + `v`)
if v != int(v):
raise TypeError("v must be an integer: " + `v`)
if x == 0:
return 0.0
# (x/2)**(v/2) / gamma((v+2)/2) * exp(-x/2) *
# (1 + sum(i=1 to inf, x**i/prod(j=1 to i, v+2*j)))
# Alas, for even moderately large x or v, this is numerically
# intractable. But the mean of the distribution is v, so in
# practice v will likely be "close to" x. Rewrite the first
# line as
# (x/2/e)**(v/2) / gamma((v+2)/2) * exp(v/2-x/2)
# Now exp is much less likely to over or underflow. The power is
# still a problem, though, so we compute
# (x/2/e)**(v/2) / gamma((v+2)/2)
# via repeated multiplication.
x = float(x)
a = x / 2 / _math.exp(1)
v = float(v)
v2 = v/2
if int(v2) * 2 == v:
# v is even
base = 1.0
i = 1.0
else:
# v is odd, so the gamma bottoms out at gamma(.5) = sqrt(pi),
# and we need to get a sqrt(a) factor into the numerator
# (since v2 "ends with" .5).
base = 1.0 / _math.sqrt(a * _math.pi)
i = 0.5
while i <= v2:
base = base * (a / i)
i = i + 1.0
base = base * _math.exp(v2 - x/2)
# Now do the infinite sum.
oldsum = None
sum = base
while oldsum != sum:
oldsum = sum
v = v + 2.0
base = base * (x / v)
sum = sum + base
return sum
def _chisq(observed, expected):
n = len(observed)
assert n == len(expected)
sum = 0.0
for i in range(n):
e = float(expected[i])
sum = sum + (observed[i] - e)**2 / e
return sum, _chi_square_distrib(sum, n-1)
def _testrand():
"""
>>> _testrand()
random 0 combs of abcde
100
random 1 combs of abcde
a 99
b 106
c 98
d 99
e 98
probability[chisq <= 0.46] = 0.0227
random 2 combs of abcde
ab 100
ac 115
ad 111
ae 98
bc 98
bd 103
be 95
cd 84
ce 100
de 96
probability[chisq <= 6.6] = 0.321
random 3 combs of abcde
abc 83
abd 119
abe 86
acd 88
ace 103
ade 94
bcd 107
bce 101
bde 112
cde 107
probability[chisq <= 12.78] = 0.827
random 4 combs of abcde
abcd 86
abce 99
abde 113
acde 101
bcde 101
probability[chisq <= 3.68] = 0.549
random 5 combs of abcde
abcde 100
"""
def drive(s, k):
print "random", k, "combs of", s
o = CombGen(s, k)
g = o.getrand
n = len(s)
def inbase(elt, s=s):
return elt in s
count = {}
c = comb(len(s), k)
for i in xrange(100 * c):
x = g()
_verifycomb(n, k, x, inbase)
count[x] = count.get(x, 0) + 1
items = count.items()
items.sort()
for x, i in items:
print x, i
if c > 1:
observed = count.values()
if len(observed) < c:
observed.extend([0] * (c - len(observed)))
x, p = _chisq(observed, [100]*c)
print "probability[chisq <= %g] = %.3g" % (x, p)
for k in range(6):
drive("abcde", k)
__test__ = {"_testgray": _testgray,
"_testlex": _testlex,
"_testrand": _testrand}
def _test():
import doctest, combgen
doctest.testmod(combgen)
if __name__ == "__main__":
_test()