Programming challenge: wildcard exclusion in cartesian products

wkehowski at cox.net wkehowski at cox.net
Thu Mar 16 12:14:23 CET 2006


The python code below generates a cartesian product subject to any
logical combination of wildcard exclusions. For example, suppose I want
to generate a cartesian product S^n, n>=3, of [a,b,c,d] that excludes
'*a*b*' and '*c*d*a*'. See below for details.

CHALLENGE: generate an equivalent in ruby, lisp, haskell, ocaml, or in
a CAS like maple or mathematica.

#-------------------------------------------------------------------------------
# Short algorithm description
# using function _genAll the program generates
# cartesian product without sets, which match
# some wildcarts
# Sets generation uses recursion ->
# first of all sets will be generated with dimension 1 and than
filtered through wildcarts
# then sets will be generated with dimension 2 and filtered again
# until the required set dimension is reached
# Program avoids explicit generation of some part of CP sets
# if the end of whildcart is asterics (*) and if the first part of
whildcart (without astrics)
# matches current set => then this set will be filtered out and won't
be used in
# higher dimension set generation
# example *,1,*,2,* [1,2] dim = 10
# by dimension 2 only arrays [1,1],[2,1],[2,2] are will be generated
# => array [1,2] won't be used in next recursion levels
#-------------------------------------------------------------------------------
# To obtaine result use function
# CPWithoutWC first parameter is a list of any elements
(char,int,string,class exemplar ,.... any type)
# secont param is CP dimension
# other parameters are wildcarts => lists with any values then may
include
# special value ALL - asterics equivalent
#Example of usage: command line
# >>> import cartesianProduct as cp
# >>> for i in cp.CPWithoutWC([1,2],3,[1,cp.ALL,2]):
#         print i
# [1, 1, 1]
# [1, 2, 1]
# [2, 1, 1]
# [2, 1, 2]
# [2, 2, 1]
# [2, 2, 2]
# >>> for i in
cp.CPWithoutWC(['a','b'],3,['a',cp.ALL,'b'],['b',cp.ALL,'a']):
#         print i
# ['a', 'a', 'a']
# ['a', 'b', 'a']
# ['b', 'a', 'b']
# ['b', 'b', 'b']
# >>> for i in cp.CPWithoutWC([1,2],3,[1,cp.ALL,2],[2,cp.ALL,1]):
#        print i
# [1, 1, 1]
# [1, 2, 1]
# [2, 1, 2]
# [2, 2, 2]
# >>>
# >>> for i in cp.CPWithoutWC([1,2],121212,[1,cp.ALL],[2,cp.ALL,1]):
#        print i
##  execute immediately
# >>>
# if You don't want to print cp. before ALL and CPWithoutWC use import
like this:
# from cartesianProduct import ALL,CPWithoutWC
# CPWithoutWC is a python generator. Which means that it returns values

# immediately and generate next in next cycle.
# Program example
#
## from cartesianProduct import ALL,CPWithoutWC
## def main():
##     for i in
cp.CPWithoutWC(['a','b'],3,['a',cp.ALL,'b'],['b',cp.ALL,'a']):
##         ## do what You want with current value
##         .........
##         ## go back to for statement and generate new
## if __name__ == "__main__":
##     main()
#
"""
 Using logical combinations of WC:
 1) It's possible to pass on to the function CPWithoutWC
   any number of wildcarts after first two parameters, for example:
   CPWithoutWC([1,2],121212,[1,cp.ALL],[2,cp.ALL,1],...)
   where ... - is any other wildcart's additional function parameters.
   Number of additional WC is not limited.
   Function will filter out all combinations, which match any passed on
WC.
   It's equal to WC1 | WC2 | .... , where | is python analog of OR
logical operations.
 2) To use more complex WC combinations follow these steps
   a) First of all create all needed WC
   b) Then use operators |, & and braces () to create combinations
required and then pass it on to function
   CPWithoutWCEx as the third parameter. Don't use "or" and "and"
python statement, otherwise program will
   work improper. First two parameters of this function are the same as
of CPWithoutWC function - set of
   elements and CP dimension. An example of what was described above in
command line:
   >>> from cartesianProduct import ALL,CPWithoutWC,CPWithoutWCEx,WC
   >>> a = WC([ALL,1,ALL])
   >>> b = WC([ALL,2,ALL])
   >>> c = a & b #filter out all sets which match a and b
   >>> for i in CPWithoutWCEx([1,2],3,c) : print i
   [1, 1, 1]
   [2, 2, 2]
   >>> # all sets where both 1 and 2 are present will be filtered out
   >>> d = a | b
   >>> for i in CPWithoutWCEx([1,2],3,d) : print i
   >>> # returns nothing
   >>> for i in CPWithoutWCEx([1,2,3],3,d) : print i
   [3, 3, 3]
   >>> a = WC([2,1,ALL])
   >>> b = WC([1,2,ALL])
   >>> c = WC([ALL,2])
   >>> d = ( a | b ) & c
   >>> for i in CPWithoutWCEx([1,2],3,d) : print i
   [1, 1, 1]
   [1, 1, 2]
   [1, 2, 1]
   [2, 1, 1]
   [2, 2, 1]
   [2, 2, 2]
   >>> # filters out all combinations which start with [1,2] or [2,1]
and end with 2

   Number of WC, which are used to form logical combinations is not
limited.
"""
"""
13.02.2006
    a)Two new function - CPWithoutWCEx_L and CPWithoutWC_L are added.
    Their interface is the same as of CPWithoutWCEx and CPWithoutWC
    accordingly, except that the third parameter is WC list and
    they accept strictly three parameters.

    As You can see these functions are very simple because
    python is quite flexible =>
    >>> def s(x,y): return x * y
    >>> d = [3,2]
    >>> s(*d) ## == s(3,2)
    6

    b)Now WC can take string as parameter, and You can use string
    as parameters of functions CPWithoutWC and CPWithoutWC_L
    instead of WC lists.
      Strings transform into WC according to these rules
      1)If first symbol in the string is
     alphanumeric (a-z or A-Z or 0-9) or '*'
      character the every character of the string will be recognized as
    a distinct set element. Examples:
       "ad*d*" == ['a','d',cp.ALL,'d',cp.ALL]
       "*A*b3*%^('" == [cp.ALL,'A',cp.ALL.'b','3',cp.ALL,'%','(',"'"]
      2)If first character is not (alphanumeric or '*')
      it will be treated as a delimitator. Examples:
       ":a:A:1:*" == ['a','A','1',cp.ALL]
       ":aA1:*"   == ['aA1',cp.ALL]
       it's not necessary to write delimitators around the asterics
      ":aA1*"     == ['aA1',cp.ALL]
      "%aA%1*"    == ['aA','1',cp.ALL]
      3)If all non delimit and non asterics character in elements
     are digits => they will be treated as numbers.Examples:
       "123*"     == [1,2,3,cp.ALL]
       ":12:3*"   == [12,3,cp.ALL]
       but
       ":12:a:3*" == ['12','a','3',cp.ALL]
      Examples of use:
>>> for i in cp.CPWithoutWC(['a','b'],3,'a*b','b*a'):
        print i
['a', 'a', 'a']
['a', 'b', 'a']
['b', 'a', 'b']
['b', 'b', 'b']
>>> for i in cp.CPWithoutWC_L(['a','b'],3,['a*b','b*a']):
        print i
['a', 'a', 'a']
['a', 'b', 'a']
['b', 'a', 'b']
['b', 'b', 'b']
#You can mixe strings and lists for wildcarts
>>> for i in cp.CPWithoutWC_L(['a','b'],3,['a*b',['b',cp.ALL,'a']]):
        print i
['a', 'a', 'a']
['a', 'b', 'a']
['b', 'a', 'b']
['b', 'b', 'b']
>>> for i in cp.CPWithoutWC_L(['abc','xyz'],3,[':abc*xyz']):
      print i
['abc', 'abc', 'abc']
['abc', 'xyz', 'abc']
['xyz', 'abc', 'abc']
['xyz', 'abc', 'xyz']
['xyz', 'xyz', 'abc']
['xyz', 'xyz', 'xyz']
"""
#-------------------------------------------------------------------------------
class ALL(object):pass
#-------------------------------------------------------------------------------
class NO_ONE(object):pass
#-------------------------------------------------------------------------------
class BFunctor(object):
  def __init__(self,func):
    self.func = func
  def __call__(self,*dt,**mp):
    return self.func(*dt,**mp)
  @classmethod
  def OR(cls,x,y):
    return cls(lambda *dt,**mp : x(*dt,**mp) | y(*dt,**mp))
  @classmethod
  def AND(cls,x,y):
    return cls(lambda *dt,**mp : x(*dt,**mp) & y(*dt,**mp))

#-----------------------------------------------------------------------------
  def __or__(self,x):
    return BFunctor.OR(self,x)

#-----------------------------------------------------------------------------
  def __and__(self,x):
    return BFunctor.AND(self,x)
#-------------------------------------------------------------------------------
def _genAll(head,n,WCF,curr):
  if len(curr) != 0 and n != 0:
    for i in curr:
      nhead = head + [i]
      if n != 1 :
        # needed dimension are not reached
        # -> we mast tell WC that some other values
        # may concatenate in the end of nhead in next recursion levels
        # but if WC is ended with asterics (ALL), than dosn't matter
        # so i use special walue NO_ONE to resolve this problem
        # if WC with final asterics like [1,2,3,ALL] are matched nhead
=>
        # they matched nhead + [NO_ONE] to
        # but if WC is like [1,ALL,2,3] => they dont match
[1,2,3,NO_ONE] =>
        # they don't prevent to generate [1,2,3,4] on next recursion
level
        x = WCF(nhead + [NO_ONE],curr)
      else :      x = WCF(nhead,curr)
      if False == x:
        if n == 1 : yield nhead
        else:
          for i in _genAll(nhead,n - 1,WCF,curr):
            yield i
  elif n == 0 :
    yield head
#-------------------------------------------------------------------------------
class WC(object):
  def __init__(self,wc):
    self.wc = wc
    self.transformWC()
    self.num_els = 0
    self.compress()
    self.comphdr = None
    self.findMaxHeader()
    self.ln = len(self.wc)

#-----------------------------------------------------------------------------
  def transformWC(self):
    if self.wc.__class__ not in (str,unicode) : return
    if len(self.wc) == 0 : return
    if self.wc[0].isalnum() or self.wc[0] == "*":
      wc = self.wc
    else:
      wc = self.wc[1:].split(self.wc[0])
    nwc = []
    for i in wc:
      if   i == '*' : nwc.append(ALL)
      elif '*' in i :
        for j in i.split('*'):
          if j : nwc.append(j)
          nwc.append(ALL)
        del nwc[-1]
      else : nwc.append(i)
    #check if all elements are numbers or *
    allnum = True
    for i in nwc:
      if i is ALL : continue
      try : int(i)
      except :
        allnum = False
        break
    if allnum:
      for i,j in enumerate(nwc):
        if j is not ALL:
          nwc[i] = int(j)
    self.wc = nwc

#-----------------------------------------------------------------------------
  def findMaxHeader(self):
    return

#-----------------------------------------------------------------------------
  def compress(self):
    "delete dublicated * values"
    if len(self.wc) == 0 : return
    wc_ = self.wc[:1]
    for i in self.wc[1:]:
      if i == ALL and i == wc_[-1] : continue
      wc_.append(i)
    self.wc = wc_

#-----------------------------------------------------------------------------
  def matchExact(self,hd,pos = 0):
    if pos == len(self.wc) : return len(hd) == 0
    if self.wc[pos] == ALL :
      if pos + 1 == len(self.wc) : return True
      vl = self.wc[pos + 1]
      cpos = -1
      while True:
        try    : cpos = hd.index(vl,cpos + 1)
        except : return False
        if self.matchExact(hd[cpos + 1:],pos + 2) : return True
    else:
      if len(hd) == 0 : return False
      if hd[0] != self.wc[pos] : return False
      return self.matchExact(hd[1:],pos + 1)

#-----------------------------------------------------------------------------
  def __or__(self,x):
    return BFunctor.OR(self,x)

#-----------------------------------------------------------------------------
  def __and__(self,x):
    return BFunctor.AND(self,x)

#-----------------------------------------------------------------------------
  def __call__(self,hd,st):
    return self.matchExact(hd)
#-------------------------------------------------------------------------------
def CPWithoutWCEx(set,n,wc):
  for i in _genAll([],n,wc,set) :
    yield i
#-------------------------------------------------------------------------------
def CPWithoutWC(set,n,*dt):
  if len(dt) == 0 :
    wc = lambda hd,st : True
  else:
    wc = WC(dt[0])
    #print wc.wc
    for i in dt[1:]:
      wc = wc | WC(i)
  for i in _genAll([],n,wc,set) :
    yield i
#-------------------------------------------------------------------------------
def CPWithoutWC_L(set,n,WCs):
  for i in CPWithoutWC(set,n,*WCs):
    yield i
#-------------------------------------------------------------------------------
def CPWithoutWCEx_L(set,n,WCs):
  for i in CPWithoutWCEx(set,n,*WCs):
    yield i
#-------------------------------------------------------------------------------
def main():
  for i in CPWithoutWC_L(['abc','xyz'],3,[':abc*xyz']):
    print i
#-------------------------------------------------------------------------------
if __name__ == "__main__" : main()
#-------------------------------------------------------------------------------




More information about the Python-list mailing list