Programming challenge: wildcard exclusion in cartesian products

Thu Mar 16 17:41:14 EST 2006

in k:

cp:{[c;n;p]+(n#c)_vs(!_ c^n)_dvl,/{2_sv+(,/,/:\:)/(),/:@[x;&x=-1;:[;!c]]}'p}

examples:

  cp[2;3;,0 -1 1]
(0 0 0
 0 1 0
 1 0 0
 1 0 1
 1 1 0
 1 1 1)

  cp[2;3;(0 -1 1;1 -1 0)]
(0 0 0
 0 1 0
 1 0 1
 1 1 1)

  cp[2;3;(0 -1 1;1 -1 1)]
(0 0 0
 0 1 0
 1 0 0
 1 1 0)

arguments of cp:

c = cardinality of the input set
n = power
p = list of patterns (-1 = wildcard)

the algorithm directly computes the target set.  in other words,
it does not generate the set, then filter the matches from the
target.

modifying cp to accept s instead of the cardinality of s,
patterns expressed in terms of elements of s, &c. adds nothing
of interest to the problem.

<wkehowski at cox.net> wrote in message news:1142507663.796075.212430 at v46g2000cwv.googlegroups.com...
> The python code below generates a cartesian product subject to any
> logical combination of wildcard exclusions. For example, suppose I want
> to generate a cartesian product S^n, n>=3, of [a,b,c,d] that excludes
> '*a*b*' and '*c*d*a*'. See below for details.
>
> CHALLENGE: generate an equivalent in ruby, lisp, haskell, ocaml, or in
> a CAS like maple or mathematica.
>
> #-------------------------------------------------------------------------------
> # Short algorithm description
> # using function _genAll the program generates
> # cartesian product without sets, which match
> # some wildcarts
> # Sets generation uses recursion ->
> # first of all sets will be generated with dimension 1 and than
> filtered through wildcarts
> # then sets will be generated with dimension 2 and filtered again
> # until the required set dimension is reached
> # Program avoids explicit generation of some part of CP sets
> # if the end of whildcart is asterics (*) and if the first part of
> whildcart (without astrics)
> # matches current set => then this set will be filtered out and won't
> be used in
> # higher dimension set generation
> # example *,1,*,2,* [1,2] dim = 10
> # by dimension 2 only arrays [1,1],[2,1],[2,2] are will be generated
> # => array [1,2] won't be used in next recursion levels
> #-------------------------------------------------------------------------------
> # To obtaine result use function
> # CPWithoutWC first parameter is a list of any elements
> (char,int,string,class exemplar ,.... any type)
> # secont param is CP dimension
> # other parameters are wildcarts => lists with any values then may
> include
> # special value ALL - asterics equivalent
> #Example of usage: command line
> # >>> import cartesianProduct as cp
> # >>> for i in cp.CPWithoutWC([1,2],3,[1,cp.ALL,2]):
> #         print i
> # [1, 1, 1]
> # [1, 2, 1]
> # [2, 1, 1]
> # [2, 1, 2]
> # [2, 2, 1]
> # [2, 2, 2]
> # >>> for i in
> cp.CPWithoutWC(['a','b'],3,['a',cp.ALL,'b'],['b',cp.ALL,'a']):
> #         print i
> # ['a', 'a', 'a']
> # ['a', 'b', 'a']
> # ['b', 'a', 'b']
> # ['b', 'b', 'b']
> # >>> for i in cp.CPWithoutWC([1,2],3,[1,cp.ALL,2],[2,cp.ALL,1]):
> #        print i
> # [1, 1, 1]
> # [1, 2, 1]
> # [2, 1, 2]
> # [2, 2, 2]
> # >>>
> # >>> for i in cp.CPWithoutWC([1,2],121212,[1,cp.ALL],[2,cp.ALL,1]):
> #        print i
> ##  execute immediately
> # >>>
> # if You don't want to print cp. before ALL and CPWithoutWC use import
> like this:
> # from cartesianProduct import ALL,CPWithoutWC
> # CPWithoutWC is a python generator. Which means that it returns values
>
> # immediately and generate next in next cycle.
> # Program example
> #
> ## from cartesianProduct import ALL,CPWithoutWC
> ## def main():
> ##     for i in
> cp.CPWithoutWC(['a','b'],3,['a',cp.ALL,'b'],['b',cp.ALL,'a']):
> ##         ## do what You want with current value
> ##         .........
> ##         ## go back to for statement and generate new
> ## if __name__ == "__main__":
> ##     main()
> #
> """
>  Using logical combinations of WC:
>  1) It's possible to pass on to the function CPWithoutWC
>    any number of wildcarts after first two parameters, for example:
>    CPWithoutWC([1,2],121212,[1,cp.ALL],[2,cp.ALL,1],...)
>    where ... - is any other wildcart's additional function parameters.
>    Number of additional WC is not limited.
>    Function will filter out all combinations, which match any passed on
> WC.
>    It's equal to WC1 | WC2 | .... , where | is python analog of OR
> logical operations.
>  2) To use more complex WC combinations follow these steps
>    a) First of all create all needed WC
>    b) Then use operators |, & and braces () to create combinations
> required and then pass it on to function
>    CPWithoutWCEx as the third parameter. Don't use "or" and "and"
> python statement, otherwise program will
>    work improper. First two parameters of this function are the same as
> of CPWithoutWC function - set of
>    elements and CP dimension. An example of what was described above in
> command line:
>    >>> from cartesianProduct import ALL,CPWithoutWC,CPWithoutWCEx,WC
>    >>> a = WC([ALL,1,ALL])
>    >>> b = WC([ALL,2,ALL])
>    >>> c = a & b #filter out all sets which match a and b
>    >>> for i in CPWithoutWCEx([1,2],3,c) : print i
>    [1, 1, 1]
>    [2, 2, 2]
>    >>> # all sets where both 1 and 2 are present will be filtered out
>    >>> d = a | b
>    >>> for i in CPWithoutWCEx([1,2],3,d) : print i
>    >>> # returns nothing
>    >>> for i in CPWithoutWCEx([1,2,3],3,d) : print i
>    [3, 3, 3]
>    >>> a = WC([2,1,ALL])
>    >>> b = WC([1,2,ALL])
>    >>> c = WC([ALL,2])
>    >>> d = ( a | b ) & c
>    >>> for i in CPWithoutWCEx([1,2],3,d) : print i
>    [1, 1, 1]
>    [1, 1, 2]
>    [1, 2, 1]
>    [2, 1, 1]
>    [2, 2, 1]
>    [2, 2, 2]
>    >>> # filters out all combinations which start with [1,2] or [2,1]
> and end with 2
>
>    Number of WC, which are used to form logical combinations is not
> limited.
> """
> """
> 13.02.2006
>     a)Two new function - CPWithoutWCEx_L and CPWithoutWC_L are added.
>     Their interface is the same as of CPWithoutWCEx and CPWithoutWC
>     accordingly, except that the third parameter is WC list and
>     they accept strictly three parameters.
>
>     As You can see these functions are very simple because
>     python is quite flexible =>
>     >>> def s(x,y): return x * y
>     >>> d = [3,2]
>     >>> s(*d) ## == s(3,2)
>     6
>
>     b)Now WC can take string as parameter, and You can use string
>     as parameters of functions CPWithoutWC and CPWithoutWC_L
>     instead of WC lists.
>       Strings transform into WC according to these rules
>       1)If first symbol in the string is
>      alphanumeric (a-z or A-Z or 0-9) or '*'
>       character the every character of the string will be recognized as
>     a distinct set element. Examples:
>        "ad*d*" == ['a','d',cp.ALL,'d',cp.ALL]
>        "*A*b3*%^('" == [cp.ALL,'A',cp.ALL.'b','3',cp.ALL,'%','(',"'"]
>       2)If first character is not (alphanumeric or '*')
>       it will be treated as a delimitator. Examples:
>        ":a:A:1:*" == ['a','A','1',cp.ALL]
>        ":aA1:*"   == ['aA1',cp.ALL]
>        it's not necessary to write delimitators around the asterics
>       ":aA1*"     == ['aA1',cp.ALL]
>       "%aA%1*"    == ['aA','1',cp.ALL]
>       3)If all non delimit and non asterics character in elements
>      are digits => they will be treated as numbers.Examples:
>        "123*"     == [1,2,3,cp.ALL]
>        ":12:3*"   == [12,3,cp.ALL]
>        but
>        ":12:a:3*" == ['12','a','3',cp.ALL]
>       Examples of use:
> >>> for i in cp.CPWithoutWC(['a','b'],3,'a*b','b*a'):
>         print i
> ['a', 'a', 'a']
> ['a', 'b', 'a']
> ['b', 'a', 'b']
> ['b', 'b', 'b']
> >>> for i in cp.CPWithoutWC_L(['a','b'],3,['a*b','b*a']):
>         print i
> ['a', 'a', 'a']
> ['a', 'b', 'a']
> ['b', 'a', 'b']
> ['b', 'b', 'b']
> #You can mixe strings and lists for wildcarts
> >>> for i in cp.CPWithoutWC_L(['a','b'],3,['a*b',['b',cp.ALL,'a']]):
>         print i
> ['a', 'a', 'a']
> ['a', 'b', 'a']
> ['b', 'a', 'b']
> ['b', 'b', 'b']
> >>> for i in cp.CPWithoutWC_L(['abc','xyz'],3,[':abc*xyz']):
>       print i
> ['abc', 'abc', 'abc']
> ['abc', 'xyz', 'abc']
> ['xyz', 'abc', 'abc']
> ['xyz', 'abc', 'xyz']
> ['xyz', 'xyz', 'abc']
> ['xyz', 'xyz', 'xyz']
> """
> #-------------------------------------------------------------------------------
> class ALL(object):pass
> #-------------------------------------------------------------------------------
> class NO_ONE(object):pass
> #-------------------------------------------------------------------------------
> class BFunctor(object):
>   def __init__(self,func):
>     self.func = func
>   def __call__(self,*dt,**mp):
>     return self.func(*dt,**mp)
>   @classmethod
>   def OR(cls,x,y):
>     return cls(lambda *dt,**mp : x(*dt,**mp) | y(*dt,**mp))
>   @classmethod
>   def AND(cls,x,y):
>     return cls(lambda *dt,**mp : x(*dt,**mp) & y(*dt,**mp))
>
> #-----------------------------------------------------------------------------
>   def __or__(self,x):
>     return BFunctor.OR(self,x)
>
> #-----------------------------------------------------------------------------
>   def __and__(self,x):
>     return BFunctor.AND(self,x)
> #-------------------------------------------------------------------------------
> def _genAll(head,n,WCF,curr):
>   if len(curr) != 0 and n != 0:
>     for i in curr:
>       nhead = head + [i]
>       if n != 1 :
>         # needed dimension are not reached
>         # -> we mast tell WC that some other values
>         # may concatenate in the end of nhead in next recursion levels
>         # but if WC is ended with asterics (ALL), than dosn't matter
>         # so i use special walue NO_ONE to resolve this problem
>         # if WC with final asterics like [1,2,3,ALL] are matched nhead
> =>
>         # they matched nhead + [NO_ONE] to
>         # but if WC is like [1,ALL,2,3] => they dont match
> [1,2,3,NO_ONE] =>
>         # they don't prevent to generate [1,2,3,4] on next recursion
> level
>         x = WCF(nhead + [NO_ONE],curr)
>       else :      x = WCF(nhead,curr)
>       if False == x:
>         if n == 1 : yield nhead
>         else:
>           for i in _genAll(nhead,n - 1,WCF,curr):
>             yield i
>   elif n == 0 :
>     yield head
> #-------------------------------------------------------------------------------
> class WC(object):
>   def __init__(self,wc):
>     self.wc = wc
>     self.transformWC()
>     self.num_els = 0
>     self.compress()
>     self.comphdr = None
>     self.findMaxHeader()
>     self.ln = len(self.wc)
>
> #-----------------------------------------------------------------------------
>   def transformWC(self):
>     if self.wc.__class__ not in (str,unicode) : return
>     if len(self.wc) == 0 : return
>     if self.wc[0].isalnum() or self.wc[0] == "*":
>       wc = self.wc
>     else:
>       wc = self.wc[1:].split(self.wc[0])
>     nwc = []
>     for i in wc:
>       if   i == '*' : nwc.append(ALL)
>       elif '*' in i :
>         for j in i.split('*'):
>           if j : nwc.append(j)
>           nwc.append(ALL)
>         del nwc[-1]
>       else : nwc.append(i)
>     #check if all elements are numbers or *
>     allnum = True
>     for i in nwc:
>       if i is ALL : continue
>       try : int(i)
>       except :
>         allnum = False
>         break
>     if allnum:
>       for i,j in enumerate(nwc):
>         if j is not ALL:
>           nwc[i] = int(j)
>     self.wc = nwc
>
> #-----------------------------------------------------------------------------
>   def findMaxHeader(self):
>     return
>
> #-----------------------------------------------------------------------------
>   def compress(self):
>     "delete dublicated * values"
>     if len(self.wc) == 0 : return
>     wc_ = self.wc[:1]
>     for i in self.wc[1:]:
>       if i == ALL and i == wc_[-1] : continue
>       wc_.append(i)
>     self.wc = wc_
>
> #-----------------------------------------------------------------------------
>   def matchExact(self,hd,pos = 0):
>     if pos == len(self.wc) : return len(hd) == 0
>     if self.wc[pos] == ALL :
>       if pos + 1 == len(self.wc) : return True
>       vl = self.wc[pos + 1]
>       cpos = -1
>       while True:
>         try    : cpos = hd.index(vl,cpos + 1)
>         except : return False
>         if self.matchExact(hd[cpos + 1:],pos + 2) : return True
>     else:
>       if len(hd) == 0 : return False
>       if hd[0] != self.wc[pos] : return False
>       return self.matchExact(hd[1:],pos + 1)
>
> #-----------------------------------------------------------------------------
>   def __or__(self,x):
>     return BFunctor.OR(self,x)
>
> #-----------------------------------------------------------------------------
>   def __and__(self,x):
>     return BFunctor.AND(self,x)
>
> #-----------------------------------------------------------------------------
>   def __call__(self,hd,st):
>     return self.matchExact(hd)
> #-------------------------------------------------------------------------------
> def CPWithoutWCEx(set,n,wc):
>   for i in _genAll([],n,wc,set) :
>     yield i
> #-------------------------------------------------------------------------------
> def CPWithoutWC(set,n,*dt):
>   if len(dt) == 0 :
>     wc = lambda hd,st : True
>   else:
>     wc = WC(dt[0])
>     #print wc.wc
>     for i in dt[1:]:
>       wc = wc | WC(i)
>   for i in _genAll([],n,wc,set) :
>     yield i
> #-------------------------------------------------------------------------------
> def CPWithoutWC_L(set,n,WCs):
>   for i in CPWithoutWC(set,n,*WCs):
>     yield i
> #-------------------------------------------------------------------------------
> def CPWithoutWCEx_L(set,n,WCs):
>   for i in CPWithoutWCEx(set,n,*WCs):
>     yield i
> #-------------------------------------------------------------------------------
> def main():
>   for i in CPWithoutWC_L(['abc','xyz'],3,[':abc*xyz']):
>     print i
> #-------------------------------------------------------------------------------
> if __name__ == "__main__" : main()
> #-------------------------------------------------------------------------------
>