On 12/12/2009 22:55, T J wrote:
Hi,
Suppose I have an array of shape: (n, k, k). In this case, I have n k-by-k matrices. My goal is to compute the product of a (potentially large) user-specified selection (with replacement) of these matrices. For example,
x = [0,1,2,1,3,3,2,1,3,2,1,5,3,2,3,5,2,5,3,2,1,3,5,6]
TJ, what are your n, k, len(x) ? _dotblas.dot is fast: dot( 10x10 matrices ) takes ~ 22 usec on my g4 ppc, which is ~ 15 clock cycles (700 MHz) per mem access * +. A hack to find repeated pairs (or triples ...) follows. Your sequence above has only (3,2) 4 times, no win. (Can someone give a probabilistic estimate of the number of non-overlapping pairs in N letters from an alphabet of size A ?) #!/usr/bin/env python # numpy-discuss 2009 12dec TJ repeated dot products from __future__ import division from collections import defaultdict import numpy as np __version__ = "2010 7jan denis" def pairs( s, Len=2 ): """ repeated non-overlapping pairs (substrings, subwords) "abracadabra" -> ab ra [[0 7] [2 9]], not br Len=3: triples, 4 ... """ # bruteforce # grow repeated 2 3 ... ? pairs = defaultdict(list) for j in range(len(s)-Len+1): pairs[ s[j:j+Len] ].append(j) min2 = filter( lambda x: len(x) > 1, pairs.values() ) min2.sort( key = lambda x: len(x), reverse=True ) # remove overlaps -- # (if many, during init scan would be faster) runs = np.zeros( len(s), np.uint8 ) run = np.ones( Len, np.uint8 ) run[0] = Len chains = [] for ovchain in min2: chain = [] for c in ovchain: if not runs[c:c+Len].any(): runs[c:c+Len] = run chain.append(c) if len(chain) > 1: chains.append(chain) return (chains, runs) #............................................................................... if __name__ == "__main__": import sys abra = "abracadabra" alph = 5 randlen = 100 randseed = 1 exec( "\n".join( sys.argv[1:] )) # Test= ... print "pairs( %s ) --" % abra print pairs( abra ) # ab [0, 7], br [2, 9]] print pairs( abra, 3 ) # abr [0, 7] np.random.seed( randseed ) r = np.random.random_integers( 1, alph, randlen ) chains, runs = pairs( tuple(r) ) npair = sum([ len(c) for c in chains ]) print "%d repeated pairs in %d random %d" % (npair, randlen, alph) # 35 repeated pairs in 100 random 5 (prob estimate this ?) # 25 repeated pairs in 100 random 10