finding most common elements between thousands of multiple arrays.
Neil Crighton
neilcrighton at gmail.com
Sat Jul 4 09:39:41 EDT 2009
You can join all your arrays into a single big array with concatenate.
>>> import numpy as np
>>> a = np.concatenate(array_of_arrays)
Then count the number of occurrences of each unique element using this trick
with searchsorted on the sorted array. This should be pretty fast.
>>> a.sort()
>>> unique_a = np.unique(a)
>>> count = []
>>> for val in unique_a:
...     count.append(a.searchsorted(val, side='right') -
...                  a.searchsorted(val, side='left'))
...
>>> mostcommonvals = unique_a[np.argsort(count)[-25:]]
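For comparison, here is a minimal self-contained sketch of the same idea that
avoids the Python loop. It swaps the searchsorted trick for the return_counts
option of np.unique, which is assumed to be available (it is in newer NumPy
releases). The sample data and the helper name most_common are made up for
illustration; they are not from the original question.

```python
import numpy as np

# Hypothetical sample data standing in for array_of_arrays.
array_of_arrays = [
    np.array([1, 2, 3, 3]),
    np.array([3, 4, 4, 5]),
    np.array([3, 4, 6]),
]

def most_common(arrays, n=25):
    """Return the n most frequent values and their counts, most frequent first."""
    a = np.concatenate(arrays)
    # np.unique sorts internally and can return the count of each unique value.
    values, counts = np.unique(a, return_counts=True)
    # Indices of the n largest counts, in descending order of count.
    order = np.argsort(counts, kind="stable")[::-1][:n]
    return values[order], counts[order]

vals, cnts = most_common(array_of_arrays, n=2)
print(vals, cnts)
```

Unlike the slice [-25:] above, which lists the values in ascending order of
count, this sketch returns them with the most frequent value first.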
Neil