finding most common elements between thousands of multiple arrays.

Neil Crighton neilcrighton at gmail.com
Sat Jul 4 09:39:41 EDT 2009


You can join all your arrays into a single big array with concatenate.

>>> import numpy as np
>>> a = np.concatenate(array_of_arrays)

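Then count the number of occurrences of each unique element using this trick
with searchsorted: in the sorted array, the difference between the right and
left insertion points of a value is the number of times it occurs. This should
be pretty fast. The last line picks out the 25 most common values, ordered
from least to most frequent.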

>>> a.sort()
>>> unique_a = np.unique(a)
>>> count = []
>>> for val in unique_a:
...     count.append(a.searchsorted(val, side='right') -
...                  a.searchsorted(val, side='left'))
...
>>> mostcommonvals = unique_a[np.argsort(count)[-25:]]
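
The same idea can probably be written without the Python loop, since
searchsorted also accepts an array of values and returns all the insertion
points in one call. The following is only a sketch: the sample data and the
function name most_common are made up for illustration, and array_of_arrays
is assumed to be a list of 1-D integer arrays.

```python
import numpy as np

# Hypothetical sample data standing in for array_of_arrays.
array_of_arrays = [
    np.array([1, 2, 3, 3]),
    np.array([3, 4, 4]),
    np.array([4, 4, 5]),
]

def most_common(arrays, n):
    """Return the n most frequent values (least to most frequent) and their counts."""
    # Join everything into one array and sort it (np.sort returns a sorted copy).
    a = np.sort(np.concatenate(arrays))
    unique_a = np.unique(a)
    # Passing the whole array of unique values gives every count at once:
    # right insertion point minus left insertion point for each value.
    counts = (a.searchsorted(unique_a, side='right')
              - a.searchsorted(unique_a, side='left'))
    # argsort is ascending, so the last n indices are the most frequent values.
    order = np.argsort(counts)[-n:]
    return unique_a[order], counts[order]

vals, counts = most_common(array_of_arrays, 2)
```

With this sample data, vals is [3, 4] and counts is [3, 4]. On NumPy 1.9 or
later, np.unique(a, return_counts=True) should give the same counts directly.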


Neil



