[Numpy-discussion] Fancier indexing

Thu May 22 12:15:14 EDT 2008

On Thu, May 22, 2008 at 9:08 AM, Keith Goodman <kwgoodman at gmail.com> wrote:
> On Thu, May 22, 2008 at 8:59 AM, Kevin Jacobs <jacobs at bioinformed.com>
> <bioinformed at gmail.com> wrote:
>> After poking around for a bit, I was wondering if there was a faster method
>> for the following:
>>
>> # Array of index values 0..n
>> items = numpy.array([0,3,2,1,4,2],dtype=int)
>>
>> # Count the number of occurrences of each index
>> counts = numpy.zeros(5, dtype=int)
>> for i in items:
>>   counts[i] += 1
>>
>> In my real code, 'items' contains up to a million values and this loop will
>> be in a performance-critical area of code.  If there is no simple solution,
>> I can trivially code this using the C-API.
>
> How big is n? If it is much smaller than a million then loop over the n
> possible index values instead.
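
To make that suggestion concrete, here is a rough sketch of looping over the
possible values rather than over the elements. It assumes n is known in
advance (5 in the example above); the variable name 'n' is mine, not from
the original code.

```python
import numpy

items = numpy.array([0, 3, 2, 1, 4, 2], dtype=int)
n = 5  # assumed: number of distinct index values, known in advance

# One vectorized comparison per possible value, instead of one
# Python-level iteration per element of 'items'.
counts = numpy.array([(items == i).sum() for i in range(n)])
```

This does n passes over 'items', so it only pays off when n is small
compared to len(items).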

Or how about using a list instead:

>> items = [0,3,2,1,4,2]
>> uitems = sorted(set(items))  # sets are unordered; sort so counts line up with 0..4
>> count = [items.count(i) for i in uitems]
>> count
   [1, 1, 2, 1, 1]
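
Note that items.count(i) rescans the whole list once per unique value, so
this gets slow when there are many distinct values. Another option, not
proposed above, might be numpy.bincount, which counts occurrences of
non-negative integers in a single pass. A minimal sketch, assuming 'items'
holds only non-negative ints:

```python
import numpy

items = numpy.array([0, 3, 2, 1, 4, 2], dtype=int)

# bincount returns an array of length items.max() + 1, where
# counts[i] is the number of times i appears in 'items'.
counts = numpy.bincount(items)
```

The result has one slot per value from 0 up to the largest value present,
so values that never occur get a count of 0.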