[Numpy-discussion] converting discrete data to unique integers

David Warde-Farley dwf at cs.toronto.edu
Wed Nov 4 15:09:45 EST 2009


Suppose I have an array 'd':

In [75]: d
array(['parrot', 'parrot', 'dog', 'cat', 'parrot', 'dog', 'parrot',  
        'dog', 'dog', 'dog', 'cat', 'cat', 'dog', 'cat', 'parrot',  
        'cat', 'dog', 'parrot', 'parrot', 'parrot', 'cat', 'dog',  
        'dog', 'dog', 'dog', 'dog', 'parrot', 'parrot', 'cat', 'dog',
        'parrot', 'cat', 'parrot', 'cat', 'dog', 'parrot', 'cat',  
        'cat', 'parrot', 'parrot', 'parrot', 'parrot', 'dog', 'cat',
        'parrot', 'cat'],

I'd like to map every unique element (these could be strings, objects,  
or already ints) to a unique integer between 0 and len(unique(d)) - 1.

The solution I've come up with is:

In [76]: uniqueind, vectorind = (d == unique(d)[:, newaxis]).nonzero()

In [77]: myints = uniqueind[argsort(vectorind)]
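Spelled out as a self-contained script, the idea is roughly the following. This is a sketch only: the short array is an illustrative stand-in for 'd', the name 'labels' is introduced here for readability, and the functions are written with an explicit np. prefix rather than the bare names used in the session above.

```python
import numpy as np

# Small stand-in for d; the values are illustrative, not the full array above.
d = np.array(['parrot', 'dog', 'cat', 'parrot', 'dog'])

# Sorted unique values: 'cat', 'dog', 'parrot'.
labels = np.unique(d)

# Compare every unique value (rows) against every element of d (columns).
# Each column contains exactly one True, in the row of its matching label.
uniqueind, vectorind = (d == labels[:, np.newaxis]).nonzero()

# nonzero() reports the hits row by row, so reorder them by position in d.
myints = uniqueind[np.argsort(vectorind)]

print(myints)  # [2 1 0 2 1]
```

Note that the intermediate boolean array has len(unique(d)) * len(d) entries, so memory use grows with both the number of distinct values and the length of 'd'.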

But I wonder if there's a better way to do this. Anyone ever run into  
this problem before?
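
One candidate I have not benchmarked, assuming the installed NumPy's unique() accepts the return_inverse keyword and that the elements can be sorted, would be to let unique() hand back the codes directly. Again the short array is only a stand-in for 'd', and 'labels' is a name introduced for this sketch.

```python
import numpy as np

# Small stand-in for d; the values are illustrative, not the full array above.
d = np.array(['parrot', 'dog', 'cat', 'parrot', 'dog'])

# labels holds the sorted unique values; myints holds, for each element
# of d, the index of its value within labels.
labels, myints = np.unique(d, return_inverse=True)

print(labels)  # ['cat' 'dog' 'parrot']
print(myints)  # [2 1 0 2 1]
```

This avoids building the full comparison matrix, and labels[myints] reconstructs the original array.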

