[python-win32] Identify unique data from sequence array

otrov dejan.org at gmail.com
Wed Dec 22 13:11:43 CET 2010


Hi,
My first attempt to solve this problem with MATLAB/Octave failed, as I have only just started using these tools for data manipulation. I then thought I would try Python, as a more feature-rich and descriptive language, and post the problem to a Python group I am already subscribed to.

Let's consider this simple array object (a scipy array):

X = array([[1, 2],
           [1, 2],
           [2, 2],
           [3, 1],
           [2, 3],
           [1, 2],
           [1, 2],
           [2, 2],
           [3, 1],
           [2, 3],
           [1, 2],
           [1, 2],
           [2, 2],
           [3, 1],
           [2, 3],
           ...,
           [1, 2],
           [1, 2],
           [2, 2],
           [3, 1],
           [2, 3]])

I would like to extract the repeating sequence of rows:

Y = array([[1, 2],
           [1, 2],
           [2, 2],
           [3, 1],
           [2, 3]])

as a result.

The arrays consist of 10^7 to 10^8 elements, and the unique sequence consists of at most 10^6 elements, usually fewer, around 10^5.
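
To make the task concrete, here is a minimal sketch of one possible approach using numpy. It assumes the array is two-dimensional and is made up of the block repeated back to back starting at row 0, possibly cut off partway through the last repetition. The function name repeating_block and the max_period parameter are hypothetical names introduced only for this illustration. The idea is to treat every row offset p where row p equals row 0 as a candidate period, and to accept the first p for which shifting the array by p rows leaves the overlapping part unchanged.

```python
import numpy as np

def repeating_block(X, max_period=None):
    """Return the shortest leading block of rows whose repetition gives X.

    Assumes X is 2-D and the repetition starts at row 0 (an assumption of
    this sketch, not something stated in the problem description).
    """
    X = np.asarray(X)
    n = len(X)
    if n == 0:
        return X
    if max_period is None:
        max_period = n
    # Candidate periods: offsets p >= 1 where row p is identical to row 0.
    matches = np.all(X[1:max_period + 1] == X[0], axis=1)
    candidates = np.flatnonzero(matches) + 1
    for p in candidates:
        # X has period p if rows p.. equal rows 0..n-p-1.
        if np.array_equal(X[p:], X[:-p]):
            return X[:p]
    # No shorter period found: the whole array is its own block.
    return X
```

With the unique sequence known to be at most 10^6 rows, passing that bound as max_period limits how many candidates are examined. Each candidate check compares the whole array once, so the cost grows with the number of rows that happen to equal the first row.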

Thanks for your time


