[Pythonmac-SIG] Fast way to get a list of unique values?
landauer@got.net
landauer@got.net
Mon, 9 Sep 2002 17:30:57 -0700
Quoting Robb Brown <brownr@ucalgary.ca>:
>
> I have a large array (Numeric) of integers. I'd like to get a list
> of all the unique values in that array. Naturally I'd like to avoid
> a for loop.
Here's one way. It does avoid using a for loop.
I'm guessing that it isn't quite exactly what you wanted.
Fun exercise, though. See below for a more serious suggestion.
(This is *Python*, after all...)
#!/usr/bin/env python
#
# Robb has a "large array (Numeric) of integers. I'd like to get
# a list of all the unique values in that array. Naturally I'd like
# to avoid a for loop. Is there an easy way to do this in Python?
#
# No mention of speed ...
#
from operator import __mul__
# --- from the cookbook, w/ slight mods --
# Sieve of Eratosthenes
# David Eppstein, UC Irvine, 28 Feb 2002
# From the ASPN Python cookbook ...
# http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/117119
#
from __future__ import generators
def eratosthenes ():
'''Yields the sequence of prime numbers via the Sieve of Eratosthenes.'''
D = {} # map composite integers to primes witnessing their compositeness
q = 2 # first integer to test for primality
while 1:
if q not in D:
yield q # not marked composite, must be prime
D[q*q] = [q] # first multiple of q not already marked
else:
map( lambda p: D.setdefault(p+q,[]).append(p), D[q] )
del D[q] # no longer need D[q], free memory
q += 1
primes_iter = eratosthenes()
primes_list = [ ]
prim_d = { }
def primes ( n ):
while len(primes_list) <= n:
nx = primes_iter.next()
primes_list.append( nx )
prim_d[ nx ] = len(primes_list)-1
return primes_list[n]
def uniq ( arr ):
# map the array into a goedel number of sorts
gm = min( arr )
gx = max( arr )
# offset the list to n in [ 0 .. max-min+1 ]
# and then find the nth prime
bias = map( lambda x: primes(x-gm), arr )
# blast all those primes together
goedel = reduce( __mul__, bias )
# find the ones that factor into goedel
fax = filter( lambda n: goedel%n == 0, primes_list )
# print len(arr), arr
# print bias
# print goedel
# print len(fax), fax
return map( lambda z: prim_d[z] + gm, fax )
if __name__ == "__main__":
def test_e ():
iter = eratosthenes()
lst = []
while 1:
ff = iter.next()
lst.append( ff )
print ff
if ff > 1000:
break
print lst[ -12: ]
def test_p ():
print primes( 12 )
def test ():
arr = [ -9, -2, -4, 0, 19, 2, 42, -9, -2, -4, 0, 19, 2, 421, 32, 42 ]
print uniq( arr )
test_e()
test_p()
test()
> Is there an easy way to do this in Python?
You do not tell us where you acquired that "large array".
My suggestion, given the lack of any other detail, would be to see if
it's possible to create that array as a dict instead (with the numbers
as keys, as Bob mentioned). I.e., do the unique-ifying while building
the array, instead of treating it as a separate problem.
-=-=-=-
food. shelter. clothing. net. Got.net - The Internet Connection, Inc