Efficiency

Luis Leonel Lopez luisleonellopez at hotmail.com
Mon May 28 10:35:56 EDT 2001


Dear friends,

I wrote two functions which receive a sequence and return a list with
non-duplicate elements. As per "unique" function's Tim Peters (as is in
Python Cookbook:

def unique(s):
   """Return a list of the elements in s, but without duplicates.

   For example, unique([1,2,3,1,2,3]) is some permutation of [1,2,3],
   unique("abcabc") some permutation of ["a", "b", "c"], and
   unique(([1, 2], [2, 3], [1, 2])) some permutation of
   [[2, 3], [1, 2]].

   For best speed, all sequence elements should be hashable.  Then
   unique() will usually work in linear time.

   If not possible, the sequence elements should enjoy a total
   ordering, and if list(s).sort() doesn't raise TypeError it's
   assumed that they do enjoy a total ordering.  Then unique() will
   usually work in O(N*log2(N)) time.

   If that's not possible either, the sequence elements must support
   equality-testing.  Then unique() will usually work in quadratic
   time.
   """

   n = len(s)
   if n == 0:
       return []

   # Try using a dict first, as that's the fastest and will usually
   # work.  If it doesn't work, it will usually fail quickly, so it
   # usually doesn't cost much to *try* it.  It requires that all the
   # sequence elements be hashable, and support equality comparison.
   u = {}
   try:
       for x in s:
           u[x] = 1
   except TypeError:
       del u  # move on to the next method
   else:
       return u.keys()

   # We can't hash all the elements.  Second fastest is to sort,
   # which brings the equal elements together; then duplicates are
   # easy to weed out in a single pass.
   # NOTE:  Python's list.sort() was designed to be efficient in the
   # presence of many duplicate elements.  This isn't true of all
   # sort functions in all languages or libraries, so this approach
   # is more effective in Python than it may be elsewhere.
   try:
       t = list(s)
       t.sort()
   except TypeError:
       del t  # move on to the next method
   else:
       assert n > 0
       last = t[0]
       lasti = i = 1
       while i < n:
           if t[i] != last:
               t[lasti] = last = t[i]
               lasti += 1
           i += 1
       return t[:lasti]

   # Brute force is all that's left.
   u = []
   for x in s:
       if x not in u:
           u.append(x)
   return u

), my functions use the slowest method by brute force. I'd want to know why
and which of my functions is better or efficient and why.

def onlyOne1(s)
  r = []
  for x in s:
    if x not in r:
      r.append(x)
  return r

def onlyOne2(s):
  r = []
  for x in s:
    try:
      r.index(x)
    except:
      r.append(x)
  return r

Thank you in advance!

Luis Leonel Lopez
_________________________________________________________________________
Get Your Private, Free E-mail from MSN Hotmail at http://www.hotmail.com.





More information about the Python-list mailing list