2.3 list reverse() bug?

Bjorn Pettersen bjorn.pettersen at comcast.net
Sat Dec 27 19:04:52 CET 2003


Alex Martelli <aleax at aleax.it> wrote in news:QuWGb.9617$_P.378641
@news4.tin.it:

> Bjorn Pettersen wrote:
>    ...
>> There are of course times where a copy is needed (e.g. multi-version
>> scenarios) but most programmers don't bump into these very frequently...
> 
> Here's my candidate for "most frequent scenario where copies are needed":
> 
> for item in somelist:
>     if froobable(item):
>         somelist.append(frooble(item))
>     elif uncadable(item):
>         somelist.remove(item)
> 
> i.e.: you want to modify the very list you're looping on.  This is one
> of Python's few but important "traps and pitfalls".

That is an interesting version in that it iterates while either (1) 
appending or (2) removing. I.e. froobable(frooble(item)) must always be 
false to prevent infinite recursion.  I don't think I've seen one of these 
and it made me curious, do you have a concrete example?

The more frequently encountered algorithm is one the iterates while either 
(1) mutating or (2) removing items [and a gazzilion variations you can find 
in any Scheme/ML/Lisp/etc. introduction (current pet-peeve: aren't there 
interesting things you could do in an introduction to those languages?)].

First the version using a copy, pretty straight forward, but only able to 
deal with lists of mutable objects (see last solution below):

  # O(n**2)    
  for item in somelist[:]: # or copy.copy(somelist) or list(somelist)
      if shouldModify(item):
          mutate(item)  # assuming item is mutable in the Python sense
      elif shouldRemove(item):
          somelist.remove(item)
      else:
          pass

My first instinct would be a "functional" approach, creating a new correct 
list and abandoning the old incorrect version:

  # O(n)
  tmp = []
  for item in somelist:
      if shouldModify(item):
           tmp.append(mutate(item))
      elif shouldRemove(item):
           pass
      else:
           tmp.append(item)
  somelist = tmp

  ; it doesn't look quite so forced in [pseudo-]lisp [I didn't 
  ; count the parens or actually run it -- I'm sure there are
  ; enough lispers here to correct it <grin/wink>].
  (defun frooble (lst)
     (cond ((shouldModify (car lst)) 
               (cons (mutate (car lst)) (frooble (cdr lst)))
           ((shouldRemove (car lst))
               (frooble (cdr lst)))
           (t (cons (car lst) (frooble (cdr lst))))))


if I was programming in C++ or especially C, a pure imperative approach 
would probably bubble to the top (assuming the list could have variable 
size without leaking memory). The fact that I wrote a bug in this code 
originally, without noticing, would (at least for me) be filed in the 
"don't destroy what you're iterating over" category. Alternatively, it 
would indicate that you are using the wrong datastructure, i.e. an array, 
instead of some sort of linked list.

  # O(n**2)
  i = 0
  while i < len(somelist):
      if shouldModify(item):
          somelist[i] = mutate(item)
          i += 1
      elif shouldRemove(item):
          del somelist[i]
      else:
          i += 1

I thought I found a solution using enumerate, but it was a mirage ;-) So 
far the functional solution still seems like a winner, it's both faster 
(yeah, I know, see below), and easy to verify for correctness visually (in 
fact, it's set up for an inductive proof, if that exites anyone ;-)

Returning to the example in the beginning, the functional version would be

  tmp = []; tail = []
  for item in somelist:
      if froobable(item):
          tail.append(frooble(item))
      elif uncadable(item):
          pass
      else:
          tmp.append(item)
  somelist = tmp + tail

provided froobable(frooble(item)) is always false. If that is not the case, 
it would make a functional solution messier. (but I think that would be a 
rather perverse problem space ;-)

The case I've seen come up repeatedly is a little simple, only involving 
deletions of the list:

  x = range(30)
  for i in x:
      if 5 <= i <= 10: # try to remove 5,6,7,8,9,10
          x.remove(i)  

the above removes 6,8,10 instead. The solution doesn't require a copy to 
iterate over though, a list comprehension will do (as will calling filter 
if you want to go against the times)::

  x = [n for n in x if not 5 <= n <= 10]  

-- bjorn

(*)
# I realize there are quite a number of Python programmers that aren't
# CS majors. Big-O notation [ O(n) and O(n**2) above ] is simply a 
# measure of how bad the worst case scenario of a function would be 
# (most frequently) measured in operations, and for the cases above
# I'll make another absolute statement <wink> and say that so few 
# people will work with lists big enough to ever care that it's 
# irrellevant largely  :-)

The quatratic complexity comes from deleting an item in the middle of the 
list, since this causes all items above the empty spot to be shifted down. 
Shifting all elements down is O(n), and you could end up doing it for every 
element, <waving hands> O(n) work for each of the O(n) elements gives O(n**
2)</wave>







More information about the Python-list mailing list