[Python-bugs-list] [ python-Bugs-628246 ] Set constructor fails with NameError

noreply@sourceforge.net noreply@sourceforge.net
Fri, 08 Nov 2002 08:03:08 -0800


Bugs item #628246, was opened at 2002-10-24 16:52
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=628246&group_id=5470

Category: Python Library
Group: Python 2.3
Status: Open
Resolution: Fixed
Priority: 5
Submitted By: Jeremy Hylton (jhylton)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: Set constructor fails with NameError

Initial Comment:
Here is a toy demo of the problem:

def baditer():
    raise TypeError
    yield 1

import sets
sets.Set(baditer())

Traceback (most recent call last):
  File "/tmp/baditer.py", line 8, in ?
    sets.Set(baditer())
  File "/usr/local/lib/python2.3/sets.py", line 367, in
__init__
    self._update(iterable)
  File "/usr/local/lib/python2.3/sets.py", line 330, in
_update
    transform = getattr(element, "_as_immutable", None)
UnboundLocalError: local variable 'element' referenced
before assignment

The _update() method has a try/except for TypeError
that is trying to catch failures to place set elements
in the dictionary.  The try/except also wraps a for
loop over an iterator.  So if something goes wrong and
the iterator raises TypeError, the
failed-to-insert-element-in-dict code is run.

The obvious solution, it seems, is to have the
try/except inside the for loop so that it only catches
errors in dict insertion.  Tim points out that this
would be slow.

The example above is a toy, but I ran into this bug in
a real application where debugging was made harded
because the Set code was catching an exception when it
shouldn't.


----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2002-11-08 11:03

Message:
Logged In: YES 
user_id=31435

Guido, to what is this a counterexample?  It's a little too 
cryptic.

If you try sets.Set(C()), it raises TypeError under the code 
that's been checked in, and it would also raise TypeError 
under the suggestion (via "the code we've got now").

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-11-08 07:27

Message:
Logged In: YES 
user_id=6380

Counterexample:

class C:
    def __init__(self, n=10):
        self.n = 10
    def __iter__(self):
        return self
    def next(self):
        self.n -= 1
        if self.n < 0:
            raise StopIteration
        return self
    def __hash__(self):
        raise TypeError


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-11-08 02:52

Message:
Logged In: YES 
user_id=31435

Ah, of course -- that's what Jeremy saw the first time 
around!  Sorry about that.  Set element=it before the loop.  
Then UnboundLocalError is impossible, and it's also 
impossible for the iterator to leave element at that value 
(unless it never produces a value):

except TypeError:
    try:
        data[element] = value
    except TypeError:
        the code we've got now
    else:
        # TypeError raised by iterator
        if element is it:
            # iterator didn't produce any values;
            # get rid of the bogus entry we just added
            del element[it]
        raise

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-11-08 01:12

Message:
Logged In: YES 
user_id=80475

The line, "for element in it:" may fail immediately, 
before "element" is ever assigned.  The re-test of "date
[element] = value" then raises an UnboundLocalError.  

Did you want to trap that exception too?  The resulting 
code is somewhat ugly but is fast and clear about its 
intentions.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-11-08 00:27

Message:
Logged In: YES 
user_id=31435

Reopened.  The difficulty remaining appears to be that 
we're assuming TypeError was raised by the

    data[element] = value

part, when it *may* have been raised by the

    for element in it:

part.  If so, this could be addressed directly instead of via 
materializing the whole iterable into a list first.  Like:

except TypeError:
    try:
        data[element] = value
    except TypeError:
         the code we've got now
    else:
         raise   # must have come from iteration

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-11-08 00:05

Message:
Logged In: YES 
user_id=80475

Done.
See sets.py 1.32 and test/test_sets.py 1.14.
Marking as fixed and closing.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-11-07 20:06

Message:
Logged In: YES 
user_id=6380

Yikes! You meant type(iterable), not type(iter).

I expect this will get complaints from people who write
careful iterators.

But it seems the best compromise for now, so check it in
(with above fix). I doubt this is the last word though. :-(

It needs a unittest that tries this with each of the
mentioned types as well as with something that's not one of
them.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-11-07 18:07

Message:
Logged In: YES 
user_id=80475

Revised patch that avoids performance issues in cases 
where the iterator won't raise errors during iteration.

For the other cases, list(iter) is still faster than inserting a 
try/except inside the loop.

Please pronounce one way or the other and then re-assign 
to me.  I'll move the try/except back inside the loop if 
that's what you want.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-11-02 19:02

Message:
Logged In: YES 
user_id=6380

Hmm...  It's the performance waste of always making an
unnecessary copy (the argument will often already be a list,
or a tuple) vs. the performance waste of doing try/except
each iteration.

Maybe it should all be replaced by this?

        for element in iterable:
            transform = getattr(element, "_as_immutable", None)
            if transform is not None:
                element = transform()
            data[element] = value


----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-10-31 23:09

Message:
Logged In: YES 
user_id=80475

Yes.  The intended behavior is to have list(iterable) run the 
iterable start to finish so that any exceptions raised will 
not be trapped by the Sets code during dictionary 
construction.

Feel free to disregard the idea. I thought it had some merit 
because it is simple, fast and fixes the problem.  Still, it 
looks weird and I won't lose any sleep if you toss the idea.





----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-10-31 21:03

Message:
Logged In: YES 
user_id=6380

That patch makes no sense -- it first materializes the
iterator as a list, then iterates over that. Was that what
you intended?

I'd rather give up on the speediness and put the try/except
where it belongs.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-10-25 23:39

Message:
Logged In: YES 
user_id=80475

A simple solution is to run through the iterator once before 
starting the update.  A one line patch is attached.

The only thing I don't like about it is that the code is 
starting to look hare-brained.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=628246&group_id=5470