[Python-bugs-list] [ python-Bugs-478534 ] SystemError with WeakKeyDictionary

noreply@sourceforge.net noreply@sourceforge.net
Mon, 10 Dec 2001 15:28:30 -0800


Bugs item #478534, was opened at 2001-11-05 19:15
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=478534&group_id=5470

Category: Python Interpreter Core
Group: Python 2.2
Status: Open
Resolution: Works For Me
Priority: 6
Submitted By: Sverker Nilsson (svenil)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: SystemError with WeakKeyDictionary

Initial Comment:
SystemError with WeakKeyDictionary

A SystemError is generated when trying
to iterate over a function returned from a
function that stored it in a WeakKeyDictionary.
The expected error was TypeError.

Sverker Nilsson

Examples:

This program gives a SystemError:

import weakref
ref = weakref.WeakKeyDictionary()

def encapsulate():
    f = lambda : ()
    ref[f] = None
    return f

for x in encapsulate():
 print x

Python 2.2b1 (#5, Oct 20 2001, 03:03:53) 
[GCC 2.95.2 20000220 (Debian GNU/Linux)] on linux2
Type "help", "copyright", "credits" or "license" for
more information.
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/tmp/pythona04442", line 9, in ?
    for x in encapsulate():
SystemError: error return without exception set
>>> 

A variation of it, with a temporary variable,
gives the expected TypeError:

import weakref
ref = weakref.WeakKeyDictionary()

def encapsulate():
    f = lambda : ()
    ref[f] = None
    return f

g = encapsulate()
for x in g:
 print x


Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/tmp/pythona04442", line 10, in ?
    for x in g:
TypeError: iteration over non-sequence
>>>



----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2001-12-10 15:28

Message:
Logged In: YES 
user_id=31435

Guido sez:

"""
The correct rule is that if a deallocator calls into Python,
it must save and restore any pending exception condition
using PyErr_Fetch() and PyErr_Restore().
"""

I don't think it's that easy:  as bug 485153 pointed out, 
you can get into trouble just calling C API functions with 
a stale exception set.  The deallocator may call a C API 
function whose internals use Py_ErrOccurred() to detect 
whether what the latter calls raised an error.  But if an 
error is set upon entry to the deallocator, Py_ErrOccurred
() will trigger in the callees whether or not the error is 
fresh.  It's like leaving a stale non-zero errno value set 
when calling a function that checks errno after a call but 
doesn't clear errno before the call.

So I expect a more correct rule is:

If a deallocator makes any C API calls, then before making 
the first it must save any pending exception condition
using PyErr_Fetch(), clear the exception before making the 
C API call, and restore the pending exception via 
PyErr_Restore() after the last C API call -- unless it can 
prove the C API call can't be fooled by a pending exception 
(which it can't in theory, but is often true nevertheless 
in practice).

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-12-10 14:03

Message:
Logged In: YES 
user_id=6380

[Fred]
> Essentially, deallocators (even those in C) should not be
> entered as long as an exception is set, and this is a case
> where it can happen.

Not true, IMO. If by deallocator you mean the function in
the tp_dealloc field of a type object, that gets called by
the Py_DECREF() macro, and Py_DECREF() gets called all the
time with a pending exception (lots of code does cleanup
involving Py_DECREF() before passing an exception on to its
caller).

The correct rule is that if a deallocator calls into Python,
it must save and restore any pending exception condition
using PyErr_Fetch() and PyErr_Restore().  This is how
__del__ finalizers get called. I suppose you could put such
a call around each of the PyObject_CallFunction(callback,
...) call in PyObject_ClearWeakRefs(), or you could puy it
around most of that function's body (since there are several
calls). Or you might have a flag variable that tells you
whether you have already called PyErr_Fetch(), and only call
PyErr_Restore() at the function exit when the flag is set.
Personally, this function looks ripe for refactoring...

----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-12-10 13:31

Message:
Logged In: YES 
user_id=3066

Let me try to explain (part of) what is happening in a debug
build:

1. Object F is referenced only in the locals of a function;
   there is a weakref with a callback function for F.
2. An exception gets raised while that frame is still around.
3. When the locals of that frame are DECREF'd,
PyObject_ClearWeakRefs() is called for F.
4. PyObject_ClearWeakRefs() calls the callback function.
5. eval_frame() detects that an exception was raised without
a NULL return,
   and turns it into a real exception.
6. frame_dealloc() removes the locals, which causes
PyObject_ClearWeakRefs()
   to be called for F.
7. PyObject_ClearWeakRefs() sees the NULL return and calls
   PyErr_WriteUnraisable(), which (eventually) clears the
exception.
8. PyObject_ClearWeakRefs() returns, and the frame which
contained
   the reference to F continues to act as if there were an
exception.

Well, I don't think this should end up dumping core, but
clearly things are pretty messed up at this point. 
Essentially, deallocators (even those in C) should not be
entered as long as an exception is set, and this is a case
where it can happen.

I can add code to protect against this for callbacks
triggered by the weakref code, but that doesn't solve the
general problem of which this is a specific case.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-12-10 12:34

Message:
Logged In: YES 
user_id=6380

Fred, do you need help looking at this?  If this needs to be
fixed before the release, please raise priority to 7 or
higher.

----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-12-10 12:22

Message:
Logged In: YES 
user_id=3066

This is a really vicious bug.  I've simplified the test case
some more, but I'm still not sure what needs to be done to
fix this.  I'm not sure that I can make the test case
smaller; raising the exception is still required.  I expect
this is still a matter of an exception not being properly
detected and cleared at some point.

----------------------------------------------------------------------

Comment By: Sverker Nilsson (svenil)
Date: 2001-11-07 09:55

Message:
Logged In: YES 
user_id=356603

Seems the files didnt make it, sorry for the fuzz.. trying
again. S.

----------------------------------------------------------------------

Comment By: Sverker Nilsson (svenil)
Date: 2001-11-07 09:50

Message:
Logged In: YES 
user_id=356603

wrbug4.gdb-transcript contains a raw gdb transcript, in case
it can be of some use / Sverker

----------------------------------------------------------------------

Comment By: Sverker Nilsson (svenil)
Date: 2001-11-07 09:47

Message:
Logged In: YES 
user_id=356603

I couldnt figure out how to upload several files with the
same comment. Well, this is to tell you that wrbug4.mail
contains some more info. /Sverker

----------------------------------------------------------------------

Comment By: Sverker Nilsson (svenil)
Date: 2001-11-07 09:42

Message:
Logged In: YES 
user_id=356603

I have got the cvs version now. It also gives the error or
segmentation fault. Either compiled with Py_Debug enabled in
configuration, or with another version of my test program.
(The Py_Debug definition changes what Python does in
eval_frame so the error is revealed in the original test. )
See attached files: wrbug4.py, new test program.
wrbug4.mail, some info. wrbug4.gdb-transcript, a raw gdb
transcript... /Sverker

----------------------------------------------------------------------

Comment By: Sverker Nilsson (svenil)
Date: 2001-11-06 11:07

Message:
Logged In: YES 
user_id=356603

Ooops, in the example I sent you I hadnt removed
the self parameter after all. I  mixed up something.
But when I did remove the self parameter it went
as I said. Actually, I get a segmentation fault
running it directly from the shell. (When running
it from Emacs mode or importing it I get SystemError.)

Sverker

Shell transcript:

nicosys [228] python defenvbug3.py
first add ok
Segmentation fault
nicosys [229] cat defenvbug3.py
import weakref
ref = weakref.WeakKeyDictionary()

class MyError(Exception):
    pass

def add(x):
    raise MyError

def encapsulate():
    f = lambda : ()
    ref[f] = None
    return f

try:
    add(encapsulate())
except MyError: pass

print 'first add ok'

add(encapsulate())
# Here we would like to get Exception but gets SystemError
instead.


nicosys [230] python           
Python 2.2b1 (#5, Oct 20 2001, 03:03:53) 
[GCC 2.95.2 20000220 (Debian GNU/Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more
information.
>>> import defenvbug3.py
first add ok
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
SystemError: NULL result without error in PyObject_Call
>>> 
nicosys [231] whence python
/usr/local/bin/python is actually /hda7/local/bin/python
/usr/local/bin/python: ELF 32-bit LSB executable i386 (386
and up) Version 1
nicosys [232] ldd /hda7/local/bin/python
        libdl.so.2 => /lib/libdl.so.2 (0x40018000)
        libpthread.so.0 => /lib/libpthread.so.0 (0x4001d000)
        libutil.so.1 => /lib/libutil.so.1 (0x40030000)
        libm.so.6 => /lib/libm.so.6 (0x40033000)
        libc.so.6 => /lib/libc.so.6 (0x40052000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2
(0x40000000)
nicosys [233] 


----------------------------------------------------------------------

Comment By: Sverker Nilsson (svenil)
Date: 2001-11-06 10:47

Message:
Logged In: YES 
user_id=356603

To answer your question, I am using 2.2b1,
from the tarball not from CVS.

In the second example, I get SystemError
from the first call to add actually. I
thought it was from the second one, but
since I changed from my own exception
class to Exception it became hidden. 

When I remove the uninteded self parameter
of add, I get SystemError in the second
call, as I thought I was getting.

But you get TypeError.. hmmm..

I suppose I should download the CVS then?..
Or maybe wait for 2.2b2..

Sverker

ps. Here's my updated second example, anyway..

import weakref
ref = weakref.WeakKeyDictionary()

class MyError(Exception):
    pass

def add(self, x):
    raise MyError

def encapsulate():
    f = lambda : ()
    ref[f] = None
    return f

try:
    add(encapsulate())
except MyError: pass

print 'first add ok'

add(encapsulate())
# Here we would like to get Exception but gets SystemError
instead.



----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-11-06 08:33

Message:
Logged In: YES 
user_id=3066

I cannot reproduce this using either code snippet.  Note
that the second example should (and does for me) raise
TypeError when calling add() -- you left in a "self" that
appears to be unintended.

Are you using 2.2b1 or CVS python?

----------------------------------------------------------------------

Comment By: Sverker Nilsson (svenil)
Date: 2001-11-05 23:17

Message:
Logged In: YES 
user_id=356603

The same problem now occured in another situation. I am
enclosing the condensed program. /Sverker

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=478534&group_id=5470