[Patches] [ python-Patches-1020188 ] Use Py_CLEAR where necessary to avoid crashes

SourceForge.net noreply at sourceforge.net
Thu Feb 15 11:14:44 CET 2007


Patches item #1020188, was opened at 2004-09-01 07:35
Message generated for change (Comment added) made by loewis
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1020188&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Core (C code)
Group: None
Status: Open
Resolution: Accepted
Priority: 5
Private: No
Submitted By: Dima Dorfman (ddorfman)
Assigned to: Michael Hudson (mwh)
Summary: Use Py_CLEAR where necessary to avoid crashes

Initial Comment:
The Py_CLEAR macro was introduced to make it less tempting
to write incorrect code in the form

  Py_DECREF(self->field);
  self->field = NULL;

because that can expose a half-destroyed object if
deallocation can cause self->field to be read. This patch
fixes mistakes like this that still exist in the core.

Only cases that are potentially dangerous are fixed in this
patch. If self->field can only reference a special kind of
[builtin] object, then it's just a style bug because we know
that the builtin object isn't evil. Likewise if the code is
operating on an automatic variable. It might be nice to fix
those style bugs anyway, to discourage the broken idiom.


Just for kicks, here's a way to use this bug in reversed to
crash the interpreter:

  import array, gc, weakref
  a = array.array('c')
  wr = weakref.ref(a, lambda _: gc.get_referents(rev))
  rev = reversed(a)
  del a
  list(rev)

For me, this crashes immediately with a debug build and
after a couple tries otherwise.


Also attached is clearcand.py to help find these cases. It's
not comprehensive, but it's a start. Usage:

  $ find src -name '*.c' | python clearcand.py | fgrep -- '->'


The patch requires SF #1020185 to be applied for genobject.c
to compile without warnings.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2007-02-15 11:14

Message:
Logged In: YES 
user_id=21627
Originator: NO

What's the status of this? Does anybody want to incorporate the test
script?

There are still a few cases left (or readded since the patch was
originally made), in Modules (_bsddb, _sqlite, sv, _CarbonEvt), and in
unicodeobject.

mwh, if you lost interest, please unassign.

----------------------------------------------------------------------

Comment By: Armin Rigo (arigo)
Date: 2004-09-27 20:50

Message:
Logged In: YES 
user_id=4771

The grep may miss other expressions, like Py_DECREF(items[index]).  There
is also the dreadful case of macros: Py_DECREF(x) used in a macro which is
called with self->something as x.  In other words I am not sure the script
can replace a (really motivated) programmer's extensive review...

----------------------------------------------------------------------

Comment By: Dima Dorfman (ddorfman)
Date: 2004-09-10 16:29

Message:
Logged In: YES 
user_id=908995

Okay, I'll post a cleaned up script this weekend. About your
improvements:

1. Py_CLEAR already has the same semantics as a would-be
   Py_XCLEAR--it doesn't do anything if its argument is
   null. Py_XDECREF should definitely be checked for,
   though.

2. Py_DECREF(x) is probably safe; this is what I meant when
   I talked about operating on automatic variables in the
   initial comment. The grep for "->" filters out matches
   that are safe in this manner. It should be noted that
   this isn't necessarily safe, just that it's almost
   certainly so. In particular, it's not safe in this case:

     PyObject *x = ...;
     /* self is a struct { PyObject **something; ... } */
     self->something = &x;
     Py_DECREF(x);
     x = NULL;

   but doing that is just insane.

   The grep would also miss ((*self).something), but I
   didn't expect to see any of those.

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-09-10 16:09

Message:
Logged In: YES 
user_id=6656

Yeah, I've come to a similar conclusion (finally got round
to thinking about this properly this morning).  I think a
cleaned up script would be a good idea.

Two improvements (which you may well have made already):
Py_XDECREF is vulnerable too (do we want Py_XCLEAR??) and
Py_DECREF(x); x == NULL is ok AFAICT when x is an identifier.

----------------------------------------------------------------------

Comment By: Dima Dorfman (ddorfman)
Date: 2004-09-10 15:52

Message:
Logged In: YES 
user_id=908995

I didn't write specific tests for these since they really
would be testing for one particular line of code. It's not
really hard to write them, though; I think my sample using
reversed can be easily adapted to iterobject and itertools.
I'm not sure whether genobject can be exploited like this,
and I don't think it's worth finding out just to write a
test case.

On the other hand, if you (anyone) think it's worth it, I'll
clean up clearcand.py not to require a pipeline and it can
be added to the toolbox. For that to make sense, though, we
should probably fix the style bugs that it catches so it can
be reasonable to expect no output. Here are the outstanding
matches that dereference something using "->":

src/Modules/_bsddb.c 746 'Py_DECREF(self->myenvobj);\nself->myenvobj = 
NULL;'
src/Modules/_bsddb.c 781 'Py_DECREF(self->myenvobj);\nself->myenvobj = 
NULL;'
src/Modules/_bsddb.c 786 'Py_DECREF(self->associateCallback);
\nself->associateCallback = NULL;'
src/Modules/_bsddb.c 1184 'Py_DECREF(self->associateCallback);
\nself->associateCallback = NULL;'
src/Modules/svmodule.c 281 'Py_DECREF(self->ob_svideo);
\nself->ob_svideo = NULL;'
src/Mac/Modules/carbonevt/_CarbonEvtmodule.c 1060 
'Py_DECREF(_self->ob_callback);\n_self->ob_callback = NULL;'
src/Objects/unicodeobject.c 161 'Py_DECREF(unicode->defenc);
\nunicode->defenc = NULL;'
src/Objects/unicodeobject.c 250 'Py_DECREF(unicode->defenc);
\nunicode->defenc = NULL;'

unicode and sv are safe, and bsddb is on my list to look at.
It shouldn't be harmful to just change all of those to
Py_CLEAR, though. I can cobble up a patch for that, too.

Opinions?

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-09-08 19:33

Message:
Logged In: YES 
user_id=80475

FWIW, rather than add specific testcases (closing the 
barndoor after the horse has escaped), it might be a good 
idea to fixup the OP's search code and add it to the toolbox.

Future errors of this type are better caught through code 
scans than by testing stuff that already works fine.

Of course, if highly motivated, you can do both :-)

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-09-01 14:32

Message:
Logged In: YES 
user_id=6656

Test cases would be nice.  If it's easy for someone whose thought 
about it, that would be great, if not assign to me and I'll get to it...

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-09-01 09:06

Message:
Logged In: YES 
user_id=80475

Accepted and applied.
See:
    Include/object.h 2.129
    Modules/itertoolsmodule.c 1.35
    Objects/enumobject.c 1.18
    Objects/genobject.c 1.4
    Objects/iterobject.c 1.19

Thanks for the patch.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1020188&group_id=5470


More information about the Patches mailing list