[Python-Dev] RE: [Zope-Coders] core dump in Zope 2.7 test suite

Tue Sep 16 14:13:30 EDT 2003

[Tim]
> ...
> If some internal Unicode operation decided to allocate a short Unicode
> string, but freed it before filling in any of the string bytes, I
> suppose the keepalive optimization would retain a Unicode object with
> uninitialized str space on the unicode free list.  A subsequent
> _PyUnicode_New could grab that and try to boost its ->str size.  That
> could explain it.

And that turned out to be the case.  One example (there are more) is in
PyUnicode_DecodeASCII():  the local

    PyUnicodeObject *v;

gets initialized:

    v = _PyUnicode_New(size);

Suppose size is 1.  Suppose the string coming in is "\xc8".  The first
iteration of the loop sets an "ordinal not in range(128)" error, and jumps to
onError:

 onError:
    Py_XDECREF(v);

unicode_dealloc() then stuffs v on unicode_freelist, and because the length
"is small" (size == 1), v->str[] is retained, still holding uninitialized
heap trash.
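
To make the mechanism concrete, here is a toy model of a keepalive free list.
It is only a sketch and not the code in unicodeobject.c: toy_unicode, toy_new,
toy_dealloc and TOY_KEEPALIVE_LIMIT are all made-up names, the limit of 9 is an
assumption, and the buffer holds plain ints.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical threshold below which a freed object keeps its buffer. */
#define TOY_KEEPALIVE_LIMIT 9

typedef struct toy_unicode {
    int *str;                   /* character buffer, length + 1 slots */
    size_t length;
    struct toy_unicode *next;   /* free-list link */
} toy_unicode;

static toy_unicode *toy_freelist = NULL;

/* Allocate an object, reusing a free-list entry when one is available. */
static toy_unicode *toy_new(size_t length)
{
    toy_unicode *u = toy_freelist;
    int *buf;

    if (u != NULL) {
        toy_freelist = u->next;         /* pop; u->str may be a kept buffer */
    } else {
        u = malloc(sizeof *u);
        if (u == NULL)
            return NULL;
        u->str = NULL;
    }
    /* realloc(NULL, n) behaves like malloc(n).  For a kept buffer the old
       contents survive the resize: nothing here clears them. */
    buf = realloc(u->str, (length + 1) * sizeof *buf);
    if (buf == NULL) {
        free(u->str);
        free(u);
        return NULL;
    }
    u->str = buf;
    u->length = length;
    u->next = NULL;
    return u;
}

/* Put the object on the free list; short buffers are retained as-is. */
static void toy_dealloc(toy_unicode *u)
{
    if (u->length >= TOY_KEEPALIVE_LIMIT) {
        free(u->str);
        u->str = NULL;
        u->length = 0;
    }
    u->next = toy_freelist;
    toy_freelist = u;
}
```

Whatever sat in str[0] when a short object was freed is still there when the
next toy_new() hands the same object back, so any code that inspects str[0]
before writing to it is looking at leftovers.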

A later _PyUnicode_New() grabs this off unicode_freelist, decides to boost
the str space via unicode_resize(), and the latter blows up in the

    unicode_latin1[unicode->str[0]] == unicode)) {

check because unicode->str[0] happens to be gigantically negative on
Jeremy's box, and the preceding

    unicode->str[0] < 256 &&

check is too weak.  Or is there an implicit assumption that Py_UNICODE is
always an unsigned type?  If so, why isn't the literal 256U?  And in that
case, the assumption doesn't seem to hold on Jeremy's box.
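
A small sketch of why an upper-bound-only test is not enough when the character
type is signed.  The function names are made up, and long is just a stand-in
for a signed Py_UNICODE; I'm not claiming that is the actual type on Jeremy's
box.

```c
/* Size of a hypothetical 256-entry lookup table, like unicode_latin1[]. */
#define TOY_TABLE_SIZE 256

/* Upper bound only: a negative value passes and would index before the
   start of the table. */
static int toy_index_ok_weak(long ch)
{
    return ch < TOY_TABLE_SIZE;
}

/* Both bounds checked explicitly. */
static int toy_index_ok_strict(long ch)
{
    return ch >= 0 && ch < TOY_TABLE_SIZE;
}

/* Same effect with one comparison: converting a negative value to an
   unsigned type yields a huge number, which fails the upper bound. */
static int toy_index_ok_unsigned(long ch)
{
    return (unsigned long)ch < (unsigned long)TOY_TABLE_SIZE;
}
```

Either of the last two forms rejects a "gigantically negative" str[0] before it
is used as an index; the first one lets it through.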