[Python-Dev] Regression in unicodestr.encode()?

Thu, 11 Apr 2002 02:06:19 -0400

[Tim]
> ...
> If you run Barry's test under a debug build, a call to pymalloc's
> realloc complains immediately upon entry that the passed-in address
> suffered overwrites ...

This deserves emphasizing because the debug pymalloc is new:  we've had
two(!) memory corruption problems since it became available, and the
debug malloc was a real help both times.

This particular case was a best case:  the debug realloc detected the
corruption almost immediately after the overwrite occurred, and called
Py_FatalError() after printing some helpful clues.  But note that the
"serial number" it printed was insane:

    the block was made by call #1852047475 to debug malloc/realloc

That's because the overwrite was *so* bad it corrupted bytes beyond the end
of the 4 trailing "forbidden bytes", and that's where the serial number is
stored by the debug pymalloc.  We'll all be much happier if you stick to
modest off-by-1 fatal errors in the future <wink>.  Note that it can catch
off-by-1 on the nose:  if you ask for 37 bytes, it can catch you writing
into p[37] (alignment isn't an issue for this gimmick).
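
To make the forbidden-bytes idea concrete, here is a toy sketch of the
technique.  It is not the real pymalloc source:  the toy_* names, the
16-byte HEADER, and the 0xFB pad value are all assumptions made for
illustration, and the block layout is only modeled on the description
above (pad bytes on each side of the user memory, serial number after the
trailing pad).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy illustration only; this is NOT the real pymalloc source.
 * Assumed layout of one raw block:
 *   [HEADER: size, ..., 4 pad bytes][user bytes][4 pad bytes][serial]
 */
#define HEADER    16    /* assumed: room for a size_t plus PAD, keeps alignment */
#define PAD       4     /* the "forbidden bytes" on each side */
#define FORBIDDEN 0xFB  /* assumed fill value for the pad bytes */

static unsigned long toy_serial = 0;  /* counts calls to toy_debug_malloc */

/* Allocate nbytes of user memory wrapped in pad bytes and a serial number.
 * (Size overflow is ignored in this sketch.) */
void *toy_debug_malloc(size_t nbytes)
{
    unsigned char *raw = malloc(HEADER + nbytes + PAD + sizeof toy_serial);
    unsigned char *p;
    if (raw == NULL)
        return NULL;
    memset(raw, 0, HEADER);
    memcpy(raw, &nbytes, sizeof nbytes);         /* remember requested size */
    memset(raw + HEADER - PAD, FORBIDDEN, PAD);  /* leading pad */
    p = raw + HEADER;
    memset(p + nbytes, FORBIDDEN, PAD);          /* trailing pad starts at p[nbytes] */
    ++toy_serial;
    memcpy(p + nbytes + PAD, &toy_serial, sizeof toy_serial);
    return p;
}

/* Return 0 if both pads are intact, -1 if the leading pad was damaged,
 * 1 if the trailing pad was damaged. */
int toy_debug_check(const void *ptr)
{
    const unsigned char *p = ptr;
    size_t nbytes;
    int i;
    memcpy(&nbytes, p - HEADER, sizeof nbytes);
    for (i = 1; i <= PAD; i++)
        if (p[-i] != FORBIDDEN)
            return -1;
    for (i = 0; i < PAD; i++)
        if (p[nbytes + i] != FORBIDDEN)
            return 1;
    return 0;
}

/* Read back the serial number stored after the trailing pad. */
unsigned long toy_debug_serial(const void *ptr)
{
    const unsigned char *p = ptr;
    size_t nbytes;
    unsigned long serial;
    memcpy(&nbytes, p - HEADER, sizeof nbytes);
    memcpy(&serial, p + nbytes + PAD, sizeof serial);
    return serial;
}

void toy_debug_free(void *ptr)
{
    free((unsigned char *)ptr - HEADER);
}

/* Walk through the two cases discussed above. */
void toy_demo(void)
{
    unsigned char *p = toy_debug_malloc(37);
    unsigned long made_by;
    assert(p != NULL);
    made_by = toy_debug_serial(p);
    p[36] = 'x';                      /* last legal byte: pads untouched */
    assert(toy_debug_check(p) == 0);
    p[37] = 'x';                      /* off by one: lands in the trailing pad */
    assert(toy_debug_check(p) == 1);
    /* A wild overwrite runs past the pad and trashes the serial number too. */
    memset(p, 'n', 37 + PAD + sizeof made_by);
    assert(toy_debug_serial(p) != made_by);
    toy_debug_free(p);
}
```

Because the trailing pad begins at p[nbytes] rather than at the next
aligned address, even a one-byte overrun lands in it.  The last step of
toy_demo() shows why a large overwrite makes the reported serial number
garbage:  the serial lives just past the pad, so it gets clobbered along
with everything else.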

The other case was the gc-versus-trashcan disaster.  The debug pymalloc
didn't catch the corruption directly, but, when things blew up, it was dead
obvious in the debugger that gc was crawling over an already-free()ed object
(the object fields were entirely filled with pymalloc's "dead byte" value,
0xdb, which the debug pymalloc free() sprays into the released memory
block).
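
The dead-byte trick can be sketched the same way.  Again this is a toy
under stated assumptions, not pymalloc itself:  toy_object, the one-slot
"arena", and the toy_* functions are hypothetical, and the slot is static
only so that the released memory stays legal to inspect.  The 0xDB value
is the one mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEADBYTE 0xDB   /* value sprayed into released memory */

/* Hypothetical stand-in for an object header. */
typedef struct {
    long refcount;
    void *type;
} toy_object;

/* One-slot "arena":  static storage, so it remains readable after release. */
static toy_object toy_slot;

toy_object *toy_alloc(void)
{
    memset(&toy_slot, 0, sizeof toy_slot);
    toy_slot.refcount = 1;
    return &toy_slot;
}

/* Debug-style release:  overwrite every byte of the object with DEADBYTE. */
void toy_release(toy_object *ob)
{
    memset(ob, DEADBYTE, sizeof *ob);
}

/* Return 1 if all n bytes at p hold DEADBYTE, which is what a stale
 * pointer to released memory looks like in the debugger; else 0. */
int toy_looks_freed(const void *p, size_t n)
{
    const unsigned char *bytes = p;
    size_t i;
    for (i = 0; i < n; i++)
        if (bytes[i] != DEADBYTE)
            return 0;
    return 1;
}

void toy_dead_demo(void)
{
    toy_object *ob = toy_alloc();
    assert(!toy_looks_freed(ob, sizeof *ob));   /* live object */
    toy_release(ob);
    assert(toy_looks_freed(ob, sizeof *ob));    /* stale pointer sees 0xDB... */
}
```

Nothing here detects the use-after-free at the moment it happens.  The
fill just makes the evidence unmistakable afterwards:  every field of the
object reads as 0xdbdb..., which no live object would plausibly contain.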

So this is a powerful low-tech tool.  If you want to become a wizard at it
fast, deliberately provoke some object memory management errors in your
source tree, and just play with what happens then in a debug build.