[Python-Dev] Worse news

Mon, 22 Jan 2001 02:54:32 -0500

I still don't have a clue about test_sax, but have stumbled into more
failure modes.  Most of them seem related to the SystemError ("'finally'
pops bad exception").  Around that part of ceval.c, sometimes the v popped
off the stack has a NULL type pointer, other times it's a pointer to a
damaged PyTuple_Type (for example, with a tp_dealloc field of 0x61, which
leads to an illegal instruction exception).

The MS debug heap routines fill all newly malloc'ed memory with 0xcd ("clean
landfill"), fill free'ed memory with 0xdd ("dead landfill"), and *pad*
malloc'ed memory with some number of 0xfd bytes on both sides ("no-man's
land").  The clean landfill and no-man's land patterns are showing up more
often than they should "by chance", and especially in high-order bytes.  Just
more evidence of the obvious:  something is really screwed up <wink>.
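
To make the fill-pattern check concrete, here is a minimal Python sketch that labels each byte of a suspicious word (a pointer or field value read off the debugger) against the three fill bytes named above.  The names FILL_BYTES, classify_word and looks_like_fill are hypothetical, and the 4-byte default width is an assumption (a 32-bit build); nothing here comes from the MS runtime itself.

```python
# Fill bytes used by the MS debug heap, as listed in the paragraph above.
FILL_BYTES = {
    0xCD: "clean landfill",   # newly malloc'ed memory
    0xDD: "dead landfill",    # free'ed memory
    0xFD: "no-man's land",    # padding on both sides of a block
}


def classify_word(value, width=4):
    """Label each byte of value, high-order byte first.

    Bytes that are not one of the fill bytes are reported as None.
    width=4 assumes a 32-bit word (an assumption, not from the post).
    """
    raw = value.to_bytes(width, "big")
    return [FILL_BYTES.get(b) for b in raw]


def looks_like_fill(value, width=4):
    """True if the high-order byte of value is one of the fill bytes."""
    return classify_word(value, width)[0] is not None
```

A single match proves nothing, since any byte value can turn up by chance; what matters is how often the patterns appear across many damaged values.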

I cannot get the subtest that test_sax is calling (test_expat_incomplete) to
fail in isolation.

Next headache:  If I delete all .pyc files from Lib/ and Lib/test/, and then
run:

python ../lib/test/regrtest.py -x test_sax

by hand, all the 98 tests that *should* run on Windows (excluding, of
course, test_sax, which is no longer tried) pass.  If I immediately run them
again (without deleting .pyc) by hand:

python ../lib/test/regrtest.py -x test_sax

then they again pass.  However, if I do

rt -x test_sax

which does exactly those steps (delete .pyc, run regrtest excluding test_sax,
run regrtest again) via the little MS batch file rt.bat, then on the second
time thru regrtest, and 5 times out of 5, it died in test_extcall with an
"illegal operation", while executing

		if (TYPE(c) == DOUBLESTAR) {

near the end of symtable_params in compile.c.  This is an optimized build,
and the debugger has no idea what's in c at this point; to judge from the
offending machine instruction and register contents, though, c is a bad
pointer.
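
For anyone who wants to repeat the sequence without rt.bat, the three steps described above (delete .pyc, run regrtest excluding test_sax, run it again) can be sketched in Python.  This is only an approximation under stated assumptions: the contents of rt.bat are not reproduced here, the helper names delete_pyc, regrtest_command and run_twice are hypothetical, and the .pyc files are assumed to sit next to their .py sources in Lib/ and Lib/test/.

```python
import os
import subprocess
import sys


def delete_pyc(*dirs):
    """Delete .pyc files directly inside each directory; return removed paths."""
    removed = []
    for d in dirs:
        for name in sorted(os.listdir(d)):
            if name.endswith(".pyc"):
                path = os.path.join(d, name)
                os.remove(path)
                removed.append(path)
    return removed


def regrtest_command(regrtest, exclude):
    """Build the command line: <python> <regrtest.py> -x <excluded tests>."""
    return [sys.executable, regrtest, "-x"] + list(exclude)


def run_twice(lib_dir, exclude=("test_sax",)):
    """Delete .pyc files once, then run regrtest twice back to back.

    lib_dir is the Lib/ directory (assumed layout: Lib/test/regrtest.py).
    Returns the two exit codes.
    """
    test_dir = os.path.join(lib_dir, "test")
    delete_pyc(lib_dir, test_dir)
    cmd = regrtest_command(os.path.join(test_dir, "regrtest.py"), exclude)
    return [subprocess.call(cmd) for _ in range(2)]
```

The first run compiles everything from source and the second loads the freshly written .pyc files, which is the only obvious difference between the two passes.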

Have not been able to get test_extcall to fail in isolation.

Have also been unable to get test_extcall to fail in the debug build.

So there's evidence of Deep Rot beyond test_sax, but test_sax remains the
only test that fails every time and under both build types.

Running regrtest with -r (randomize test order) is also "interesting":
first time I tried that, test_cpickle failed (truncated output) as well as
test_sax.

I doubt anyone has run the tests more often than me over the last week, so
I'm not surprised I'm seeing the most problems.  However, since *nobody* is
seeing anything on Linux, I'd at least like to get *someone* else to run the
tests on Windows.  While I'm not having any unusual problems with my box,
it's certainly possible that I've got a corrupted file or a flaky memory
chip etc, or that MSVC is generating bad code for some recent change
(although that's unlikely since the debug build generates *really*
straightforward code).

Deleting my entire PCbuild subtree and refetching it from CVS didn't make
any difference.