[Python-Dev] Showstopper
Tim Peters
tim.one@home.com
Sun, 15 Apr 2001 15:11:30 -0400
We've got a showstopper bug involving gc and dicts. It's not brand new, but
was probably introduced when we fiddled PyDict_Next() to stop the dict
resizing problems plaguing Jeremy.
Cut to the chase:
1. dict_items() is called. The dict has 22 of 32 slots filled.
2. PyList_New(22) creates the result list.
3. dict_items() loops through the dict looking for active entries.
It does *not* use PyDict_Next, but a loop of the form
for (i = 0, j = 0; i < mp->ma_size; i++) {
4. When it finds an active entry, it calls PyTuple_New(2) to
create the entry's (key, value) result tuple.
5. At the end, PyTuple_New() calls PyObject_GC_Init(), which
calls _PyGC_Insert().
6. _PyGC_Insert() decides it's time for a collection.
7. The dict dict_items() is iterating over (remember step #1 <wink>?)
is one of the objects gc traverses. gc dict traversal *does* use
PyDict_Next.
8. 22 of 32 slots filled is exactly at the dict resize point, so
the PyDict_Next() invoked by gc traversal grows the dict.
9. So, back to step #1, dict_item()'s
for (i = 0, j = 0; i < mp->ma_size; i++) {
loop now goes up to 64 instead of the original 32, and, because
of the internal dict reshuffling, *can* (depending on the exact
data content of the dict) see values again that it already
saw before the dict got rearranged.
10. As a result, the later
PyList_SetItem(v, j, item);
inside the loop can eventually pass values for j too large for
the list.
11. PyList_SetItem() sets a "list assignment index out of range"
error string, but nobody pays ttention to that, and dict_items()
proceeds to trigger the error multiple times.
12. The value returned by dict_items() is wrong, containing
some duplicate (key, value) pairs, and missing others.
13. The eval loop goes around a few more times, until it finally
hits a bytecode that notices the pending exception. Then (I
finally got lucky here!) IndexError finally pops up on a line
where an IndexError doesn't make sense.
I have a test case that reliably triggers this bug, but was unable to whittle
it below 100+ Kb of code + data files. So trust me about the details above
(they're clear enough!).