[Python-Dev] PySet API

Raymond Hettinger raymond.hettinger at verizon.net
Sun Mar 26 19:24:19 CEST 2006


[Alex]
> Sure, accidentally mutating underlying iterables is a subtle (but alas
> frequent) bug, but I don't see why it should be any harsher when the loop is
> using a hypothetical PySet_Next than when it is using PyIter_Next -- whatever
> precautions the latter takes to detect the bug and raise an exception instead
> of crashing, wouldn't it be at least as feasible for PySet_Next to take
> similar precautions

The difference is that PySet_Next would return pointers to the table keys, and
the mutation occurs AFTER the call to PySet_Next, leaving those pointers aimed
at invalid addresses.  IOW, the function cannot detect the mutation.

PyIter_Next on the other hand returns an object (not a pointer to an object such 
as those in the hash table).  If the table has mutated before the function is 
called, then it simply raises an exception instead of returning an object.  If 
the table mutates afterwards, it is no big deal because the returned object is 
still valid.
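
To make the distinction concrete, here is a minimal sketch in plain C.  It is a
toy model, not CPython's implementation: Obj, Table, Iter and every function
name below are hypothetical stand-ins, and detecting mutation by comparing
table sizes is an assumption made for the example.  iter_next() follows the
PyIter_Next style: it checks for mutation before touching the table and hands
back a new reference, so a later table_clear() cannot invalidate the result.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy reference-counted object: a stand-in for PyObject, not the real type. */
typedef struct { int refcnt; long value; } Obj;

static Obj *obj_new(long v) {
    Obj *o = malloc(sizeof *o);
    if (o != NULL) { o->refcnt = 1; o->value = v; }
    return o;
}
static void obj_incref(Obj *o) { o->refcnt++; }
static void obj_decref(Obj *o) {
    assert(o->refcnt > 0);
    if (--o->refcnt == 0) free(o);
}

/* Toy table: each occupied slot owns one reference to its key. */
#define TABLE_CAP 8
typedef struct { Obj *slots[TABLE_CAP]; int used; } Table;

static int table_add(Table *t, Obj *o) {
    if (t->used == TABLE_CAP) return -1;
    obj_incref(o);
    t->slots[t->used++] = o;
    return 0;
}
static void table_clear(Table *t) {
    while (t->used > 0) obj_decref(t->slots[--t->used]);
}

/* Iterator that remembers the table size it started with (an assumption). */
typedef struct { Table *t; int pos; int used_at_start; } Iter;

static void iter_init(Iter *it, Table *t) {
    it->t = t; it->pos = 0; it->used_at_start = t->used;
}

/* Returns a NEW reference, or NULL at the end.  If the table changed size
   since iter_init, returns NULL and sets *error to 1. */
static Obj *iter_next(Iter *it, int *error) {
    *error = 0;
    if (it->used_at_start != it->t->used) { *error = 1; return NULL; }
    if (it->pos >= it->t->used) return NULL;
    Obj *o = it->t->slots[it->pos++];
    obj_incref(o);          /* the caller now co-owns the object */
    return o;
}

/* Scenario from the text: mutate the table AFTER fetching a key.
   Returns the key's value read after the mutation, or -1 on failure. */
static long demo_mutate_after_next(void) {
    Table t = { {0}, 0 };
    Iter it;
    int error;
    long result = -1;
    Obj *a = obj_new(10);
    if (a == NULL) return -1;
    if (table_add(&t, a) != 0) { obj_decref(a); return -1; }
    obj_decref(a);                      /* the table holds the only reference */

    iter_init(&it, &t);
    Obj *key = iter_next(&it, &error);  /* owned reference */
    if (key == NULL) { table_clear(&t); return -1; }
    table_clear(&t);                    /* mutation after the call */
    result = key->value;                /* still valid: we own a reference */
    if (iter_next(&it, &error) != NULL || error != 1) result = -1;
    obj_decref(key);
    return result;
}
```

A PySet_Next-style function in this toy model would instead hand back
t->slots[i] without the incref.  After table_clear() that pointer would refer
to freed memory, and nothing inside the function could have caught it, because
the mutation happens after the function has already returned.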

FWIW, here's an easier-to-understand example of the same ilk (taken from real
code):

   s = PyString_AS_STRING(item);
   Py_DECREF(item);
   if (s == NULL)
       break;
   x = strtol(s, &endptr, 10);

The problem, of course, is that the decref can render the string pointer
invalid: if it drops the last reference, the string object is freed and s
dangles.  The correct code decrefs in two places: inside the conditional,
before the break, and after the strtol() call.  This is at the core of the
issue.  I don't want the set iteration API to return pointers inside the table.
The PyIter_Next API takes a couple more lines but is easy to get correct and
has nice duck-typing properties.
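
A sketch of that corrected ordering, written as self-contained C so it can be
compiled without the Python headers.  Item, item_new, item_decref,
item_as_string and parse_item are hypothetical stand-ins for the string object,
Py_DECREF and PyString_AS_STRING; only the ordering of the calls is the point.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted string; stands in for a Python string. */
typedef struct { int refcnt; char *buf; } Item;

static Item *item_new(const char *text) {
    Item *it = malloc(sizeof *it);
    if (it == NULL) return NULL;
    it->buf = malloc(strlen(text) + 1);
    if (it->buf == NULL) { free(it); return NULL; }
    strcpy(it->buf, text);
    it->refcnt = 1;
    return it;
}

static void item_decref(Item *it) {
    assert(it->refcnt > 0);
    if (--it->refcnt == 0) { free(it->buf); free(it); }
}

/* Borrowed pointer into the item's buffer; valid only while the item lives. */
static const char *item_as_string(const Item *it) { return it->buf; }

/* Consumes the caller's reference to item.
   Returns 0 and stores the parsed number in *out, or -1 on failure. */
static int parse_item(Item *item, long *out) {
    const char *s = item_as_string(item);
    char *endptr;
    int ok;
    if (s == NULL) {
        item_decref(item);      /* release on the early-exit path */
        return -1;
    }
    *out = strtol(s, &endptr, 10);
    ok = (endptr != s);         /* inspect endptr while s is still valid */
    item_decref(item);          /* release only after the last use of s */
    return ok ? 0 : -1;
}
```

With an item built from "1234", parse_item() stores 1234 in *out and returns 0.
The borrowed pointer s is never read after the reference that keeps it alive
has been dropped.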

For dicts, the _next API (PyDict_Next) is worth the risk because it saves a
double lookup and because there are legitimate use cases for changing the
contents of the value field directly inside the hash table.  For sets, those
arguments don't apply.  We have a safe way that takes a couple more lines and a
proposed second-way-to-do-it that is dangerously attractive, yet somewhat
unsafe.  For that reason, I say no to PySet_Next().

Hopefully, as the module author and principal maintainer, I get some say in the 
matter.


Raymond


Nothing is more conducive to peace of mind than not having any opinions at all.
-- Georg Christoph Lichtenberg 


