[Python-Dev] Free threading and borrowing references from mutable types

Guido van Rossum guido@python.org
Tue, 11 Sep 2001 08:52:46 -0400


> Considering the free threading issue (again), I found that functions
> returning borrowed references are problematic if the container is
> mutable.
> 
> In traditional Python, extension modules could safely borrow
> references if they know that they maintain a reference to the
> container. If a thread switch is possible between getting the borrowed
> reference and using it, then this assumption is wrong: another thread
> may remove the reference from the container, so that the object dies.

Good point.  I hadn't thought of this, but it's definitely yet
another problem facing free threading.
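
The hazard described above can be sketched at the Python level.  This
is only an analogy, not C API code: a weak reference stands in for the
borrowed pointer because neither keeps the object alive.  The Item
class and variable names are made up for illustration, and the
immediate destruction relies on CPython's reference counting.

```python
import gc
import weakref


class Item:
    """Plain object used as the list element (int and str cannot be weakly referenced)."""


container = [Item()]

# Like a borrowed reference, the weak reference does not own the item.
borrowed = weakref.ref(container[0])
alive_before_removal = borrowed() is not None

# Stand-in for another thread mutating the container between
# "get the borrowed reference" and "use it".
del container[0]
gc.collect()  # not needed on CPython, harmless elsewhere

# The container held the only owning reference, so the object is gone.
alive_after_removal = borrowed() is not None
```

In C the stale pointer would simply dangle, so using it is undefined
behavior rather than a clean None.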

> Therefore, I propose to deprecate these functions. I'm willing to
> write a PEP elaborating on that if necessary, but I'd like to perform
> a quick poll beforehand
> - whether people think that deprecating these functions is reasonable
> - whether it is sufficient to only have their abstract.c equivalents,
>   or whether type-specific replacements that do return new references
>   are needed
> - what else I'm missing.

I'm personally not overly excited about free threading (Greg Stein
agrees that it slows down the single-threaded case and expects that it
will always remain optional).  Therefore I'm at best lukewarm about
this proposal.

But at a recent PythonLabs meeting, a very different motivation was
brought up to deprecate the type-specific APIs (all of them!): if
someone subclasses e.g. dictionary and overrides __getitem__, code
calling PyDict_GetItem on its instances can be considered wrong,
because it circumvents the additional processing in __getitem__ (where
e.g. case normalization or other forms of key mapping could affect the
outcome).  Because it returns a borrowed value, PyDict_GetItem can't
safely be fixed to check for this and call the __getitem__ slot.
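
A small Python sketch of the subclassing problem, assuming a
hypothetical CaseInsensitiveDict that normalizes keys.  Calling the
base-class dict.get directly plays the role of C code calling
PyDict_GetItem: it reads the underlying table and never sees the
override.

```python
class CaseInsensitiveDict(dict):
    """Hypothetical dict subclass that lower-cases keys on the way in and out."""

    def __setitem__(self, key, value):
        super().__setitem__(key.lower(), value)

    def __getitem__(self, key):
        return super().__getitem__(key.lower())


d = CaseInsensitiveDict()
d["Content-Type"] = "text/plain"  # stored under "content-type"

# Normal subscription goes through the overridden __getitem__.
via_override = d["CONTENT-TYPE"]

# The base-class lookup skips the override, so the key is not normalized
# and the item appears to be missing.
via_base = dict.get(d, "CONTENT-TYPE")
```

Here via_override is "text/plain" while via_base is None, even though
both look up the same key in the same object.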

Since there are many sensible uses of dictionary subclasses that don't
override __getitem__, I think it would be a shame to change
PyDict_Check() to only accept "real" dictionaries (not subclasses) --
this would disallow using dictionary subclasses for many interesting
situations.

> Specifically, I think the following functions are problematic:
> - PyList_GetItem, PyList_GET_ITEM,
> - PyDict_GetItem, PyDict_GetItemString
> 
> Any comments appreciated,

I believe that these APIs are still useful for more limited
situations.  E.g. if I write C code to implement some algorithm using
a dictionary, if I create the dictionary myself, and don't pass it on
to outside code, I can trust that it won't be mutated, so my use of
PyDict_GetItem is safe.

PyDict_GetItem is also unique in another respect: it doesn't raise an
exception when the item is not present.  This often saves a lot of
overhead in situations where a missing item simply means to try
something else, rather than a failure of the algorithm.  I think that
we may need an API with this property, even if it returns a new
reference when successful.

--Guido van Rossum (home page: http://www.python.org/~guido/)