[New-bugs-announce] [issue43468] functools.cached_property locking is plain wrong.

Antti Haapala report at bugs.python.org
Thu Mar 11 01:37:47 EST 2021


New submission from Antti Haapala <antti at haapala.name>:

The locking in functools.cached_property (https://github.com/python/cpython/blob/87f649a409da9d99682e78a55a83fc43225a8729/Lib/functools.py#L934) as written is completely undesirable for I/O-bound values and parallel processing. Instead of merely serializing the computation of a cached property on the *same* instance across threads, it blocks parallel computation of cached values on *distinct instances* of the same class, because the lock is stored on the descriptor and is therefore shared by every instance.

Here's the code of __get__ in cached_property:

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if self.attrname is None:
            raise TypeError(
                "Cannot use cached_property instance without calling __set_name__ on it.")
        try:
            cache = instance.__dict__
        except AttributeError:  # not all objects have __dict__ (e.g. class defines slots)
            msg = (
                f"No '__dict__' attribute on {type(instance).__name__!r} "
                f"instance to cache {self.attrname!r} property."
            )
            raise TypeError(msg) from None
        val = cache.get(self.attrname, _NOT_FOUND)
        if val is _NOT_FOUND:
            with self.lock:
                # check if another thread filled cache while we awaited lock
                val = cache.get(self.attrname, _NOT_FOUND)
                if val is _NOT_FOUND:
                    val = self.func(instance)
                    try:
                        cache[self.attrname] = val
                    except TypeError:
                        msg = (
                            f"The '__dict__' attribute on {type(instance).__name__!r} instance "
                            f"does not support item assignment for caching {self.attrname!r} property."
                        )
                        raise TypeError(msg) from None
        return val
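
The lock is created once in the descriptor's __init__ (roughly as below), which is why every instance of the owning class ends up contending on the same RLock:

    def __init__(self, func):
        self.func = func
        self.attrname = None
        self.__doc__ = func.__doc__
        self.lock = RLock()   # one lock per descriptor, shared by all instances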


I noticed this because I was recommending that the Pyramid web framework deprecate its much simpler [`reify`](https://docs.pylonsproject.org/projects/pyramid/en/latest/_modules/pyramid/decorator.html#reify) decorator in favour of `cached_property`, and then noticed why that won't do.
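
For comparison, `reify` is essentially a non-data descriptor that computes the value once and stores it on the instance, with no locking at all. A simplified sketch in the spirit of the Pyramid decorator (not the exact Pyramid source):

from functools import update_wrapper


class reify:
    """Non-data descriptor: the cached attribute set on the instance
    shadows the descriptor on all subsequent lookups."""

    def __init__(self, wrapped):
        self.wrapped = wrapped
        update_wrapper(self, wrapped)

    def __get__(self, inst, objtype=None):
        if inst is None:
            return self
        val = self.wrapped(inst)
        # No lock: concurrent first accesses on the *same* instance may
        # compute the value more than once, but distinct instances never
        # block each other.
        setattr(inst, self.wrapped.__name__, val)
        return val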


Here is the test case for cached_property:

from functools import cached_property
from threading import Thread
from random import randint
import time



class Spam:
    @cached_property
    def ham(self):
        print(f'Calculating amount of ham in {self}')
        time.sleep(10)
        return randint(0, 100)


def bacon():
    spam = Spam()
    print(f'The amount of ham in {spam} is {spam.ham}')


start = time.time()
threads = []
for _ in range(3):
    t = Thread(target=bacon)
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print(f'Total running time was {time.time() - start}')


With cached_property the output is:

Calculating amount of ham in <__main__.Spam object at 0x7fa50bcaa220>
The amount of ham in <__main__.Spam object at 0x7fa50bcaa220> is 97
Calculating amount of ham in <__main__.Spam object at 0x7fa50bcaa4f0>
The amount of ham in <__main__.Spam object at 0x7fa50bcaa4f0> is 8
Calculating amount of ham in <__main__.Spam object at 0x7fa50bcaa7c0>
The amount of ham in <__main__.Spam object at 0x7fa50bcaa7c0> is 53
Total running time was 30.02147102355957


The total runtime is 30 seconds; with `pyramid.decorator.reify` in place of `cached_property` it is about 10 seconds:

Calculating amount of ham in <__main__.Spam object at 0x7fc4d8272430>
Calculating amount of ham in <__main__.Spam object at 0x7fc4d82726d0>
Calculating amount of ham in <__main__.Spam object at 0x7fc4d8272970>
The amount of ham in <__main__.Spam object at 0x7fc4d82726d0> is 94
The amount of ham in <__main__.Spam object at 0x7fc4d8272970> is 29
The amount of ham in <__main__.Spam object at 0x7fc4d8272430> is 93
Total running time was 10.010624170303345

`reify` is used heavily in Pyramid to add lazy properties to incoming HTTP request objects - using `functools.cached_property` instead would mean that independent request threads block each other, because most of them would be fetching the value of the same lazy property through the same descriptor instance and therefore acquiring the same lock.
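
For illustration only (not a patch proposed here), the contention disappears if the lock is scoped to the individual instance and attribute rather than to the descriptor. A rough sketch of that idea, with the obvious simplifications noted in the comments:

from threading import RLock

_NOT_FOUND = object()


class per_instance_cached_property:
    """Sketch of a cached_property variant whose lock covers only one
    (instance, attribute) pair, so distinct instances never block each other."""

    def __init__(self, func):
        self.func = func
        self.attrname = None
        self.__doc__ = func.__doc__
        self._meta_lock = RLock()    # guards creation of per-instance locks only
        self._instance_locks = {}    # id(instance) -> RLock; simplified and leaks
                                     # entries (a real version would need weakrefs)

    def __set_name__(self, owner, name):
        self.attrname = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        cache = instance.__dict__
        val = cache.get(self.attrname, _NOT_FOUND)
        if val is _NOT_FOUND:
            with self._meta_lock:
                lock = self._instance_locks.setdefault(id(instance), RLock())
            with lock:
                # re-check under the per-instance lock
                val = cache.get(self.attrname, _NOT_FOUND)
                if val is _NOT_FOUND:
                    val = self.func(instance)
                    cache[self.attrname] = val
        return val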

----------
components: Library (Lib)
messages: 388480
nosy: ztane
priority: normal
severity: normal
status: open
title: functools.cached_property locking is plain wrong.
type: resource usage
versions: Python 3.10, Python 3.8, Python 3.9

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue43468>
_______________________________________

