
Thread 2 wakes up with the lock, calls the function, fills the cache, and releases the lock.
What exactly would the issue be with this:

```
import functools
from threading import Lock

def once(func):
    sentinel = object()
    cache = sentinel
    lock = Lock()

    @functools.wraps(func)
    def _wrapper():
        nonlocal cache
        if cache is sentinel:
            with lock:
                if cache is sentinel:
                    cache = func()
        return cache

    return _wrapper
```

It ensures it’s invoked once, it’s fast, and it doesn’t require acquiring a lock needlessly. Always wrapping a lock around a static value to account for a small edge case seems redundant, and more than “it’s usually not optimal”: it’s never optimal. In multi-threaded cases you’re causing needless contention, and in single-threaded cases all your calls are now slower. Seems generally more correct, even in single threaded cases, to pay the overhead only in the first call if you want `call_once` semantics. Which is why you would be using `call_once` in the first place?

This recipe is the best variant so far and gives us something concrete to talk about :-)

Benefits: Guarantees the wrapped function is not called more than once.

Restrictions: Only works with zero-argument functions.

Risks: Any reentrancy or recursion will result in deadlock.

Limitations: No instrumentation. No ability to reset or clear. Won't work across multiple processes.

It would be nice to look at some compelling use cases. Offhand, I can't think of a time when I would have used this decorator. Also, I have a nagging worry that holding a non-reentrant lock across an arbitrary user-defined function call is a recipe for deadlocks. That's why during code reviews we typically check every single use of Lock() to see if it should have been an RLock(), especially in big systems where GC, __del__, or weakref callbacks can trigger running any code at just about any time.

Raymond

Risks: Any reentrancy or recursion will result in deadlock.
Reentrancy is a great point that I didn’t consider. I would say that, as the intended use case for this is returning lazy singletons of some kind, reentrant calls would be a bug that would end with a recursion error, which is infinitely better and more debuggable than a deadlock. So yes, this should be an `RLock` instead.
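A minimal sketch of that change, assuming the snippet from earlier in the thread: the only difference is swapping `Lock` for `RLock`, so a reentrant call re-acquires the lock, recurses, and eventually raises `RecursionError` instead of blocking on itself.

```
import functools
from threading import RLock

def once(func):
    sentinel = object()
    cache = sentinel
    lock = RLock()  # reentrant: the owning thread can re-acquire it

    @functools.wraps(func)
    def _wrapper():
        nonlocal cache
        if cache is sentinel:
            with lock:
                if cache is sentinel:
                    # If func() re-enters _wrapper, cache is still the
                    # sentinel, so this recurses until RecursionError.
                    cache = func()
        return cache

    return _wrapper
```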
Limitations: No instrumentation. No ability to reset or clear. Won't work across multiple processes.
I get the feeling that you might be about to make the point that this is somewhat re-inventing the `lru_cache` wheel, and I honestly agree with you. Given that the `lru_cache()` overhead with no arguments is so small, would you accept a PR that defaults `maxsize` to `None` if the wrapped function accepts no arguments, along with a documentation change showing that you can use `lru_cache` as a replacement for `call_once`? Or, we could add the ability to clear and add stats pretty easily. Ugly implementation:

```
def once(func):
    sentinel = object()
    cache = sentinel
    lock = Lock()
    hit_count = 0

    @functools.wraps(func)
    def _wrapper():
        nonlocal cache, hit_count
        if cache is sentinel:
            with lock:
                if cache is sentinel:
                    cache = func()
                else:
                    hit_count += 1
        else:
            hit_count += 1
        return cache

    def clear():
        nonlocal cache
        cache = sentinel

    def stats():
        return hit_count

    _wrapper.clear = clear
    _wrapper.stats = stats
    return _wrapper
```
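For comparison, a sketch of how `lru_cache` already covers the reset and stats part for a zero-argument function; `cache_info()` and `cache_clear()` are part of its documented API. One caveat that matters for this thread: `lru_cache` does not hold a lock around the wrapped function, so two threads racing on the very first call can both invoke it.

```
import functools

@functools.lru_cache(maxsize=None)
def get_config():
    print("expensive setup")
    return {"answer": 42}

get_config()                    # miss: runs the body
get_config()                    # hit: served from the cache
print(get_config.cache_info())  # CacheInfo(hits=1, misses=1, maxsize=None, currsize=1)
get_config.cache_clear()        # reset: the next call runs the body again
```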
It would be nice to look at some compelling use cases. Offhand, I can't think of a time when I would have used this decorator. Also, I have a nagging worry that holding a non-reentrant lock across an arbitrary user-defined function call is a recipe for deadlocks. That's why during code reviews we typically check every single use of Lock() to see if it should have been an RLock(), especially in big systems where GC, __del__, or weakref callbacks can trigger running any code at just about any time.
There are two specific use cases that I’ve seen:

1. You want a module-level singleton that creates an object that depends on something that’s not present yet. In its simplest form that might involve circular dependencies:

```
@call_once()
def create_client():
    # some_module imports the outer module
    from some_module import something
    something(…)
```

In the case of Django <https://github.com/django/django/blob/77aa74cb70dd85497dbade6bc0f394aa41e88c...>, the “something that’s not present yet” is the Django settings:

```
@functools.lru_cache()
def get_default_renderer():
    renderer_class = import_string(settings.FORM_RENDERER)
    return renderer_class()
```

We don’t want to needlessly re-create the default form renderer, and we cannot create this instance at the module level.

2. You want to create a singleton instance that is lazily created and can be easily mocked. For example, a `redis` client:

```
@call_once()
def get_redis_connection():
    return redis.Redis(host='localhost', port=6379, db=0)
```

You don’t want to create this at the module level, as the class might well start creating connections to Redis when it’s instantiated. On top of that, other modules can do `from redis import get_redis_connection` rather than `from redis import redis_connection`, which is easier and more consistent to mock.
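To make the mocking point concrete, a hypothetical test using `unittest.mock.patch`; the module path `myapp.cache` is illustrative, not from the thread.

```
from unittest import mock

def test_uses_fake_redis():
    # One patch on the accessor covers every call site, because callers
    # fetch the client lazily via get_redis_connection(). A module-level
    # `redis_connection` object bound at import time would be missed by
    # code that had already imported it directly.
    with mock.patch("myapp.cache.get_redis_connection") as fake_conn:
        fake_conn.return_value = mock.Mock()
        ...  # code under test sees the mock client
```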

On Apr 29, 2020, at 11:15, Tom Forbes <tom@tomforb.es> wrote:
You’ve written an exact equivalent of the double-checked locking for singletons examples that broke Java 1.4 and C++03 and led to us having once functions in the first place. In both of those languages, and most others, there is no guarantee that the write to cache in thread 1 happens between the two reads from cache in thread 2. That gives you the fun kind of bug where every few thousand runs you have corrupted data an hour later, or it works fine on your computer but crashes for one of your users because they have two CPUs that don’t share L2 cache while you have all your cores on the same die, or it works fine until you change some completely unrelated part of the code, etc.

Java solved this by adding volatile variables in Java 5 (existing code was still broken, but just mark cache volatile and it’s fixed); C++11 added a compiler-assisted call_once function (and added a memory model that allows them to specify exactly what happens and when, so that the desired behavior was actually guaranteeable). Newer languages learned from their experience and got it right the first time, rather than repeating the same mistake.

Is there anything about Python’s memory model guarantee that means it can’t happen in Python? I don’t think there _is_ a memory model. In CPython, or any GIL-based implementation, I _think_ it’s safe (the other thread can’t be running at the same time on a different core, so there can’t be a cache coherency ordering issue between the cores, right?), but what about on Jython, or PyPy-STM, or a future GIL-less Python? And in both of those languages, double-checked locking is still nowhere near as efficient as using a local static.
Seems generally more correct, even in single threaded cases, to pay the overhead only in the first call if you want `call_once` semantics. Which is why you would be using `call_once` in the first place?
But you won’t be paying the overhead only on the first call, you’ll be paying it on all of the calls that arrive before the first one completes. That’s the whole point of the lock, after all: they have to wait until it’s ready, and they can’t possibly do that without the lock overhead. And for the next few afterward, because they’ll have gotten far enough to check even if they haven’t gotten far enough to get the lock, and there’s no way they can know they don’t need the lock. And for the next few after that, because unless the system only runs one thread at a time and synchronizes all of memory every time you switch threads, they may not see the write yet anyway.
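For concreteness, a sketch of the alternative being defended here: drop the unsynchronized fast-path read and always take the lock, so the check and the write are always ordered by the lock and nothing depends on the host implementation's memory model. This is an illustration, not code from the thread.

```
import functools
from threading import Lock

def once(func):
    sentinel = object()
    cache = sentinel
    lock = Lock()

    @functools.wraps(func)
    def _wrapper():
        nonlocal cache
        # Every call pays for the lock, but there is no racy read of
        # cache: all reads and the single write are serialized.
        with lock:
            if cache is sentinel:
                cache = func()
            return cache

    return _wrapper
```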

You’ve written an exact equivalent of the double-checked locking for singletons examples that broke Java 1.4 and C++03 and led to us having once functions in the first place. … but what about on Jython, or PyPy-STM, or a future GIL-less Python?
While I truly do appreciate your feedback on this idea, I’m really not clear on your line of reasoning here. What specifically do you propose would be the issue with the *Python* implementation? Are you proposing that under some Python implementations `cache = func()` could be… the result of half a function call? I could buy an issue with some implementations meaning that `cache` still appears as `sentinel` in specific situations, but I feel that would constitute a pretty obvious bug in the implementation that would impact a _lot_ of other multithreaded code, rather than a glaring issue with this snippet. Both the issues you’ve referenced are valid, but they are also rather specific to the languages they affect. I don’t believe they apply to Python.
But you won’t be paying the overhead only on the first call, you’ll be paying it on all of the calls that arrive before the first one completes.
Sure, that was a typo. It should have read:
Seems generally more correct, even in single threaded cases, to pay the overhead only in the first call (or first few calls if there is contention) if you want `call_once` semantics
I still think the point stands. With your two-separate-decorators approach you’re paying it on every call. As a general-purpose `call_once()` implementation I think the snippet works well; obviously, if you have some very specific use case where it’s not appropriate, then you are probably able to write a very specific and suitable decorator.

On May 1, 2020, at 09:51, Tom Forbes <tom@tomforb.es> wrote:
But the issues really aren’t specific to C++ and Java. The only reason C#, Swift, Go, etc. don’t have the same problem is that their memory models were designed from the start to provide a way to do this correctly. Python was not. There was an attempt to define a memory model in the ’00s (PEP 583), but it was withdrawn. According to the discussion around that PEP about when you can see uninitialized variables (not exactly the same issue, but closely related), Jython is safe when they’re globals or instance attributes and you haven’t replaced the module or object dict, but otherwise probably not; IronPython is probably safe in the same cases and more, but nobody’s actually sure. Does that sound good enough to dismiss the problem?
I still think the point stands. With your two-separate-decorators approach you’re paying it on every call. As a general-purpose `call_once()` implementation I think the snippet works well; obviously, if you have some very specific use case where it’s not appropriate, then you are probably able to write a very specific and suitable decorator.
Being willing to trade safety or portability for speed is sometimes a good tradeoff, but that’s the special use case, not the other way around. People who don’t know exactly what they need should get something safe and portable. Plus, there’s still the huge issue with single-threaded programs. It’s not as if multi-threaded programs are ubiquitous in Python while, e.g., asyncio is some rare niche thing that the stdlib doesn’t have to worry about. A bunch of coroutines using a once function needs either nothing, or a coro lock; if you build a threading lock into the function, they waste time and maybe deadlock every 10000 startups for no benefit whatsoever. Why is that acceptable for a general-purpose function?
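A minimal sketch of the coroutine flavour, using `asyncio.Lock` so first callers that have to wait yield to the event loop instead of blocking it. The name `async_once` is illustrative; nothing like this was proposed verbatim in the thread.

```
import asyncio
import functools

def async_once(coro_func):
    sentinel = object()
    cache = sentinel
    # Note: before Python 3.10, asyncio.Lock() binds to the event loop
    # that is current when it is created, so constructing one at import
    # time is its own trap on older versions.
    lock = asyncio.Lock()

    @functools.wraps(coro_func)
    async def _wrapper():
        nonlocal cache
        if cache is sentinel:
            async with lock:
                if cache is sentinel:
                    cache = await coro_func()
        return cache

    return _wrapper
```

Here the double check is safe because everything runs on one thread; the lock only guards against another coroutine being scheduled between the check and the `await`.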

Does that sound good enough to dismiss the problem?
Sure, tangentially related issues with other implementations from a discussion in the 00s don’t seem like a fantastic argument, especially given that we use this exact pattern already in the stdlib without issue: https://github.com/python/cpython/blob/master/Lib/functools.py#L1200
they waste time and maybe deadlock every 10000 startups for no benefit whatsoever. Why is that acceptable for a general purpose function?
Acquiring a lock once (or even a couple of times when contested) in order to make a general-purpose function work in single- and multi-threaded contexts does seem acceptable to me. There would be no deadlocks (assuming an RLock is used instead), nor any real perceivable performance issue. Basically, in the grand scheme of things, some single-threaded code acquiring a lock once is not really important, especially when it’s more performant in the general case. And clearly the current use of this pattern in the stdlib means others have come to the same conclusion.

participants (4): Andrew Barnert, Antoine Pitrou, Raymond Hettinger, Tom Forbes