12.01.21 18:57, Steven D'Aprano wrote:
Can you explain further why the cached function needs additional synchronisation overhead?
The cache uses a doubly linked list to track which item is the oldest, and it changes that list every time you call the cached function (the found or new item is moved to the beginning of the list). The code that changes the links of the list is critical: if it is interrupted, we can get a crash or an infinite loop in C. It is guarded by the GIL, and no Python code is called while the changes are made.

Now, if we iterate the list and save the items, we can call Python code while saving them (especially if we save them into a dict), and that code can change the list. In the best case this leads to skipped or duplicated items, and to random runtime errors when the cached function is used in another thread while the cache is being inspected. That is not much worse than iterating a dict while it is being modified. In the worst case we can get crashes, perhaps in different places in the code.

This code has to be written very carefully. The C implementation of lru_cache went through three iterations by three authors, and some bugs were found several months after that. One way to do it safely is to add explicit locks in addition to the GIL. It is not the easiest, nor the safest, and certainly not the most efficient way, but it is the most obvious one.
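To illustrate why the relinking is critical, here is a rough pure-Python sketch of a circular doubly linked list with a move-to-front operation. It is not the CPython code; the names Node, insert_front, unlink, move_to_front and keys are made up for this example. The point is that a single move touches several links, and the list is inconsistent between those assignments.

```python
class Node:
    """One entry of a circular doubly linked list (hypothetical helper)."""
    __slots__ = ("key", "prev", "next")

    def __init__(self, key=None):
        self.key = key
        # A lone node points to itself, so the root doubles as a sentinel.
        self.prev = self
        self.next = self


def insert_front(root, node):
    """Link node right after the root sentinel: four link updates."""
    first = root.next
    node.prev = root
    node.next = first
    root.next = node
    first.prev = node


def unlink(node):
    """Remove node from the list: two link updates."""
    node.prev.next = node.next
    node.next.prev = node.prev


def move_to_front(root, node):
    # Between unlink() and the end of insert_front() the node is not
    # reachable from the root. Anything walking the list at that moment
    # would miss it or follow stale links.
    unlink(node)
    insert_front(root, node)


def keys(root):
    """Walk the list from the front and collect the keys."""
    result = []
    node = root.next
    while node is not root:
        result.append(node.key)
        node = node.next
    return result


root = Node()
nodes = {k: Node(k) for k in "abc"}
for k in "abc":
    insert_front(root, nodes[k])

order_before = keys(root)      # ['c', 'b', 'a']
move_to_front(root, nodes["a"])
order_after = keys(root)       # ['a', 'c', 'b']
```

In C the same steps are done on raw pointers, so a walk over a half-updated list does not just give a wrong answer, it can crash or never terminate.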
- If you export the cache from one thread while another thread is reading the cache, I expect that would be safe.
Reading the cache always modifies it.
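A small example that makes this visible through the public API only. The function square and the variable names are made up for the illustration: with maxsize=2, a cache hit on an old key changes which key is evicted next.

```python
from functools import lru_cache


@lru_cache(maxsize=2)
def square(x):
    return x * x


square(1)   # miss
square(2)   # miss
square(1)   # hit: this "read" makes 1 the most recently used key
square(3)   # miss: the cache is full, so 2 is evicted, not 1

info_start = square.cache_info()

square(1)   # still cached, counted as a hit
info_after_1 = square.cache_info()

square(2)   # was evicted, counted as a miss
info_after_2 = square.cache_info()
```

Without the hit on 1 in the third call, the key 1 would have been the oldest one and would have been evicted instead of 2.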
I was having trouble with the function, and couldn't tell if the right arguments were going into the cache. What I wanted to do was peek at the cache and see which keys were ending up in the cache and compare that to what I expected.
In your case it would perhaps be easier to write your own implementation of the cache, or to disable the C implementation and use the Python implementation of lru_cache(), which allows some introspection. In any case, it was a one-time problem, and it is already solved. If it occurred more than once, it would make sense to think about including such a feature in the stdlib. But the cost of it may be larger than you expect.
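For debugging purposes, an own implementation could look roughly like the sketch below. It is only an assumption of what would be enough for peeking at the keys: the names inspectable_lru_cache and cache_snapshot are invented here, it supports only positional hashable arguments, and it uses an explicit lock, so it is slower than functools.lru_cache.

```python
import threading
from collections import OrderedDict
from functools import wraps


def inspectable_lru_cache(maxsize=128):
    """LRU cache decorator whose contents can be copied out (sketch)."""
    def decorator(func):
        cache = OrderedDict()       # oldest entry first, newest last
        lock = threading.RLock()    # explicit lock in addition to the GIL

        @wraps(func)
        def wrapper(*args):
            with lock:
                if args in cache:
                    cache.move_to_end(args)   # a hit reorders the cache
                    return cache[args]
            # Call the user function outside the lock, so a slow call
            # does not block other threads.
            result = func(*args)
            with lock:
                cache[args] = result
                cache.move_to_end(args)
                while len(cache) > maxsize:
                    cache.popitem(last=False)  # drop the oldest entry
            return result

        def cache_snapshot():
            """Return a copy of the (args, result) pairs, oldest first."""
            with lock:
                return list(cache.items())

        wrapper.cache_snapshot = cache_snapshot
        return wrapper
    return decorator


@inspectable_lru_cache(maxsize=2)
def add(a, b):
    return a + b


add(1, 2)
add(3, 4)
add(1, 2)   # hit, (1, 2) becomes the newest entry
add(5, 6)   # evicts (3, 4)

snapshot = add.cache_snapshot()   # [((1, 2), 3), ((5, 6), 11)]
```

The snapshot is a plain list copied under the lock, so it can be inspected or compared with the expected keys without touching the live cache.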