Is the implementation of lru_cache too opaque to poke into without an existing method? Or to write a quick monkey-patch for?

Sorry for not checking myself, but the ability to do that kind of thing is one of the great things about a dynamic open source language.
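
For example, a quick shadow-cache wrapper might look something like
this (just a sketch; the cache_export attribute here is something we
bolt on ourselves, not anything lru_cache actually provides):

    import functools

    def caching_with_export(maxsize=128):
        # Hypothetical: keep our own {args: result} record alongside
        # lru_cache, since the C implementation doesn't expose its dict.
        # Positional arguments only, to keep the sketch short.
        def decorator(func):
            cached = functools.lru_cache(maxsize=maxsize)(func)
            seen = {}

            @functools.wraps(func)
            def wrapper(*args):
                result = cached(*args)
                seen[args] = result  # shadow record; note: never evicts
                return result

            wrapper.cache_export = lambda: dict(seen)  # shallow copy
            wrapper.cache_info = cached.cache_info
            return wrapper
        return decorator

The shadow dict records every argument ever seen rather than the
cache's current contents, but for debugging that may be close enough.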

-CHB

On Tue, Jan 12, 2021 at 9:04 AM Steven D'Aprano <steve@pearwood.info> wrote:
On Tue, Jan 12, 2021 at 04:32:14PM +0200, Serhiy Storchaka wrote:
> 12.01.21 12:02, Steven D'Aprano wrote:

> > I propose a method:
> >
> >     @functools.lru_cache()
> >     def function(arg):
> >         ...
> >
> >     function.cache_export()
> >
> > that returns a dictionary {arg: value} representing the cache. It
> > wouldn't be the cache itself, just a shallow copy of the cache data.
>
> What if the function supports multiple arguments (including arguments
> passed by keyword)? Note that the internal representation of the key is
> an implementation detail, so you would need to invent and specify some
> new representation. For example, return a list of tuples
> (args, kwargs, result).
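>
> For instance (shape only; the values here are illustrative):
>
>     [((5,), {}, 120), ((6,), {'verbose': True}, 720)]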

Sure. That's a detail that can be worked out once we agree that this is
a useful feature.


> Depending on the implementation, getting the list of all arguments can
> have larger than linear complexity.

I don't mind. Efficiency is not a priority for this. This is an
introspection feature for development and debugging, not a feature for
production. I don't expect it to be called in tight loops. I expect to
use it from the REPL while I am debugging my code.

I might have to rethink if it was exponentially slow, but O(n log n)
like sorting would be fine; I'd even consider O(n**2) acceptable, with a
documentation note that exporting large caches may be slow.


> Other cache implementations can contain additional information: the
> number of hits for every value, access times. Are you interested in
> getting that information too, or ignoring it?

No.


> Currently the cache is thread-safe in CPython, but getting all arguments
> and values may not be (or we would need to add synchronization overhead
> to every call of the cached function).

Can you explain further why the cached function needs additional
synchronisation overhead?

I am quite happy for exporting to be thread-unsafe, so long as it
doesn't crash. Don't export the cache if it is being updated from
another thread, or you might get inaccurate results.

To be clear:

- If you export the cache from one thread while another thread is
  reading the cache, I expect that would be safe.

- If you export the cache from one thread while another thread is
  *modifying* the cache, I expect that the only promise we make is
  that there shouldn't be a segfault.
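
(For what it's worth, if a consistent snapshot were wanted, the
pure-Python fallback implementation in functools already guards its
cache dict with a lock, so an export could be sketched as:

    def cache_export():
        # Sketch only: assumes closure variables `cache` (a dict) and
        # `lock` (an RLock), as in the pure-Python lru_cache fallback;
        # a shallow copy taken while holding the lock is a consistent
        # snapshot.
        with lock:
            return dict(cache)

but I'm not asking for that guarantee.)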



> And finally, what is your use case? Is it important enough to justify
> the cost?

I think so or I wouldn't have asked :-)

There shouldn't be much (or any?) runtime cost on the cache except for
the presence of an additional method. The exported data is just a
snapshot, it doesn't have to be a view of the cache. Changing the
exported snapshot will not change the cache.
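
To illustrate, under the proposed API:

    snap = function.cache_export()
    snap.clear()            # empties only the copy
    function.cache_info()   # the live cache is unchanged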

My use case is debugging functions that are using an LRU cache,
specifically complex recursive functions. I have some functions where:

    f(N)

ends up calling itself many times, but not in any obvious pattern:

    f(N-1), f(N-2), f(N-5), f(N-7), f(N-12), f(N-15), f(N-22), ...

for example. So each call to f() could make dozens of recursive calls,
if N is big enough, with irregular gaps between the arguments.

I was having trouble with the function, and couldn't tell if the right
arguments were going into the cache. What I wanted to do was peek at
the cache, see which keys were ending up in it, and compare that to
what I expected.
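
Something like this, if the method existed (using a toy stand-in for
the real recursion):

    import functools

    @functools.lru_cache(maxsize=None)
    def f(n):
        if n < 2:
            return n
        return f(n - 1) + f(n - 2)

    f(30)
    print(sorted(f.cache_export()))  # which args actually got cached?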

I did end up getting the function working, but I think it would have
been much easier if I could have seen what was inside the cache and how
it was changing from one call to the next.

So this is why I don't care about performance (within reason). My use
case is interactive debugging.


--
Steve
--
Christopher Barker, PhD (Chris)

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython