
Hi,

reading the description of the new LRU cache in the "What's new in 3.2" document, I got the impression that the hits/misses attributes and the .clear() method aren't really well namespaced. When I read get_phone_number.clear(), it's not at all obvious what happens unless I already know that there actually *is* a cache involved, which simply has the same name as the function. So this will likely encourage users to add a half-redundant comment like "clear the cache" to their code.

What about adding an intermediate namespace called "cache", so that the new operations are available like this:

    print get_phone_number.cache.hits
    get_phone_number.cache.clear()

It's just a little more overhead, but I think it reads quite a bit better.

Stefan
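The namespacing Stefan proposes could be sketched roughly as follows. This is a minimal illustration only: the `lru_cache` decorator, the `_Cache` object, and all attribute names here are hypothetical and do not reflect the actual Python 3.2 implementation.

```python
import functools
from collections import OrderedDict

class _Cache:
    """Hypothetical namespace object exposed as func.cache."""
    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.hits = 0
        self.misses = 0
        self.store = OrderedDict()  # insertion order tracks recency

    def clear(self):
        self.store.clear()
        self.hits = self.misses = 0

def lru_cache(maxsize=100):
    def decorator(func):
        cache = _Cache(maxsize)
        @functools.wraps(func)
        def wrapper(*args):
            if args in cache.store:
                cache.hits += 1
                cache.store.move_to_end(args)  # mark as most recently used
                return cache.store[args]
            cache.misses += 1
            result = func(*args)
            cache.store[args] = result
            if len(cache.store) > cache.maxsize:
                cache.store.popitem(last=False)  # evict least recently used
            return result
        wrapper.cache = cache  # the intermediate namespace Stefan suggests
        return wrapper
    return decorator

@lru_cache(maxsize=2)
def get_phone_number(name):
    return "555-" + name

get_phone_number("alice")
get_phone_number("alice")
print(get_phone_number.cache.hits)   # 1
get_phone_number.cache.clear()
print(get_phone_number.cache.hits)   # 0
```

With this layout, `get_phone_number.cache.clear()` reads unambiguously as a cache operation even to someone who doesn't know the function is memoized.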

On 04.09.2010 12:06, Antoine Pitrou wrote:
+1.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.

On Sat, Sep 4, 2010 at 3:28 AM, Stefan Behnel <stefan_ml@behnel.de> wrote:
What about adding an intermediate namespace called "cache", so that the new operations are available like this:
I had been thinking that the lru_cache should be a class (with a dict-like interface), so it can be used explicitly and not just as a decorator. It could provide a wrap() method to be used as a decorator (or implement __call__ to keep the current semantics, but explicit is better than implicit):

    widget_cache = lru_cache()
    widget_cache[name] = widget

    @lru_cache().wrap
    def get_thingy(name):
        return something(name)

    # get_thingy.cache is an lru_cache instance
    print(get_thingy.cache.hits)

I have been using a similar LRU cache class to store items retrieved from a database. In my case, a decorator paradigm wouldn't work well because I only want to cache a few of the columns from a much larger query, plus there are multiple functions that want to talk to the cache.

-- 
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC <http://stutzbachenterprises.com>
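A class along the lines Daniel describes might look like this. This is a rough sketch under stated assumptions: the `LRUCache` name, its `wrap()` method, and the `hits`/`misses` attributes are illustrative, not any actual stdlib API.

```python
import functools
from collections import OrderedDict

class LRUCache:
    """Sketch of a dict-like LRU cache usable explicitly or as a decorator factory."""
    def __init__(self, maxsize=100):
        self.maxsize = maxsize
        self.hits = 0
        self.misses = 0
        self._data = OrderedDict()

    def __getitem__(self, key):
        value = self._data.pop(key)   # KeyError propagates on a miss
        self._data[key] = value       # re-insert to mark as most recently used
        return value

    def __setitem__(self, key, value):
        if key in self._data:
            self._data.pop(key)
        elif len(self._data) >= self.maxsize:
            self._data.popitem(last=False)  # evict least recently used
        self._data[key] = value

    def __contains__(self, key):
        return key in self._data

    def wrap(self, func):
        """Decorator form; attaches this cache to the wrapper as .cache."""
        @functools.wraps(func)
        def wrapper(*args):
            if args in self:
                self.hits += 1
                return self[args]
            self.misses += 1
            result = func(*args)
            self[args] = result
            return result
        wrapper.cache = self
        return wrapper
```

Used explicitly, `widget_cache = LRUCache(); widget_cache[name] = widget` works like a bounded dict; used as `@LRUCache().wrap`, several functions can even share one cache instance by reusing the same object.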

On Sep 4, 2010, at 11:20 AM, Antoine Pitrou wrote:
Well, perhaps lru_cache() would have deserved a review before committing?
Not everything needs to be designed by committee. This API is based on one that was published as a recipe several years ago and has been used in a number of companies. Its design reflects feedback from a variety of advanced Python users (the keyword argument support from Miki Tebeka, the concurrency support from Jim Baker, the clearing option and introspectability from Nick Coghlan, etc.). Aside from the compact API, the actual implementation is so dirt simple that it will be trivial for folks to roll their own variants if they have more exotic needs.

After I'm done with other work for the alpha, I'll take a further look at the suggestions here.

Raymond

I see now that my previous reply went only to Stefan, so I'm re-submitting, this time to the list.
I agree. While the function-based implementation is highly efficient, the pure use of functions has the counter-Pythonic effect of obfuscating the internal state (much the way the 'private' keyword does in Java). A class-based implementation would be capable of having its state introspected and could easily be extended. While the functional implementation is a powerful construct, it fails to generalize well. IMHO, a stdlib implementation should err on the side of transparency and extensibility over performance.

That said, I've adapted Hettinger's Python 2.5 implementation to a class-based implementation. I've tried to keep the performance optimizations in place, but instead of instrumenting the wrapped method with lots of cache_* functions, I simply attach the cache object itself, which then provides the interface suggested by Stefan. This technique allows access to the cache object and all of its internal state, so it's also possible to do things like:

    get_phone_number.cache.maxsize += 100

or:

    if get_phone_number.cache.store:
        do_something_interesting()

These techniques are nearly impossible in the functional implementation, as the state is buried in the locals() of the nested functions.

I'm most grateful to Raymond for contributing this to Python; on many occasions I've used the ActiveState recipes for simple caches, but in almost every case I've had to adapt the implementation to provide more transparency. I'd prefer not to have to do the same with the stdlib.

Regards,
Jason R. Coombs
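The kind of introspection Jason describes only requires that the cache state live on an attached object rather than in closure variables. A toy version, assuming hypothetical `store` and `maxsize` attributes (this is not Jason's actual adaptation, and it omits LRU eviction for brevity):

```python
import functools

class Cache:
    """Minimal introspectable cache object (hypothetical; no eviction shown)."""
    def __init__(self, maxsize=100):
        self.maxsize = maxsize
        self.store = {}   # exposed so callers can inspect cached entries

def cached(func):
    cache = Cache()
    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache.store:
            cache.store[args] = func(*args)
        return cache.store[args]
    wrapper.cache = cache   # state lives on an object, not in closure locals
    return wrapper

@cached
def get_phone_number(name):
    return "555-" + name

get_phone_number("alice")
get_phone_number.cache.maxsize += 100        # tweak policy at runtime
if get_phone_number.cache.store:             # inspect cached entries directly
    print(len(get_phone_number.cache.store)) # 1
```

By contrast, when the dictionary and counters are plain locals of nested functions, outside code cannot reach them at all without resorting to frame or cell-object hacks.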

participants (7)
- Antoine Pitrou
- Daniel Stutzbach
- Georg Brandl
- Jason R. Coombs
- Raymond Hettinger
- Stefan Behnel
- Éric Araujo