Fwd: Using functools.lru_cache only on some arguments of a function
Sorry for the late reply everyone. I think relying on closures, while a solution, is messy. I'd still much prefer a way to tell lru_cache to merely ignore certain arguments. I'll use some variant of the storing-the-partial-progress-as-an-attribute-on-the-cached-recursive-function method (either with the cached-recursive hidden from the top level via the try/except stuff, or with a more simple wrapper function). I've further considered my original proposal, and rather than naming it "arg_filter", I realized that builtins like sorted(), min(), max(), etc all already have the exact same thing -- a "key" argument which transforms the elements to the user's purpose. (In the sorted/min/max case, it's called on the elements of the argument rather than the argument itself, but it's still the same concept.) So basically, my original proposal with renaming from arg_filter to key, is tantamount to extending the same functionality from sorted/min/max to lru_cache as well. As has been pointed out, my own use case is almost certainly *not* the only use case. The implementation and interface are both simple, and simpler than the alternatives which I'll rely on for now (wrappers and closures, or worse, global singletons etc). I would still like to see it in the stdlib in the future. I've appended a largely similar patch with the proposed additions (there's some internal variable renaming to avoid confusion, resulting in a longer diff). Thanks again for all the input. -Bill ----------------------------------------------------------------------------------------------------------------- https://hg.python.org/cpython/file/3.5/Lib/functools.py diff functools.py.orig functools.py 363c363 < def _make_key(args, kwds, typed, ---
def _make_key(args, kwds, typed, key, 377c377,379 < key = args
if key is not None: args, kwds = key(args, kwds) cache_key = args
380c382 < key += kwd_mark ---
cache_key += kwd_mark
382c384 < key += item ---
cache_key += item
384c386 < key += tuple(type(v) for v in args) ---
cache_key += tuple(type(v) for v in args)
386,389c388,391 < key += tuple(type(v) for k, v in sorted_items) < elif len(key) == 1 and type(key[0]) in fasttypes: < return key[0] < return _HashedSeq(key) ---
cache_key += tuple(type(v) for k, v in sorted_items) elif len(cache_key) == 1 and type(cache_key[0]) in fasttypes: return cache_key[0] return _HashedSeq(cache_key)
391c393 < def lru_cache(maxsize=128, typed=False): ---
def lru_cache(maxsize=128, typed=False, key=None): 400a403,407 If *key* is not None, it must be a callable which acts on the arguments passed to the function. Its return value is used in place of the actual arguments. It works analogously to the *key* argument to the builtins sorted, max, and min.
421a429,431
if key is not None and not callable(key): raise TypeErrpr('Expected key to be a callable')
423c433,434 < wrapper = _lru_cache_wrapper(user_function, maxsize, typed, _CacheInfo) ---
wrapper = _lru_cache_wrapper(user_function, maxsize, typed, key, _CacheInfo)
428c439 < def _lru_cache_wrapper(user_function, maxsize, typed, _CacheInfo): ---
def _lru_cache_wrapper(user_function, maxsize, typed, key, _CacheInfo): 456,457c467,468 < key = make_key(args, kwds, typed) < result = cache_get(key, sentinel)
cache_key = make_key(args, kwds, typed, key) result = cache_get(cache_key, sentinel)
462c473 < cache[key] = result ---
cache[cache_key] = result
471c482 < key = make_key(args, kwds, typed) ---
cache_key = make_key(args, kwds, typed, key)
473c484 < link = cache_get(key) ---
link = cache_get(cache_key)
487c498 < if key in cache: ---
if cache_key in cache:
496c507 < oldroot[KEY] = key ---
oldroot[KEY] = cache_key
513c524 < cache[key] = oldroot ---
cache[cache_key] = oldroot
517,518c528,529 < link = [last, root, key, result] < last[NEXT] = root[PREV] = cache[key] = link ---
link = [last, root, cache_key, result] last[NEXT] = root[PREV] = cache[cache_key] = link
On Wed, Dec 30, 2015 at 11:10 PM, Michael Selik
On Tue, Dec 29, 2015 at 2:14 AM Franklin? Lee < leewangzhong+python@gmail.com> wrote:
On Sat, Dec 12, 2015 at 1:34 PM, Michael Selik
wrote: On Fri, Dec 11, 2015, 8:20 PM Franklin? Lee < leewangzhong+python@gmail.com> wrote:
rather than using a closure, depending on how intertwined the code
This whole thing is probably best implemented as two separate functions paths are
for the shortcut/non-shortcut versions.
I like the closure because it has semantic ownership: the inner function is a worker for the outer function.
True, a closure has better encapsulation, making it less likely someone will misuse the helper function. On the other hand, that means there's less modularity and it would be difficult for someone to use the inner function. It's hard to know the right choice without seeing the exact problem the original author was working on.
On Fri, Dec 11, 2015 at 8:01 PM, Franklin? Lee
wrote: 1. Rewrite your recursive function so that the partial state is a nonlocal variable (in the closure), and memoize the recursive part.
I'd flip the rare-case to the except block and put the normal-case in the try block. I believe this will be more compute-efficient and more readable.
The rare case is in the except block, though.
You're correct. Sorry, I somehow misinterpreted the comment, "# To trigger the exception the first time" as indicating that code path would run only once.
Shouldn't the key function be called with ``key(*args, **kwargs)``?
It'd be helpful to see the entire revision, rather than just the diff. It's
easier for me to read at least.
On Sun, Jan 10, 2016, 3:27 AM Bill Winslow
Sorry for the late reply everyone.
I think relying on closures, while a solution, is messy. I'd still much prefer a way to tell lru_cache to merely ignore certain arguments. I'll use some variant of the storing-the-partial-progress-as-an-attribute-on-the-cached-recursive-function method (either with the cached-recursive hidden from the top level via the try/except stuff, or with a more simple wrapper function).
I've further considered my original proposal, and rather than naming it "arg_filter", I realized that builtins like sorted(), min(), max(), etc all already have the exact same thing -- a "key" argument which transforms the elements to the user's purpose. (In the sorted/min/max case, it's called on the elements of the argument rather than the argument itself, but it's still the same concept.) So basically, my original proposal with renaming from arg_filter to key, is tantamount to extending the same functionality from sorted/min/max to lru_cache as well. As has been pointed out, my own use case is almost certainly *not* the only use case. The implementation and interface are both simple, and simpler than the alternatives which I'll rely on for now (wrappers and closures, or worse, global singletons etc). I would still like to see it in the stdlib in the future. I've appended a largely similar patch with the proposed additions (there's some internal variable renaming to avoid confusion, resulting in a longer diff).
Thanks again for all the input.
-Bill
----------------------------------------------------------------------------------------------------------------- https://hg.python.org/cpython/file/3.5/Lib/functools.py
diff functools.py.orig functools.py 363c363 < def _make_key(args, kwds, typed, ---
def _make_key(args, kwds, typed, key, 377c377,379 < key = args
if key is not None: args, kwds = key(args, kwds) cache_key = args
380c382 < key += kwd_mark ---
cache_key += kwd_mark
382c384 < key += item ---
cache_key += item
384c386 < key += tuple(type(v) for v in args) ---
cache_key += tuple(type(v) for v in args)
386,389c388,391 < key += tuple(type(v) for k, v in sorted_items) < elif len(key) == 1 and type(key[0]) in fasttypes: < return key[0] < return _HashedSeq(key) ---
cache_key += tuple(type(v) for k, v in sorted_items) elif len(cache_key) == 1 and type(cache_key[0]) in fasttypes: return cache_key[0] return _HashedSeq(cache_key)
391c393 < def lru_cache(maxsize=128, typed=False): ---
def lru_cache(maxsize=128, typed=False, key=None): 400a403,407 If *key* is not None, it must be a callable which acts on the arguments passed to the function. Its return value is used in place of the actual arguments. It works analogously to the *key* argument to the builtins sorted, max, and min.
421a429,431
if key is not None and not callable(key): raise TypeErrpr('Expected key to be a callable')
423c433,434 < wrapper = _lru_cache_wrapper(user_function, maxsize, typed, _CacheInfo) ---
wrapper = _lru_cache_wrapper(user_function, maxsize, typed, key, _CacheInfo)
428c439 < def _lru_cache_wrapper(user_function, maxsize, typed, _CacheInfo): ---
def _lru_cache_wrapper(user_function, maxsize, typed, key, _CacheInfo): 456,457c467,468 < key = make_key(args, kwds, typed) < result = cache_get(key, sentinel)
cache_key = make_key(args, kwds, typed, key) result = cache_get(cache_key, sentinel)
462c473 < cache[key] = result ---
cache[cache_key] = result
471c482 < key = make_key(args, kwds, typed) ---
cache_key = make_key(args, kwds, typed, key)
473c484 < link = cache_get(key) ---
link = cache_get(cache_key)
487c498 < if key in cache: ---
if cache_key in cache:
496c507 < oldroot[KEY] = key ---
oldroot[KEY] = cache_key
513c524 < cache[key] = oldroot ---
cache[cache_key] = oldroot
517,518c528,529 < link = [last, root, key, result] < last[NEXT] = root[PREV] = cache[key] = link ---
link = [last, root, cache_key, result] last[NEXT] = root[PREV] = cache[cache_key] = link
On Wed, Dec 30, 2015 at 11:10 PM, Michael Selik
wrote: On Tue, Dec 29, 2015 at 2:14 AM Franklin? Lee < leewangzhong+python@gmail.com> wrote:
On Sat, Dec 12, 2015 at 1:34 PM, Michael Selik
wrote: On Fri, Dec 11, 2015, 8:20 PM Franklin? Lee < leewangzhong+python@gmail.com> wrote:
rather than using a closure, depending on how intertwined the code
This whole thing is probably best implemented as two separate functions paths are
for the shortcut/non-shortcut versions.
I like the closure because it has semantic ownership: the inner function is a worker for the outer function.
True, a closure has better encapsulation, making it less likely someone will misuse the helper function. On the other hand, that means there's less modularity and it would be difficult for someone to use the inner function. It's hard to know the right choice without seeing the exact problem the original author was working on.
On Fri, Dec 11, 2015 at 8:01 PM, Franklin? Lee
wrote: 1. Rewrite your recursive function so that the partial state is a nonlocal variable (in the closure), and memoize the recursive part.
I'd flip the rare-case to the except block and put the normal-case in the try block. I believe this will be more compute-efficient and more readable.
The rare case is in the except block, though.
You're correct. Sorry, I somehow misinterpreted the comment, "# To trigger the exception the first time" as indicating that code path would run only once.
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
On Sun, Jan 10, 2016 at 3:27 AM, Bill Winslow
Sorry for the late reply everyone.
I think relying on closures, while a solution, is messy. I'd still much prefer a way to tell lru_cache to merely ignore certain arguments.
Wait, why is it messy? The function is created inside the outer function, and never gets released to the outside. I think it's cleaner, because it's encapsulating the recursive part, the memo, and the cached work. Besides, `lru_cache` is implemented using a closure, and your solution of passing a key function might be used with a closure on a nested function. If you're solving dynamic programming puzzles, an outer/inner pairing of non-recursive/recursive represents the fact that your memoization can't be reused for different instances of the same problem. (See, for example, Edit Distance (https://web.stanford.edu/class/cs124/lec/med.pdf), in which your recursive parameters are indices into your non-recursive parameters.)
I've further considered my original proposal, and rather than naming it "arg_filter", I realized that builtins like sorted(), min(), max(), etc all already have the exact same thing -- a "key" argument which transforms the elements to the user's purpose. (In the sorted/min/max case, it's called on the elements of the argument rather than the argument itself, but it's still the same concept.) So basically, my original proposal with renaming from arg_filter to key, is tantamount to extending the same functionality from sorted/min/max to lru_cache as well.
I like this conceptually, because the `key` parameter sort of lets you
customize the cache dict (or whatever). You can use, for example,
`str.lower` (though not directly).
Note that the key parameter in those other functions allows you to
call the key function only once per element, which is impossible for
this.
Should it be possible to specify a tuple for `key` to transform each
arg separately? In your case, you might pass in `(None, lambda x: 0)`
to specify that the first parameter shouldn't be transformed, and the
second parameter should be considered constant. But that's very
confusing: should `None` mean "ignore", or "don't transform" (like
`filter`)? Or we can use `False` for "ignore", perhaps.
On Sun, Jan 10, 2016 at 2:03 PM, Michael Selik
Shouldn't the key function be called with ``key(*args, **kwargs)``?
Does `lru_cache` know how to deal with passing regular args as kwargs? Does it
On Tue, Jan 12, 2016 at 6:26 PM Franklin? Lee
On Sun, Jan 10, 2016 at 3:27 AM, Bill Winslow
wrote: Sorry for the late reply everyone.
I think relying on closures, while a solution, is messy. I'd still much prefer a way to tell lru_cache to merely ignore certain arguments.
Wait, why is it messy? The function is created inside the outer function, and never gets released to the outside. I think it's cleaner, because it's encapsulating the recursive part, the memo, and the cached work. Besides, `lru_cache` is implemented using a closure, and your solution of passing a key function might be used with a closure on a nested function.
If you're solving dynamic programming puzzles, an outer/inner pairing of non-recursive/recursive represents the fact that your memoization can't be reused for different instances of the same problem. (See, for example, Edit Distance (https://web.stanford.edu/class/cs124/lec/med.pdf), in which your recursive parameters are indices into your non-recursive parameters.)
I've further considered my original proposal, and rather than naming it "arg_filter", I realized that builtins like sorted(), min(), max(), etc all already have the exact same thing -- a "key" argument which transforms the elements to the user's purpose. (In the sorted/min/max case, it's called on the elements of the argument rather than the argument itself, but it's still the same concept.) So basically, my original proposal with renaming from arg_filter to key, is tantamount to extending the same functionality from sorted/min/max to lru_cache as well.
I like this conceptually, because the `key` parameter sort of lets you customize the cache dict (or whatever). You can use, for example, `str.lower` (though not directly).
Note that the key parameter in those other functions allows you to call the key function only once per element, which is impossible for this.
Should it be possible to specify a tuple for `key` to transform each arg separately? In your case, you might pass in `(None, lambda x: 0)` to specify that the first parameter shouldn't be transformed, and the second parameter should be considered constant. But that's very confusing: should `None` mean "ignore", or "don't transform" (like `filter`)? Or we can use `False` for "ignore", perhaps.
I think his intention was to mimic the ``key`` argument of sorted, which
expects a function that takes 1 and only 1 positional argument.
Perhaps it's best to see the exact use case and a few other examples, to
get a better idea for the specifics, before implementing this feature?
On Sun, Jan 10, 2016 at 2:03 PM, Michael Selik
Shouldn't the key function be called with ``key(*args, **kwargs)``?
Does `lru_cache` know how to deal with passing regular args as kwargs?
Now that you mention it, I realized it treats the two differently. ``def foo(x): pass`` would store ``foo(42)`` and ``foo(x=42)`` as different entries in the cache.
On Tue, Jan 12, 2016 at 9:13 PM, Michael Selik
On Tue, Jan 12, 2016 at 6:26 PM Franklin? Lee
wrote: Should it be possible to specify a tuple for `key` to transform each arg separately? In your case, you might pass in `(None, lambda x: 0)` to specify that the first parameter shouldn't be transformed, and the second parameter should be considered constant. But that's very confusing: should `None` mean "ignore", or "don't transform" (like `filter`)? Or we can use `False` for "ignore", perhaps.
I think his intention was to mimic the ``key`` argument of sorted, which expects a function that takes 1 and only 1 positional argument.
I know, but those three functions expect a sequence of single things, while `lru_cache` expects several things. (Not exactly a tuple, because that would be a single thing, but rather a collection of things.)
Perhaps it's best to see the exact use case and a few other examples, to get a better idea for the specifics, before implementing this feature?
On Sun, Jan 10, 2016 at 2:03 PM, Michael Selik
wrote: Shouldn't the key function be called with ``key(*args, **kwargs)``?
Does `lru_cache` know how to deal with passing regular args as kwargs?
Now that you mention it, I realized it treats the two differently. ``def foo(x): pass`` would store ``foo(42)`` and ``foo(x=42)`` as different entries in the cache.
I feel like, at least ideally, there should be a way for `update_wrapper`/`wraps` to unpack named arguments, so that wrappers can truly reflect the params of the functions they wrap. (For example, when inspecting a function, and by decoding kw-or-positional-args to their place in `*args`.) It should also be possible to add or remove args, though I'm not sure how useful that will be. (Also ideally, a wrapper function would "pass up" its default args to the wrapper.)
participants (3)
-
Bill Winslow
-
Franklin? Lee
-
Michael Selik