[Twisted-Python] Waiting for a contended resource
Hi folks,

I thought I'd poll the list on the best way to approach a problem in Twisted.

The background is that we have a number of resources which can be requested by a REST client, and which are calculated on demand. The calculation is moderately expensive (can take multiple seconds), so the results of the calculation are cached so multiple lookups of the same resource are more efficient.

The problem comes in trying to handle multiple clients requesting the same resource at once. Obviously if 200 clients all request the same resource at the same time, we don't want to fire off 200 calculation requests.

The approach we adopted was, effectively, to maintain a lock for each resource:
    from twisted.internet import defer

    lock = defer.DeferredLock()
    cached_result = None

    @defer.inlineCallbacks
    def getResource():
        global cached_result  # needed for the assignment below
        yield lock.acquire()
        try:
            if cached_result is None:
                cached_result = yield do_expensive_calculation()
            defer.returnValue(cached_result)
        finally:
            lock.release()
(Of course one can optimise the above to avoid getting the lock if we already have the cached result - I've omitted that for simplicity.)

That's all very well, but it falls down when we get more than about 200 requests for the same resource: once the calculation completes, we can suddenly serve all the requests, and the Deferreds returned by DeferredLock end up chaining together in a way that overflows the stack.

I reported this as http://twistedmatrix.com/trac/ticket/9304 and, at the time, worked around it by adding a call to reactor.callLater(0) into our implementation. However, Jean-Paul's comments on that bug implied that we were approaching the problem in completely the wrong way, and instead we should be avoiding queuing up work like this in the first place.

It's worth reiterating that the requests arrive from REST clients which we have no direct control over. We *could* keep track of the number of waiting clients, and make the API respond with a 5xx error or similar if that number gets too high, with the expectation that the client retries - but one concern would be that the load from the additional HTTP traffic would outweigh any efficiency gained by not stacking up Deferreds.

So, I'd welcome any advice on better ways to approach the problem.

Richard
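For concreteness, here is a minimal sketch of what the callLater(0) workaround might look like, built on the lock-based snippet above. The thread does not show exactly where the call was inserted, so the placement here is an assumption: releasing the lock on a fresh reactor tick wakes each waiter in its own stack frame rather than recursively.

    from twisted.internet import defer, reactor

    lock = defer.DeferredLock()
    cached_result = None

    @defer.inlineCallbacks
    def getResource():
        global cached_result
        yield lock.acquire()
        try:
            if cached_result is None:
                cached_result = yield do_expensive_calculation()
            defer.returnValue(cached_result)
        finally:
            # Assumed placement: deferring the release to the next reactor
            # iteration breaks up the synchronous Deferred chain that
            # otherwise forms when hundreds of waiters are released at once.
            reactor.callLater(0, lock.release)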
Hi Richard,

On March 12, 2018 at 1:49:41 PM, Richard van der Hoff (richard@matrix.org) wrote:

> [...]
>
> I reported this as http://twistedmatrix.com/trac/ticket/9304 and, at the time, worked around it by adding a call to reactor.callLater(0) into our implementation. However, Jean-Paul's comments on that bug implied that we were approaching the problem in completely the wrong way, and instead we should be avoiding queuing up work like this in the first place.

You mention using callLater to solve this problem, so I’m guessing that instead of using a lock you are re-scheduling the call to getResource if there is no cached_result value. I’ve used this solution plenty of times across multiple projects, and have found it both simple and reliable. Is there some reason why this solution is not desirable in your case?

> It's worth reiterating that the requests arrive from REST clients which we have no direct control over. We *could* keep track of the number of waiting clients, and make the API respond with a 5xx error or similar if that number gets too high, with the expectation that the client retries - but one concern would be that the load from the additional HTTP traffic would outweigh any efficiency gained by not stacking up Deferreds.

Have you validated this concern through load-testing? You may find that there is no meaningful negative impact to this approach.

> So, I'd welcome any advice on better ways to approach the problem.

Hope this helps,

L. Daniel Burr
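A rough sketch of the re-scheduling approach Daniel describes, under the assumption that callers simply retry getResource until the cached value appears. The names (`calculation_started`, the 0.1-second retry interval) are illustrative, not from the thread, and error handling is omitted.

    from twisted.internet import defer, reactor, task

    cached_result = None
    calculation_started = False

    def getResource():
        global calculation_started
        if cached_result is not None:
            return defer.succeed(cached_result)
        if not calculation_started:
            calculation_started = True
            def store(result):
                global cached_result
                cached_result = result
            do_expensive_calculation().addCallback(store)
        # Retry on a later reactor tick; each waiter re-enters getResource
        # in a fresh stack frame, so no deep Deferred chain builds up.
        return task.deferLater(reactor, 0.1, getResource)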
Hi, Richard,

I've used a class like this to cache the result of the expensive calculation:

    class DeferredCache:
        pending = None
        result = None
        failure = None

        def __init__(self, expensive_func):
            self.expensive_func = expensive_func

        def __call__(self):
            # Start the calculation on the first call only.
            if self.pending is None:
                def on_ready(result):
                    self.result = result

                def on_fail(failure):
                    self.failure = failure

                self.pending = defer.maybeDeferred(
                    self.expensive_func).addCallbacks(on_ready, on_fail)

            return self.pending.addCallback(self._return_result)

        def _return_result(self, _):
            return self.failure or self.result

Using it you can get rid of DeferredLocks:

    deferred_cache = DeferredCache(do_expensive_calculation)

    def getResource():
        return deferred_cache()

It will start `expensive_func` on the first call. The second and subsequent calls will return Deferreds that resolve with the result when `expensive_func` is done. If you call it when the result is already available, it will return an already-fired Deferred.

Of course, it will require some more work if you need to pass arguments to `expensive_func` and memoize results per argument values.

-- ilya
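One possible shape for the per-argument extension Ilya mentions, reusing the DeferredCache class above. This is purely illustrative; `DeferredCacheMap` is a made-up name, and the idea is simply to keep one DeferredCache per distinct argument tuple.

    class DeferredCacheMap:
        """Hypothetical sketch: memoize expensive_func per argument tuple."""

        def __init__(self, expensive_func):
            self.expensive_func = expensive_func
            self.caches = {}  # args tuple -> DeferredCache

        def __call__(self, *args):
            if args not in self.caches:
                # Bind the arguments so each DeferredCache runs a
                # zero-argument callable, as the class above expects.
                self.caches[args] = DeferredCache(
                    lambda args=args: self.expensive_func(*args))
            return self.caches[args]()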
On Mon, Mar 12, 2018 at 3:52 PM, Ilya Skriblovsky <ilyaskriblovsky@gmail.com> wrote:

> [...]
>
>     def __call__(self):
>         if self.pending is None:
>             def on_ready(result):
>                 self.result = result
>
>             def on_fail(failure):
>                 self.failure = failure
>
>             self.pending = defer.maybeDeferred(
>                 self.expensive_func).addCallbacks(on_ready, on_fail)
>
>         return self.pending.addCallback(self._return_result)
This seems like basically a correct answer to me. However, I suggest one small change.

You probably want to create and return a new Deferred for each result. If you don't, then your internal `pending` Deferred is now reachable by application code. As written, an application might (very, very reasonably):

    d = getResource()
    d.addCallback(long_async_operation)

Now `pending` has `long_async_operation` as a callback on its chain. This will prevent anyone else from getting a result until `long_async_operation` is done.

You can fix this by:

    result = Deferred()
    self.pending.addCallback(self._return_result).chainDeferred(result)
    return result

Now the application can only reach `result`. Nothing they do to `result` will make much difference to `pending`, because `chainDeferred` only puts `callback` (and `errback`) onto `pending`'s callback chain. `callback` and `errback` don't wait on anything.

You have to be a little careful with `chainDeferred` because it doesn't have the recursion-avoidance logic that implicit chaining has. However, that doesn't matter in this particular case because the chain depth is fixed at two (`pending` and `result`). The problems only arise if you extend the chain out in this direction without bound.

Jean-Paul
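Folding Jean-Paul's suggestion back into Ilya's class, `__call__` might end up looking something like this (a sketch assembled from the two messages above, not code from the thread):

    def __call__(self):
        if self.pending is None:
            def on_ready(result):
                self.result = result

            def on_fail(failure):
                self.failure = failure

            self.pending = defer.maybeDeferred(
                self.expensive_func).addCallbacks(on_ready, on_fail)

        # Hand each caller a fresh Deferred; callbacks the application
        # adds to it cannot delay `pending` or other callers.
        result = defer.Deferred()
        self.pending.addCallback(self._return_result).chainDeferred(result)
        return result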
Thanks for the correction, Jean-Paul, you're absolutely right.

-- ilya
Thank you all for the answers so far, particularly to Ilya and Jean-Paul, who provided some very helpful code samples.

It's interesting to realise that, by avoiding locking, we can end up with a much more efficient implementation. I'll have to figure out how widely we can apply this technique - and how often it's going to be worth rewriting things to allow that.

Thanks for some useful pointers!

Richard
participants (4)

- Ilya Skriblovsky
- Jean-Paul Calderone
- L. Daniel Burr
- Richard van der Hoff