Re: [devpi-dev] improving concurrency, reliability of devpi-server
On Sat, Nov 07, 2015 at 13:43 +0100, Florian Schulze wrote:
I'd say before we add such high-level caches we should first try to improve the performance differently.
Identifying bottlenecks and improving overall performance also makes sense. But simple-page serving is special because it's the one thing pip needs and that devpi-server must handle itself. Other than serving simple requests for pip, I don't think performance is a problem with devpi-server.
As you noticed in the writeup, cache invalidation is hard and adds a lot of complexity. It's much easier to do the work that is repeated for each read at write time instead. That's also a kind of caching, but at a lower level where we know exactly what to invalidate.
Another idea: if we can quickly get the current serial of each index involved in rendering a simple page, then we can add an ETag header that a caching proxy like Varnish can use. We would only provide the cache key; storage and invalidation would be handled for us.
If we got all serials of all versiondata for each base index we'd have a key we could use, also as an ETag. You would still need to special-handle mirroring as mentioned below. FWIW the cache invalidation below is somewhat complex, but all of the parts are straightforward IMHO, and the root "caching" object can be tested fully independently. All items below (except for mirroring) would then just add a single line of code, if I see it correctly. It would also be easy to make caching optional. best, holger
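(To illustrate the serial-based ETag idea, a minimal sketch in pyramid terms; all_bases(), get_current_serial() and render_simple_page() are assumed helpers here, not devpi's actual API.)

    import hashlib
    from pyramid.httpexceptions import HTTPNotModified

    def simple_page_etag(stage):
        # combine the change serials of the index and all its bases;
        # any change bumps a serial and thereby changes the ETag
        parts = sorted("%s=%s" % (idx.name, idx.get_current_serial())
                       for idx in [stage] + list(stage.all_bases()))
        return hashlib.sha1(",".join(parts).encode("utf-8")).hexdigest()

    def serve_simple_page(request, stage, project):
        etag = simple_page_etag(stage)
        if etag in request.if_none_match:    # webob parses If-None-Match
            return HTTPNotModified()         # 304: proxy/pip reuses its copy
        response = render_simple_page(stage, project)
        response.etag = etag                 # webob adds the quoting
        return response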
Regards, Florian Schulze
On 7 Nov 2015, at 13:28, holger krekel wrote:
Hi Stephan,
so with the recent PRs we get rid of "copy_if_mutable", and thanks mostly to your PRs simple-page serving is twice as fast as before; on my machine it's 170 requests per second.
I just hacked a simple-project serving cache (without any invalidation, which is the hard part) which gets us to ~550 requests per second. I think we could get even faster if we bypassed all the transaction machinery which is implicitly used for each request. But first things first.
Regarding cache invalidation, I think this is the simplest approach:
- maintain a per-index LRUcache (we already have a utility class for that in keyfs) which maps projectnames to simple pages, and use/fill it from the simple-page serving code as it works now (a sketch follows after this list).
- if an index config changes (a rare event), kill the caches of that index and of all inheriting indexes.
- if a projectname changes, kill this projectname's entries in the cache of that index and of all inheriting indexes.
- at startup, build a RAM data structure which tells us, for each index, all dependent indexes (currently we only track the bases of an index, not its dependents). This data structure needs to be updated when the "bases" property of an index changes. It also tells us whether an index ultimately uses a mirroring index, currently only root/pypi. This data structure can and should be fully unit-tested without invoking any devpi machinery. It also needs to be thread-safe.
- part of the "do we have a cache hit" check is to see whether we depend on a mirroring index and whether its timeout has been reached. If so, we kill the cache and let the normal current logic run.
- the data structure which maps index names to per-index LRUcache instances can live on the XOM object.
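To make the invalidation rules above concrete, a minimal sketch of such a root caching object, testable without any devpi machinery (all names hypothetical; a plain dict stands in for the keyfs LRUcache, and the "bases" input would come from the index configs):

    import threading
    import time

    class SimplePageCacher:
        """Sketch: bases maps index names to their direct bases, e.g.
        {"user/dev": ["user/prod"], "user/prod": ["root/pypi"],
         "root/pypi": []}; mirrors is the set of mirroring indexes."""

        def __init__(self, bases, mirrors, mirror_timeout=1800):
            self._lock = threading.RLock()
            self._pages = {}                  # (index, project) -> (page, ts)
            self._timeout = mirror_timeout
            self._dependents = {ix: set() for ix in bases}
            self._uses_mirror = {}
            for ix in bases:
                transitive = self._transitive_bases(ix, bases)
                self._uses_mirror[ix] = bool(transitive & set(mirrors))
                for base in transitive:
                    self._dependents.setdefault(base, set()).add(ix)

        def _transitive_bases(self, ix, bases, seen=None):
            seen = set() if seen is None else seen
            for base in bases.get(ix, ()):
                if base not in seen:
                    seen.add(base)
                    self._transitive_bases(base, bases, seen)
            return seen

        def get(self, index, project):
            with self._lock:
                entry = self._pages.get((index, project))
                if entry is None:
                    return None
                page, ts = entry
                # cache-hit check: pages depending on a mirror expire
                # after the mirror timeout
                if self._uses_mirror.get(index) and time.time() - ts > self._timeout:
                    del self._pages[(index, project)]
                    return None
                return page

        def put(self, index, project, page):
            with self._lock:
                self._pages[(index, project)] = (page, time.time())

        def kill_index(self, index):
            # index config changed: drop this index and all inheriting ones
            with self._lock:
                dead = {index} | self._dependents.get(index, set())
                for key in [k for k in self._pages if k[0] in dead]:
                    del self._pages[key]

        def kill_project(self, index, project):
            # projectname changed: drop its entries here and in dependents
            with self._lock:
                for ix in {index} | self._dependents.get(index, set()):
                    self._pages.pop((ix, project), None)

A config change would call kill_index(), a project change kill_project(), and the simple-page view would consult get()/put() around its normal rendering.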
Any comments, further thoughts on this? We could otherwise put this into an issue for anyone who wants to tackle it (you? :)
holger
On Wed, Nov 04, 2015 at 17:57 +0000, Erb, Stephan wrote:
Hi Holger,
I like the idea of getting rid of copy_if_mutable in some way or another.
Pyrsistent looks very promising. However, I am not sure if it is thread-safe (it doesn't look like it), so we would have to be careful here.
Best Regards, Stephan
________________________________________
From: holger krekel <hol...@merlinux.eu>
Sent: Tuesday, November 3, 2015 6:29 PM
To: Erb, Stephan
Cc: holger krekel; devp...@googlegroups.com
Subject: Re: [devpi-dev] improving concurrency, reliability of devpi-server
Hi Stephan,
On Fri, Oct 30, 2015 at 14:10 +0000, Erb, Stephan wrote:
Hi Holger,
in order not to derail this discussion any further, I have performed a brain dump in a separate ticket:
https://bitbucket.org/hpk42/devpi/issues/280/devpi-performance-issues
Thanks. FWIW I am wondering if we could avoid "copy_if_mutable" altogether. We'd need a recursive dict proxy which does what "copy_if_mutable" does, but lazily, e.g.:
d = {"a": [1,2,3], "b": set()} d2 = make_recursive_readonly_proxy(d)
d2["b"] = 3 # would give an readonly error d2["a"].append(4) # would give an readonly error "x" in d2["b"] # would be true ...
Is anybody aware of such a proxy? I found
https://pypi.python.org/pypi/dictproxyhack
but it only offers a non-recursive readonly dict interface, so the above readonly errors would not occur. It's not too hard to do an implementation which suffices for devpi-server's purposes, but if there is a ready-made solid solution we could use it.
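For illustration, a sketch of such a recursive proxy covering dicts, lists and sets, wrapping values lazily on access as in the example above (hypothetical code, not a ready-made library):

    class ReadonlyError(TypeError):
        pass

    def make_recursive_readonly_proxy(obj):
        if isinstance(obj, dict):
            return _DictProxy(obj)
        if isinstance(obj, (list, tuple)):
            return _ListProxy(obj)
        if isinstance(obj, (set, frozenset)):
            return _SetProxy(obj)
        return obj   # ints, strings etc. are immutable anyway

    class _Proxy(object):
        def __init__(self, data):
            self._data = data
        def __len__(self):
            return len(self._data)
        def __iter__(self):
            # wrap lazily, only when an item is actually accessed
            return (make_recursive_readonly_proxy(x) for x in self._data)
        def __contains__(self, item):
            return item in self._data
        def _readonly(self, *args, **kwargs):
            raise ReadonlyError("object is readonly")

    class _DictProxy(_Proxy):
        def __getitem__(self, key):
            return make_recursive_readonly_proxy(self._data[key])
        def get(self, key, default=None):
            return make_recursive_readonly_proxy(self._data.get(key, default))
        def items(self):
            return ((k, make_recursive_readonly_proxy(v))
                    for k, v in self._data.items())
        __setitem__ = __delitem__ = _Proxy._readonly
        clear = pop = popitem = setdefault = update = _Proxy._readonly

    class _ListProxy(_Proxy):
        def __getitem__(self, i):
            return make_recursive_readonly_proxy(self._data[i])
        __setitem__ = __delitem__ = _Proxy._readonly
        append = extend = insert = pop = remove = sort = reverse = _Proxy._readonly

    class _SetProxy(_Proxy):
        add = clear = discard = pop = remove = update = _Proxy._readonly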
FWIW I have also been thinking of using "pyrsistent", a well-thought-out library for working with immutable data structures:
http://pyrsistent.readthedocs.org/
It would help avoid some programming errors and accidental modifications. The basic idea is that any modifying operation returns a new reference:
    >>> map1 = pyrsistent.m(a=3, b=4)
    >>> map2 = map1.set("x", 5)
    >>> map1
    pmap({'a': 3, 'b': 4})
    >>> map2
    pmap({'a': 3, 'b': 4, 'x': 5})
There is no way to modify the map1 reference.
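Worth noting for the devpi case: pyrsistent can also convert existing nested mutable data recursively, via freeze() and thaw():

    import pyrsistent

    data = {"a": [1, 2, 3], "b": {"nested": True}}
    frozen = pyrsistent.freeze(data)   # pmap containing a pvector and a pmap
    updated = frozen.set("b", frozen["b"].set("nested", False))
    assert pyrsistent.thaw(frozen) == data              # original untouched
    assert pyrsistent.thaw(updated)["b"] == {"nested": False}

Since these structures cannot change after construction, sharing them between threads should need no locking, which would address the thread-safety concern above.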
best, holger
Best Regards, Stephan
________________________________________
From: holger krekel <hol...@merlinux.eu>
Sent: Friday, October 30, 2015 11:23 AM
To: Erb, Stephan
Cc: holger krekel; devp...@googlegroups.com
Subject: Re: [devpi-dev] improving concurrency, reliability of devpi-server
Hi Stephan,
On Fri, Oct 30, 2015 at 10:06 +0000, Erb, Stephan wrote:
Hi Holger,
what's your impression of the additional code complexity that this will introduce?
There is some increased code complexity, but I think we should be able to contain it in separately testable classes/functions.
We are currently facing another set of concurrency and performance problems in devpi. We easily have 300 package versions on the +simple page of a package. A single request takes 0.2 to 1 second. When there are multiple concurrent read requests (~10), the latency goes up significantly.
Do you have profiling data for the 300-versions-per-simple-page scenario? Is most of the time spent in get_releaselinks?
Still, this problem is manageable and we are working on a few performance patches to improve the situation. However, I fear that the large rework proposed here might make the code more complex and thus more difficult to tune.
I don't suspect the two efforts clash much. What have you done so far?
That said, we currently cache at the "list of release file links" level, and I think it's worthwhile to check whether we should rather cache at the simple-page layer. Apart from performance improvements, it also has the potential to simplify the code if we manage to cache only at the simple-page level instead of in addition to the releaselinks caching.
best, holger
Regards, Stephan
________________________________________
From: devp...@googlegroups.com <devp...@googlegroups.com> on behalf of holger krekel <hol...@merlinux.eu>
Sent: Thursday, October 29, 2015 2:23 PM
To: devpi-dev
Subject: [devpi-dev] improving concurrency, reliability of devpi-server
Hi Florian, all,
there are at least three issues that somewhat interrelate and share the common topic of service reliability, concurrency and interactions with remote pypi.python.org or devpi masters:
https://bitbucket.org/hpk42/devpi/issues/267/intermittent-assertionerror-in-...
multiple devpi-server processes write to the same (network-shared) file system, resulting in failed transaction handling. devpi-server was not designed for this.
https://bitbucket.org/hpk42/devpi/issues/274/recurring-consistency-issues-wi...
under high load, database/transaction handling issues arise (although it's unclear what the precise scenario is or how to replicate it).
https://bitbucket.org/hpk42/devpi/issues/208/pip-gets-timeout-on-large-packa...
trying to install an uncached package that originates from pypi.python.org can fail if devpi-server cannot download the package fast enough.
Starting with the last issue: we probably need to re-introduce a way to stream remote files instead of first retrieving them in full and only then starting the client response. This should take into account that two threads (or even two processes) may try to retrieve the same file. This means we start a response as soon as we get an HTTP status code and then forward-stream the content.
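A rough sketch of that "one remote request, many forward-streaming readers" idea, using requests and a condition variable (hypothetical names; error handling and cleanup of finished downloads omitted):

    import threading
    import requests

    class SharedDownload:
        """One remote GET whose chunks several client responses can follow."""
        def __init__(self, url):
            self.url = url
            self.chunks = []                 # grows while downloading
            self.done = False
            self._cond = threading.Condition()

        def fetch(self):
            # run by exactly one thread per file
            r = requests.get(self.url, stream=True)
            for chunk in r.iter_content(chunk_size=65536):
                with self._cond:
                    self.chunks.append(chunk)
                    self._cond.notify_all()
            with self._cond:
                self.done = True
                self._cond.notify_all()

        def reader(self):
            # each client response iterates one of these generators
            i = 0
            while True:
                with self._cond:
                    while i >= len(self.chunks) and not self.done:
                        self._cond.wait()
                    if i >= len(self.chunks):
                        return
                    chunk = self.chunks[i]
                i += 1
                yield chunk

    _downloads = {}
    _downloads_lock = threading.Lock()

    def stream_release_file(url):
        with _downloads_lock:
            dl = _downloads.get(url)
            if dl is None:
                dl = _downloads[url] = SharedDownload(url)
                threading.Thread(target=dl.fetch).start()
        return dl.reader()   # the WSGI layer iterates this as the body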
The first two issues could be mitigated by introducing a better read/write transaction separation. Background: GETting simple pages or release files can cause write transactions in a devpi-server process because we may need to retrieve & cache information from pypi.python.org or a devpi-server master. Currently, during the processing of the GET request, we at some point promote a READ transaction into a WRITE transaction through a call to keyfs.restart_as_write_transaction() and persist what we have. This all happens before the response is returned to the client. "Restarting as write" is somewhat brittle because something might have changed since we started our long-running request.
Strategy and notes on how to mitigate all three issues:
- release files: cache and stream chunks of what we receive remotely, all within a READ transaction and all within RAM (along the lines of the streaming sketch above). This should ideally be done in such a way that if multiple threads stream the same file, only one remote HTTP request is made to fetch it; otherwise we end up retrieving large files multiple times unnecessarily. After the HTTP response to the client is complete, we (try to) write the file to sqlite/filesystem so that subsequent requests can work from the local filesystem. Here we need to be careful and consider that there might be multiple writers/streamers. If we discover that someone else has already written where we wanted to, we can simply discard our copy.
- simple pages: first retrieve the remote simple page into RAM, process it, serve the full pyramid response, and then (try to) cache it after the response is completed. Here we probably don't need to care whether multiple threads retrieve the same simple page concurrently, because simple pages are not big.
- we cache things in RAM because even for large files it shouldn't matter, given that servers typically have multiple gigabytes of RAM, and we avoid synchronization issues with respect to the file system (see also the first issue, where multiple processes write to the file system).
- we always finish the response to the client before we attempt a write transaction. The write-transaction part should be implemented in a separate function to make it clear what state we can rely on and what we must re-check (currently we do the READ->WRITE switch in the middle of a view function); a sketch follows after this list.
- we also need to review how exactly we open the sqlite DB for writing and whether multiple processes correctly serialize their write attempts.
- care must be taken with respect to waitress and nginx configuration and their buffering; see for example: http://www.4byte.cn/question/68410/pyramid-stream-response-body.html
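Following up on the "finish the response, then write" item above, a sketch of the separated write function; the files table and the surrounding helpers are hypothetical, while BEGIN IMMEDIATE is real sqlite and makes the connection take the write lock up front, so concurrent writers serialize on it instead of failing mid-transaction:

    import sqlite3

    def persist_release_file(db_path, relpath, content):
        """Write step, called only after the client response is complete.

        Nothing from the read phase may be assumed still valid here,
        so re-check everything before writing.
        """
        # autocommit mode: we manage transactions explicitly; timeout
        # makes us wait on a locked DB instead of erroring immediately
        conn = sqlite3.connect(db_path, timeout=60, isolation_level=None)
        try:
            conn.execute("BEGIN IMMEDIATE")   # acquire the write lock now
            exists = conn.execute(
                "SELECT 1 FROM files WHERE relpath = ?", (relpath,)).fetchone()
            if exists:
                conn.rollback()   # someone else wrote it already: forget ours
                return
            conn.execute("INSERT INTO files (relpath, content) VALUES (?, ?)",
                         (relpath, content))
            conn.commit()
        finally:
            conn.close()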
any feedback or thoughts welcome.
holger
-- about me: http://holgerkrekel.net/about-me/ contracting: http://merlinux.eu
Hi Florian, Hi Holger,
thanks for your effort over the last few days! I have summarized the recent changes and improvements: https://bitbucket.org/hpk42/devpi/issues/280/devpi-performance-issues#commen...
I believe it is now important to get those improvements released and into production. Then we can make another assessment on production traffic and see what follow-up improvements are necessary (ETag-based client/proxy-side caching, internal caching structure, ...). My gut feeling tells me that an ETag-based approach should probably be sufficient, given that it is probably supported by pip.
Best Regards, Stephan
Hi Stephan,
we just released everything :) With devpi-server-2.4.0, simple-page serving for pip/easy_install is now roughly 3 times faster compared to devpi-server-2.3.1. At this point, I am not sure how much simple-page caching would really help anymore. Time for some new measurements and a fresh look. Let's see what your deployment yields and then discuss.
There are a few small things I have in my patch queue, but they only yield a 5-10 percent improvement and I didn't get them into the release.
best, holger
participants (2): Erb, Stephan; holger krekel