It is great that you are investing in keeping Devpi up to speed with increasing data sizes. However, Stephan Erb and I are not yet convinced that introducing the new cache type is the best approach to this problem. Our prime reason is that we expect it to significantly increase code complexity, while the other approaches you mention would solve the issue as well and seem like they could be built mostly using functionality already in the code base.
We also see issues with our current set-up if `root/pypi` were changed to a caching index. We specifically use Devpi to be independent of our outside connectivity, and we load balance between replicas. In this set-up, allowing replicas to get into an inconsistent state will likely cause hard-to-track-down issues in case of connectivity loss.
In addition, in our set-up we also use Devpi-to-Devpi mirroring. Thus, any approach that benefits the mirroring mechanism will also allow us to profit from it there. In contrast, moving `root/pypi` to a different mechanism adds the danger of the mirroring code becoming less battle tested and thus less reliable.
We would much prefer the changes to the replication mechanism you mentioned. Especially compaction sounds like a very useful feature. It would not only speed up recovery of replicas, but having a smaller DB state should also be beneficial to the operation of Devpi itself. In addition, as we semi-regularly prune old `.devX` releases from our instances to reduce backup sizes, this would also allow us to profit from this with our other mirror indices. Because of this, for us it would even be interesting to compact non-mirror indices. But, as the critical systems only get those packages via mirror indices, this is of less importance.
As we see it, a lot of the building blocks required for the compaction mechanism already exist in the code base. For example, we could imagine the state representation reusing parts of the export mechanism. Thus, complexity is less of an issue here. We even expect this to help with a class of errors we ran into in the past, where the code stumbled when distributions had been added and removed before the replica was connected, as those distributions would no longer be present in the compacted state. The only real caveat we see is making sure that a replica that has been disconnected for a while does not block compaction, but you have already hinted at this case in your mail, so we don't expect it to become an issue.
In case this simplifies implementation, for us it would be fully feasible to only run the compaction offline. We already take our masters offline each night to perform a filesystem snapshot for backup purposes. Thus, doing the same to perform compaction would not create an operational problem for us. It would probably even give us better control over and monitoring of the compaction process compared to some on-the-fly implementation.
So, tldr, while adding a cache index type might seem like the quicker solution, code complexity probably levels out the implementation cost of both approaches, and we see significant additional benefits in the compaction approach.
Kind Regards, Matthias
Dr. Matthias Bach Senior Software Engineer
email@example.com T +49 721 383117 6244
Blue Yonder GmbH Ohiostraße 8 76149 Karlsruhe
On 21.07.18, 11:26, "Florian Schulze" firstname.lastname@example.org wrote:
tldr: Quick solution would be changing root/pypi from a "mirror" to a "cache" which doesn't store anything in the DB and isn't replicated. The harder solution would be changes to the replication protocol and DB storage.
PyPI is growing quickly and that growth continues to affect the current way devpi handles root/pypi. We already had to make changes in the past due to the number of projects on PyPI.
First an overview of my understanding of why root/pypi works the way it does in devpi:
In the past PyPI was unreliable. It was down or slow quite often and that caused issues in day to day use. Several tools started adding mitigations for that. For example zc.buildout was one of the first tools that started caching data from PyPI locally to reduce download time and make repeated installations quicker.
Because devpi already provided the necessary data for pip and other tools, it made sense to cache PyPI data locally.
At some point replication was added to devpi. With regard to root/pypi, the thinking was that all data the master already had from PyPI should be copied to the replicas, so that all instances could provide the same view: once an installation with pip has worked, it should continue to work, regardless of which replica is used.
Now to the issues.
For replicas to be able to replicate all data easily and reliably without having to download all data when out of sync, we use serialized changesets. For the list of names from PyPI we stored that list in the DB. Problem was that the list was stored in full each time it changed. Because new projects are added all the time, that list took up quite a bit of space in the DB. We then changed it by only keeping the list in RAM.
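To make the storage cost of that old behaviour concrete, here is a minimal sketch that compares re-serializing the whole name list on every change with serializing only the newly added name. It is not devpi code: the function names and the JSON encoding are assumptions for illustration, and the "additions only" variant is just a baseline for comparison, not what devpi switched to (the list moved to RAM instead).

```python
import json


def bytes_if_full_list_stored(names):
    """Total bytes written if the whole list is re-serialized after each added name."""
    total = 0
    for count in range(1, len(names) + 1):
        total += len(json.dumps(names[:count]))
    return total


def bytes_if_only_additions_stored(names):
    """Total bytes written if each change records only the newly added name."""
    return sum(len(json.dumps(name)) for name in names)


# Hypothetical project names; real PyPI names differ in length.
names = [f"project-{i}" for i in range(1000)]
full = bytes_if_full_list_stored(names)
additions = bytes_if_only_additions_stored(names)
print(f"full list per change: {full} bytes, additions only: {additions} bytes")
```

The first total grows roughly quadratically with the number of names, which is why a list that changes constantly took up so much space in the DB.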
Now the next issue is the growing number of releases per project. We store the information from the links on the simple page of each project that was installed in the past. Whenever the project is accessed again, that information is updated if there are new releases. For a busy devpi instance that data can grow quite large.
We also store all accessed release files.
With devpi-web we added indexing for search. Here the number of projects starts to become an issue as well. A new devpi instance downloads the names of all projects on PyPI and indexes them. As of this writing, there are ~150000 names. Writing that index on the first commit takes several minutes with the Whoosh backend we currently have.
Storage also becomes an issue. Long running devpi instances have a constantly growing database and pile of package files. Currently there is no official way to clean that up, and the workarounds can cause unforeseen issues.
Now the question is where to go from here.
One idea is adding a "cache" index type, which doesn't replicate and either doesn't index at all, or only indexes currently cached data. Such an index would change behaviour of replicas, because each replica would have a different state of cached data.
Another solution would involve changing the replication protocol or at least some assumptions about it.
Currently replicas walk through all state changes of the master step by step and redo everything that happened to get to the current state. This is pretty wasteful most of the time. A new replica should get to the current state as quickly as possible. The full metadata of the current state isn't big. The biggest part is the release files. A new replica could get the full data for the current serial and then follow the individual changes from there. Fetching of the release files would work the same as it does now, except the initial list would be much bigger.
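The difference between the two bootstrap strategies can be sketched roughly as follows. This is a simplified model, not the actual replication protocol: the key/value state, the changeset format (`None` marks a deletion) and all function names are assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical changeset format: key -> new value, None marks a deletion.
Changeset = Dict[str, Optional[str]]


def apply_changeset(state: Dict[str, str], changeset: Changeset) -> None:
    """Apply one changeset to the state in place."""
    for key, value in changeset.items():
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value


def replay_from_start(changelog: List[Changeset]) -> Dict[str, str]:
    """Current behaviour in this model: redo every changeset since serial 0."""
    state: Dict[str, str] = {}
    for changeset in changelog:
        apply_changeset(state, changeset)
    return state


def bootstrap_from_snapshot(
    snapshot: Dict[str, str], snapshot_serial: int, changelog: List[Changeset]
) -> Dict[str, str]:
    """Proposed behaviour: start from the full state at snapshot_serial,
    then follow only the changesets that came after it."""
    state = dict(snapshot)
    for changeset in changelog[snapshot_serial + 1:]:
        apply_changeset(state, changeset)
    return state


# changelog[i] is the changeset that produced serial i.
changelog = [{"a": "1"}, {"b": "2"}, {"a": None}, {"c": "3"}]
snapshot_at_serial_1 = {"a": "1", "b": "2"}
assert replay_from_start(changelog) == bootstrap_from_snapshot(
    snapshot_at_serial_1, 1, changelog
)
```

Both paths end in the same state, but the second one only touches the changesets after the snapshot serial.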
Once that works, we could start removing data from old serials. This can be a triggered operation like vacuum on databases and at a later point it might be possible to automate it. The master already keeps track of connected replicas. So it's pretty easy to check what the oldest synced serial of the replicas is and remove older ones. New replicas would get the full data of the current serial in one go. If there are replicas which have been out of sync for a longer time, they would also get a full set of metadata but can keep the release files they already have and update them.
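A minimal sketch of how the cleanup boundary could be chosen, assuming the master knows the last synced serial of each replica. The names and the `max_lag` cutoff are hypothetical and not part of devpi; the cutoff models the idea that replicas which have been out of sync for a long time get a full set of metadata instead of holding back the cleanup.

```python
from typing import Dict


def oldest_synced_serial(
    replica_serials: Dict[str, int], current_serial: int, max_lag: int
) -> int:
    """Return the oldest serial still in use by a replica that is not too far behind.

    Replicas lagging more than max_lag serials are ignored here; in this model
    they would re-bootstrap from the full current state.
    """
    active = [
        serial
        for serial in replica_serials.values()
        if current_serial - serial <= max_lag
    ]
    return min(active, default=current_serial)


def prune_changelog(changelog: Dict[int, dict], synced_serial: int) -> Dict[int, dict]:
    """Keep only changesets newer than synced_serial; older ones are no longer
    needed by any replica that is still being tracked."""
    return {serial: cs for serial, cs in changelog.items() if serial > synced_serial}


# Hypothetical example: one replica has been disconnected for a long time.
replica_serials = {"replica-1": 95, "replica-2": 98, "stale-replica": 10}
changelog = {serial: {} for serial in range(101)}  # serials 0..100
boundary = oldest_synced_serial(replica_serials, current_serial=100, max_lag=50)
remaining = prune_changelog(changelog, boundary)
print(boundary, sorted(remaining))
```

With a triggered, vacuum-like operation the cutoff could simply be an argument supplied by the operator.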
We could and most likely should limit these cleanups to the mirror indexes and maybe deleted indexes.
Both solutions would solve the storage issue in different ways. The biggest problem would still be the indexing, which I will cover in another mail.
Regards, Florian Schulze