Hi!

I wrote a mail to this list at the end of 2016 about that. Unfortunately the mailman UI is currently down, so I can't link it. I'll copy it here instead:

Hi!

I was thinking about a way to let "devpi use" select a replica automatically.

This is more of a brain dump for now. My current idea would be this:

The primary server would provide a json file with a list of available replicas. When you invoke "devpi use" on the primary server, devpi-client would look for that list and then somehow select a good replica.

The hard part is the selection of the replica.

A simple solution would be to request the +api route on each replica, which is quick, and measure how long each request takes. Once all replicas have been tried, we use the one with the fastest reply. This has some obvious problems: every replica has to be tried, timeouts have to be handled, and a momentarily slow reply can skew the result. I still think this would be a nice addition, and one can always explicitly "devpi use" a certain replica. IMO "devpi use" can take up to 2-3 seconds for the replica selection without making it painful for normal use.
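
To make the timing idea a bit more concrete, here is a rough sketch in Python. It is not existing devpi-client code. The function names, the timeout and the number of attempts are all made up for illustration. It only assumes that each replica answers on its +api route, and it takes the best of a few attempts per replica to soften momentary slowness:

```python
import time
import urllib.request


def measure_api_latency(base_url, timeout=1.0):
    """Return the seconds taken to fetch <base_url>/+api, or None on failure.

    Hypothetical helper: the timeout value is an assumption, not a devpi default.
    """
    url = base_url.rstrip("/") + "/+api"
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    start = time.monotonic()
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            response.read()
    except OSError:
        # URLError, HTTPError, timeouts and connection errors all derive
        # from OSError; such a replica is treated as unavailable.
        return None
    return time.monotonic() - start


def select_fastest_replica(replica_urls, measure=measure_api_latency, attempts=2):
    """Return the URL with the lowest best-of-N latency, or None if all fail."""
    best_url, best_time = None, None
    for url in replica_urls:
        # Collect only the attempts that succeeded for this replica.
        timings = [t for t in (measure(url) for _ in range(attempts)) if t is not None]
        if not timings:
            continue
        fastest = min(timings)
        if best_time is None or fastest < best_time:
            best_url, best_time = url, fastest
    return best_url
```

Trying the replicas one after another like this can exceed the 2-3 second budget when several of them time out, so a real implementation would probably probe them concurrently or stop early.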

The mirror selection could also be done server side, by providing a dynamic replica list based on request IP or whatever.

Initially it would be easiest to provide the replica list statically via nginx, because atm the primary only knows the IP address of replicas. This is because most installations use the X-Outside-Url header instead of the --outside-url option to provide more flexibility.

We might also want to provide a way in devpi-client to know which replica belongs to a primary to share the login info. I guess the UUID would be useful for that.
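
As a sketch of what I mean by sharing the login info: the client could key stored credentials by the UUID of the primary, so that every replica of that primary finds the same entry. The class below is purely hypothetical. Nothing like it exists in devpi-client, and how the client learns the two UUIDs is left open here:

```python
class LoginStore:
    """Hypothetical credential cache keyed by the primary's UUID."""

    def __init__(self):
        self._by_primary_uuid = {}

    def save(self, primary_uuid, user, token):
        # Store one login per primary; all of its replicas share it.
        self._by_primary_uuid[primary_uuid] = {"user": user, "token": token}

    def lookup(self, server_uuid, primary_uuid=None):
        """Return the stored login for a server, or None if there is none.

        A replica passes the UUID of its primary. A primary has no
        primary_uuid, so its own UUID is used as the key.
        """
        return self._by_primary_uuid.get(primary_uuid or server_uuid)
```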

Regards,
Florian Schulze

On 19 May 2018, at 0:39, Brack, Laurent P. wrote:

Hello everyone, 

We have a devpi deployment with several replicas distributed around the world. Sometimes a developer, say in Australia (with his index set to the local replica), triggers a build which will most likely run in North America. We have seen checksum errors (due to our so-so network infrastructure) when pulling things across the WAN.

Most people use a homegrown bootstrap script which, among other things, measures the response time between the host and the known replicas and locks onto the fastest one. This has been working really well for us and has mitigated our network-related issues.

However, some people use the devpi client directly which uses whatever server it’s been configured with. So I was wondering if, perhaps through some client plugin hooks, we could integrate that feature, that is, perform latency measurements and switch to the best replica on the fly.

I am not sure if we could make a generic plugin as it would need to be aware of the replicas available in your deployment, but if we can, then we would release it (if there is an interest of course).

Anyhow, I am just fishing here, but any input/suggestions would be greatly appreciated. 

Thanks in advance.

/Laurent

