Strange errors: "While handling ... project does not exist"
We're starting to roll out our new Devpi installation to projects now that we've tested it extensively. We have one master and three mirrors. For the purposes of this email, the mirrors are load-balanced behind the domain name "devpi.example.com," and the master is available at " primary-devpi-server.example.com." We configured our Pipenv and Pip installations to use this index URL: https://devpi.example.com/org_name/stable/+simple/ That URL will use the mirrors. The "org_name/stable" index extends root/pypi and also includes our internal projects. Everything appears to be working on the client. No errors, installs are going off without a hitch. However, the Devpi server on the master machine (not the mirrors) logs a ton of errors like the following: [req78237] [Rtx14602] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... : The project internal_project_name does not exist. What is the cause of these errors? Is this something that should concern us, or something that we can ignore? "internal_project_name" installs just fine, but for some reason yields those errors on the master server. Thanks, Nick
On 2 Jan 2020, at 18:19, Nicholas Williams wrote:
We're starting to roll out our new Devpi installation to projects now that we've tested it extensively. We have one master and three mirrors. For the purposes of this email, the mirrors are load-balanced behind the domain name "devpi.example.com," and the master is available at " primary-devpi-server.example.com."
We configured our Pipenv and Pip installations to use this index URL: https://devpi.example.com/org_name/stable/+simple/
That URL will use the mirrors. The "org_name/stable" index extends root/pypi and also includes our internal projects. Everything appears to be working on the client. No errors, installs are going off without a hitch.
However, the Devpi server on the master machine (not the mirrors) logs a ton of errors like the following:
[req78237] [Rtx14602] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... : The project internal_project_name does not exist.
What is the cause of these errors? Is this something that should concern us, or something that we can ignore? "internal_project_name" installs just fine, but for some reason yields those errors on the master server.
Is there more log output with the same request number ("req78237" in this case)? Without that it is hard to say. Looking at the code this should only happen if root/pypi is accessed directly. What is the setting of ``mirror_whitelist`` in "org_name/stable"? I tried with an empty whitelist and with * and in both cases I wasn't able to provoke the error for a private package. Regards, Florian Schulze
That request had already rolled off, so I had to search for a new one. Here's this: $ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep req8603 2020-01-03 13:22:19,560 INFO: [req8603] GET /root/pypi/+simple/internal_project_name/ 2020-01-03 13:22:19,631 ERROR: [req8603] [Rtx14627] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... : I'm trying to troubleshoot that further, so I grepped for the URI, which nothing should be requesting: $ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep "/root/pypi/+simple/internal_project_name/" 2020-01-02 18:10:53,164 INFO: [req82] GET /root/pypi/+simple/internal_project_name/ 2020-01-02 18:10:53,170 ERROR: [req82] [Rtx14604] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... : "172.27.23.69" (172.27.23.69) - [02/Jan/2020:18:10:53 +0000] "GET /root/pypi/+simple/internal_project_name/ HTTP/1.1" 404 1037 "-" 0.013 "http/https/https" " primary-devpi-server.example.com/primary-devpi-server.example.com/primary-devpi-server.example.com" "80/443/443" That IP address from which that request came is the IP address of one of the mirror servers. When I grep the mirror logs for that internal project name, there is nothing indicating that it is looking in root/pypi; only org_name/stable: $ kubectl --namespace=devpi-server logs devpi-server-replica-0 | grep internal_project_name 2020-01-02 18:10:53,150 INFO: [req37] GET /org_name/stable/+simple/internal_project_name/ "172.27.36.103" (172.27.36.103) - [02/Jan/2020:18:10:53 +0000] "GET /org_name/stable/+simple/internal_project_name/ HTTP/1.1" 200 1878 "-" 0.101 "http/https/https" " devpi.example.com/devpi.example.com/devpi.example.com" "80/443/443" So our builds (locally and on Jenkins) are requesting /org_name/stable/+simple/internal_project_name/ of the mirrors, but then the mirrors are requesting /root/pypi/+simple/internal_project_name/ of the master. But they must at some point be requesting the correct URL, because the mirrors have all of the proper projects in org_name/stable, and you can install them all just fine using the mirrors. On the master, the indexes are configured as follows (org_name/stable extends org_name/upstream extends root/pypi): $ devpi index -v org_name/stable http://127.0.0.1/org_name/stable: type=stage bases=org_name/upstream volatile=False acl_upload=org_name acl_toxresult_upload=org_name mirror_whitelist= $ devpi index -v org_name/upstream http://127.0.0.1/org_name/upstream: type=stage bases=root/pypi volatile=False acl_upload=org_name acl_toxresult_upload=org_name mirror_whitelist=* The mirror/replica servers are configured like this: devpi-server: serverdir: /data/devpi-server/server-data secretfile: /data/devpi-server/server-data/.secret unix-socket: /tmp/devpi-server.sock unix-socket-perms: "0666" threads: 16 replica-max-retries: 2 request-timeout: 5 root-passwd: ****************** theme: semantic-ui role: replica master-url: https://primary-devpi-server.example.com Hope that additional information helps. Thanks, Nick On Fri, Jan 3, 2020 at 3:32 AM Florian Schulze <mail@florian-schulze.net> wrote:
On 2 Jan 2020, at 18:19, Nicholas Williams wrote:
We're starting to roll out our new Devpi installation to projects now that we've tested it extensively. We have one master and three mirrors. For the purposes of this email, the mirrors are load-balanced behind the domain name "devpi.example.com," and the master is available at " primary-devpi-server.example.com."
We configured our Pipenv and Pip installations to use this index URL: https://devpi.example.com/org_name/stable/+simple/
That URL will use the mirrors. The "org_name/stable" index extends root/pypi and also includes our internal projects. Everything appears to be working on the client. No errors, installs are going off without a hitch.
However, the Devpi server on the master machine (not the mirrors) logs a ton of errors like the following:
[req78237] [Rtx14602] while handling
https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_...
: The project internal_project_name does not exist.
What is the cause of these errors? Is this something that should concern us, or something that we can ignore? "internal_project_name" installs just fine, but for some reason yields those errors on the master server.
Is there more log output with the same request number ("req78237" in this case)? Without that it is hard to say. Looking at the code this should only happen if root/pypi is accessed directly. What is the setting of ``mirror_whitelist`` in "org_name/stable"? I tried with an empty whitelist and with * and in both cases I wasn't able to provoke the error for a private package.
Regards, Florian Schulze
This might help, too: Evidence that the mirrors ARE using org_name/stable to fetch new resources, in addition to (for some reason) looking in root/pypi: $ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep "org_name/stable" ... "172.27.23.69" (172.27.23.69) - [02/Jan/2020:18:13:16 +0000] "GET /org_name/stable/+f/bc8/552112d1152f5/a_different_project-9.6.2-py2-none-any.whl HTTP/1.1" 200 1006276 "-" 0.000 "http/https/https" " primary-devpi-server.example.com/primary-devpi-server.example.com/primary-devpi-server.example.com" "80/443/443" "172.27.32.145" (172.27.32.145) - [02/Jan/2020:18:13:17 +0000] "GET /org_name/stable/+f/bc8/552112d1152f5/a_different_project-9.6.2-py2-none-any.whl HTTP/1.1" 200 1006276 "-" 0.000 "http/https/https" " primary-devpi-server.example.com/primary-devpi-server.example.com/primary-devpi-server.example.com" "80/443/443" ... Nick On Fri, Jan 3, 2020 at 7:48 AM Nicholas Williams < nicholas@nicholaswilliams.net> wrote:
That request had already rolled off, so I had to search for a new one. Here's this:
$ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep req8603 2020-01-03 13:22:19,560 INFO: [req8603] GET /root/pypi/+simple/internal_project_name/ 2020-01-03 13:22:19,631 ERROR: [req8603] [Rtx14627] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... :
I'm trying to troubleshoot that further, so I grepped for the URI, which nothing should be requesting:
$ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep "/root/pypi/+simple/internal_project_name/" 2020-01-02 18:10:53,164 INFO: [req82] GET /root/pypi/+simple/internal_project_name/ 2020-01-02 18:10:53,170 ERROR: [req82] [Rtx14604] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... : "172.27.23.69" (172.27.23.69) - [02/Jan/2020:18:10:53 +0000] "GET /root/pypi/+simple/internal_project_name/ HTTP/1.1" 404 1037 "-" 0.013 "http/https/https" " primary-devpi-server.example.com/primary-devpi-server.example.com/primary-devpi-server.example.com" "80/443/443"
That IP address from which that request came is the IP address of one of the mirror servers. When I grep the mirror logs for that internal project name, there is nothing indicating that it is looking in root/pypi; only org_name/stable:
$ kubectl --namespace=devpi-server logs devpi-server-replica-0 | grep internal_project_name 2020-01-02 18:10:53,150 INFO: [req37] GET /org_name/stable/+simple/internal_project_name/ "172.27.36.103" (172.27.36.103) - [02/Jan/2020:18:10:53 +0000] "GET /org_name/stable/+simple/internal_project_name/ HTTP/1.1" 200 1878 "-" 0.101 "http/https/https" " devpi.example.com/devpi.example.com/devpi.example.com" "80/443/443"
So our builds (locally and on Jenkins) are requesting /org_name/stable/+simple/internal_project_name/ of the mirrors, but then the mirrors are requesting /root/pypi/+simple/internal_project_name/ of the master. But they must at some point be requesting the correct URL, because the mirrors have all of the proper projects in org_name/stable, and you can install them all just fine using the mirrors.
On the master, the indexes are configured as follows (org_name/stable extends org_name/upstream extends root/pypi):
$ devpi index -v org_name/stable http://127.0.0.1/org_name/stable: type=stage bases=org_name/upstream volatile=False acl_upload=org_name acl_toxresult_upload=org_name mirror_whitelist=
$ devpi index -v org_name/upstream http://127.0.0.1/org_name/upstream: type=stage bases=root/pypi volatile=False acl_upload=org_name acl_toxresult_upload=org_name mirror_whitelist=*
The mirror/replica servers are configured like this:
devpi-server: serverdir: /data/devpi-server/server-data secretfile: /data/devpi-server/server-data/.secret unix-socket: /tmp/devpi-server.sock unix-socket-perms: "0666" threads: 16 replica-max-retries: 2 request-timeout: 5 root-passwd: ****************** theme: semantic-ui role: replica master-url: https://primary-devpi-server.example.com
Hope that additional information helps.
Thanks,
Nick
On Fri, Jan 3, 2020 at 3:32 AM Florian Schulze <mail@florian-schulze.net> wrote:
On 2 Jan 2020, at 18:19, Nicholas Williams wrote:
We're starting to roll out our new Devpi installation to projects now that we've tested it extensively. We have one master and three mirrors. For the purposes of this email, the mirrors are load-balanced behind the domain name "devpi.example.com," and the master is available at " primary-devpi-server.example.com."
We configured our Pipenv and Pip installations to use this index URL: https://devpi.example.com/org_name/stable/+simple/
That URL will use the mirrors. The "org_name/stable" index extends root/pypi and also includes our internal projects. Everything appears to be working on the client. No errors, installs are going off without a hitch.
However, the Devpi server on the master machine (not the mirrors) logs a ton of errors like the following:
[req78237] [Rtx14602] while handling
https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_...
: The project internal_project_name does not exist.
What is the cause of these errors? Is this something that should concern us, or something that we can ignore? "internal_project_name" installs just fine, but for some reason yields those errors on the master server.
Is there more log output with the same request number ("req78237" in this case)? Without that it is hard to say. Looking at the code this should only happen if root/pypi is accessed directly. What is the setting of ``mirror_whitelist`` in "org_name/stable"? I tried with an empty whitelist and with * and in both cases I wasn't able to provoke the error for a private package.
Regards, Florian Schulze
Hi! I dug a bit deeper. This happens in devpi_server/views.py simple_list_project. We check the user agent header to do speed ups for installers like pip. When that doesn't match, additional info is added to the page. One of that is generated by get_mirror_whitelist_info. It has to check the existence of the project in the mirror to show whether the project blocks releases from the mirror or not. This is useful info, but triggers a request to the mirror, which is what you see. You should only see this when going to the simple page in a browser, installers like pip shouldn't trigger it. We only check for these strings in user agent though: distribute setuptools pip pex, so if that doesn't match get_mirror_whitelist_info will be triggered. I'm thinking about the implications of that, as it exposes internal package names to the mirror provider. Regards, Florian Schulze On 3 Jan 2020, at 14:48, Nicholas Williams wrote:
That request had already rolled off, so I had to search for a new one. Here's this:
$ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep req8603 2020-01-03 13:22:19,560 INFO: [req8603] GET /root/pypi/+simple/internal_project_name/ 2020-01-03 13:22:19,631 ERROR: [req8603] [Rtx14627] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... :
I'm trying to troubleshoot that further, so I grepped for the URI, which nothing should be requesting:
$ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep "/root/pypi/+simple/internal_project_name/" 2020-01-02 18:10:53,164 INFO: [req82] GET /root/pypi/+simple/internal_project_name/ 2020-01-02 18:10:53,170 ERROR: [req82] [Rtx14604] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... : "172.27.23.69" (172.27.23.69) - [02/Jan/2020:18:10:53 +0000] "GET /root/pypi/+simple/internal_project_name/ HTTP/1.1" 404 1037 "-" 0.013 "http/https/https" " primary-devpi-server.example.com/primary-devpi-server.example.com/primary-devpi-server.example.com" "80/443/443"
That IP address from which that request came is the IP address of one of the mirror servers. When I grep the mirror logs for that internal project name, there is nothing indicating that it is looking in root/pypi; only org_name/stable:
$ kubectl --namespace=devpi-server logs devpi-server-replica-0 | grep internal_project_name 2020-01-02 18:10:53,150 INFO: [req37] GET /org_name/stable/+simple/internal_project_name/ "172.27.36.103" (172.27.36.103) - [02/Jan/2020:18:10:53 +0000] "GET /org_name/stable/+simple/internal_project_name/ HTTP/1.1" 200 1878 "-" 0.101 "http/https/https" " devpi.example.com/devpi.example.com/devpi.example.com" "80/443/443"
So our builds (locally and on Jenkins) are requesting /org_name/stable/+simple/internal_project_name/ of the mirrors, but then the mirrors are requesting /root/pypi/+simple/internal_project_name/ of the master. But they must at some point be requesting the correct URL, because the mirrors have all of the proper projects in org_name/stable, and you can install them all just fine using the mirrors.
On the master, the indexes are configured as follows (org_name/stable extends org_name/upstream extends root/pypi):
$ devpi index -v org_name/stable http://127.0.0.1/org_name/stable: type=stage bases=org_name/upstream volatile=False acl_upload=org_name acl_toxresult_upload=org_name mirror_whitelist=
$ devpi index -v org_name/upstream http://127.0.0.1/org_name/upstream: type=stage bases=root/pypi volatile=False acl_upload=org_name acl_toxresult_upload=org_name mirror_whitelist=*
The mirror/replica servers are configured like this:
devpi-server: serverdir: /data/devpi-server/server-data secretfile: /data/devpi-server/server-data/.secret unix-socket: /tmp/devpi-server.sock unix-socket-perms: "0666" threads: 16 replica-max-retries: 2 request-timeout: 5 root-passwd: ****************** theme: semantic-ui role: replica master-url: https://primary-devpi-server.example.com
Hope that additional information helps.
Thanks,
Nick
On Fri, Jan 3, 2020 at 3:32 AM Florian Schulze <mail@florian-schulze.net> wrote:
On 2 Jan 2020, at 18:19, Nicholas Williams wrote:
We're starting to roll out our new Devpi installation to projects now that we've tested it extensively. We have one master and three mirrors. For the purposes of this email, the mirrors are load-balanced behind the domain name "devpi.example.com," and the master is available at " primary-devpi-server.example.com."
We configured our Pipenv and Pip installations to use this index URL: https://devpi.example.com/org_name/stable/+simple/
That URL will use the mirrors. The "org_name/stable" index extends root/pypi and also includes our internal projects. Everything appears to be working on the client. No errors, installs are going off without a hitch.
However, the Devpi server on the master machine (not the mirrors) logs a ton of errors like the following:
[req78237] [Rtx14602] while handling
https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_...
: The project internal_project_name does not exist.
What is the cause of these errors? Is this something that should concern us, or something that we can ignore? "internal_project_name" installs just fine, but for some reason yields those errors on the master server.
Is there more log output with the same request number ("req78237" in this case)? Without that it is hard to say. Looking at the code this should only happen if root/pypi is accessed directly. What is the setting of ``mirror_whitelist`` in "org_name/stable"? I tried with an empty whitelist and with * and in both cases I wasn't able to provoke the error for a private package.
Regards, Florian Schulze
I see a few options here, which aren't mutually exclusive: - Fix the user agent matching, because it appears broken somehow. The user agent header in the logs is as follows: pip/19.3.1 {\x22ci\x22:null,\x22cpu\x22:\x22x86_64\x22,\x22distro\x22:{\x22name\x22:\x22macOS\x22,\x22version\x22:\x2210.13.6\x22},\x22implementation\x22:{\x22name\x22:\x22CPython\x22,\x22version\x22:\x223.7.4\x22},\x22installer\x22:{\x22name\x22:\x22pip\x22,\x22version\x22:\x2219.3.1\x22},\x22openssl_version\x22:\x22OpenSSL 1.0.2s 28 May 2019\x22,\x22python\x22:\x223.7.4\x22,\x22setuptools_version\x22:\x2244.0.0\x22,\x22system\x22:{\x22name\x22:\x22Darwin\x22,\x22release\x22:\x2217.7.0\x22}} But Devpi is still exhibiting this behavior, so somehow it's not matching on that user agent header properly. - Provide a setting to disable this behavior (a strong preference of mine), partly to prevent these errors, but also because (as you said) it exposes internal package names to the mirror provider. - Provide a way to disable logging an error when a request is made to a project's simple index that does not exist. I would disable it on the master (because I don't care if the mirrors make this request) but leave it enabled on the mirrors (because such errors would indicate a client error). On Wed, Jan 8, 2020 at 4:04 AM Florian Schulze <mail@florian-schulze.net> wrote:
Hi!
I dug a bit deeper.
This happens in devpi_server/views.py simple_list_project. We check the user agent header to do speed ups for installers like pip. When that doesn't match, additional info is added to the page. One of that is generated by get_mirror_whitelist_info. It has to check the existence of the project in the mirror to show whether the project blocks releases from the mirror or not. This is useful info, but triggers a request to the mirror, which is what you see. You should only see this when going to the simple page in a browser, installers like pip shouldn't trigger it. We only check for these strings in user agent though: distribute setuptools pip pex, so if that doesn't match get_mirror_whitelist_info will be triggered.
I'm thinking about the implications of that, as it exposes internal package names to the mirror provider.
Regards, Florian Schulze
On 3 Jan 2020, at 14:48, Nicholas Williams wrote:
That request had already rolled off, so I had to search for a new one. Here's this:
$ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep req8603 2020-01-03 13:22:19,560 INFO: [req8603] GET /root/pypi/+simple/internal_project_name/ 2020-01-03 13:22:19,631 ERROR: [req8603] [Rtx14627] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... :
I'm trying to troubleshoot that further, so I grepped for the URI, which nothing should be requesting:
$ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep "/root/pypi/+simple/internal_project_name/" 2020-01-02 18:10:53,164 INFO: [req82] GET /root/pypi/+simple/internal_project_name/ 2020-01-02 18:10:53,170 ERROR: [req82] [Rtx14604] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... : "172.27.23.69" (172.27.23.69) - [02/Jan/2020:18:10:53 +0000] "GET /root/pypi/+simple/internal_project_name/ HTTP/1.1" 404 1037 "-" 0.013 "http/https/https" " primary-devpi-server.example.com/primary-devpi-server.example.com/primary-devpi-server.example.com" "80/443/443"
That IP address from which that request came is the IP address of one of the mirror servers. When I grep the mirror logs for that internal project name, there is nothing indicating that it is looking in root/pypi; only org_name/stable:
$ kubectl --namespace=devpi-server logs devpi-server-replica-0 | grep internal_project_name 2020-01-02 18:10:53,150 INFO: [req37] GET /org_name/stable/+simple/internal_project_name/ "172.27.36.103" (172.27.36.103) - [02/Jan/2020:18:10:53 +0000] "GET /org_name/stable/+simple/internal_project_name/ HTTP/1.1" 200 1878 "-" 0.101 "http/https/https" " devpi.example.com/devpi.example.com/devpi.example.com" "80/443/443"
So our builds (locally and on Jenkins) are requesting /org_name/stable/+simple/internal_project_name/ of the mirrors, but then the mirrors are requesting /root/pypi/+simple/internal_project_name/ of the master. But they must at some point be requesting the correct URL, because the mirrors have all of the proper projects in org_name/stable, and you can install them all just fine using the mirrors.
On the master, the indexes are configured as follows (org_name/stable extends org_name/upstream extends root/pypi):
$ devpi index -v org_name/stable http://127.0.0.1/org_name/stable: type=stage bases=org_name/upstream volatile=False acl_upload=org_name acl_toxresult_upload=org_name mirror_whitelist=
$ devpi index -v org_name/upstream http://127.0.0.1/org_name/upstream: type=stage bases=root/pypi volatile=False acl_upload=org_name acl_toxresult_upload=org_name mirror_whitelist=*
The mirror/replica servers are configured like this:
devpi-server: serverdir: /data/devpi-server/server-data secretfile: /data/devpi-server/server-data/.secret unix-socket: /tmp/devpi-server.sock unix-socket-perms: "0666" threads: 16 replica-max-retries: 2 request-timeout: 5 root-passwd: ****************** theme: semantic-ui role: replica master-url: https://primary-devpi-server.example.com
Hope that additional information helps.
Thanks,
Nick
On Fri, Jan 3, 2020 at 3:32 AM Florian Schulze <mail@florian-schulze.net> wrote:
On 2 Jan 2020, at 18:19, Nicholas Williams wrote:
We're starting to roll out our new Devpi installation to projects now that we've tested it extensively. We have one master and three mirrors. For the purposes of this email, the mirrors are load-balanced behind the domain name "devpi.example.com," and the master is available at " primary-devpi-server.example.com."
We configured our Pipenv and Pip installations to use this index URL: https://devpi.example.com/org_name/stable/+simple/
That URL will use the mirrors. The "org_name/stable" index extends root/pypi and also includes our internal projects. Everything appears to be working on the client. No errors, installs are going off without a hitch.
However, the Devpi server on the master machine (not the mirrors) logs a ton of errors like the following:
[req78237] [Rtx14602] while handling
https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_...
: The project internal_project_name does not exist.
What is the cause of these errors? Is this something that should concern us, or something that we can ignore? "internal_project_name" installs just fine, but for some reason yields those errors on the master server.
Is there more log output with the same request number ("req78237" in this case)? Without that it is hard to say. Looking at the code this should only happen if root/pypi is accessed directly. What is the setting of ``mirror_whitelist`` in "org_name/stable"? I tried with an empty whitelist and with * and in both cases I wasn't able to provoke the error for a private package.
Regards, Florian Schulze
To be clear on that first point, that user agent was in the Nginx logs, and Nginx is configured to pass all headers on to Devpi, so that should also be the user agent that Devpi gets. Also, I just re-confirmed that if I do something as simple as `pip install -U internal_project`, I get that error logged on the master. Also, if I go to https://devpi.example.com/eventbrite/stable/+simple/another_internal_project... in my browser, I get that error logged again (for the different project) on the master. So, for whatever reason, the user agent matching is definitely not working. Thanks, Nick On Wed, Jan 8, 2020 at 6:08 AM Nicholas Williams < nicholas@nicholaswilliams.net> wrote:
I see a few options here, which aren't mutually exclusive:
- Fix the user agent matching, because it appears broken somehow. The user agent header in the logs is as follows:
pip/19.3.1 {\x22ci\x22:null,\x22cpu\x22:\x22x86_64\x22,\x22distro\x22:{\x22name\x22:\x22macOS\x22,\x22version\x22:\x2210.13.6\x22},\x22implementation\x22:{\x22name\x22:\x22CPython\x22,\x22version\x22:\x223.7.4\x22},\x22installer\x22:{\x22name\x22:\x22pip\x22,\x22version\x22:\x2219.3.1\x22},\x22openssl_version\x22:\x22OpenSSL 1.0.2s 28 May 2019\x22,\x22python\x22:\x223.7.4\x22,\x22setuptools_version\x22:\x2244.0.0\x22,\x22system\x22:{\x22name\x22:\x22Darwin\x22,\x22release\x22:\x2217.7.0\x22}}
But Devpi is still exhibiting this behavior, so somehow it's not matching on that user agent header properly.
- Provide a setting to disable this behavior (a strong preference of mine), partly to prevent these errors, but also because (as you said) it exposes internal package names to the mirror provider.
- Provide a way to disable logging an error when a request is made to a project's simple index that does not exist. I would disable it on the master (because I don't care if the mirrors make this request) but leave it enabled on the mirrors (because such errors would indicate a client error).
On Wed, Jan 8, 2020 at 4:04 AM Florian Schulze <mail@florian-schulze.net> wrote:
Hi!
I dug a bit deeper.
This happens in devpi_server/views.py simple_list_project. We check the user agent header to do speed ups for installers like pip. When that doesn't match, additional info is added to the page. One of that is generated by get_mirror_whitelist_info. It has to check the existence of the project in the mirror to show whether the project blocks releases from the mirror or not. This is useful info, but triggers a request to the mirror, which is what you see. You should only see this when going to the simple page in a browser, installers like pip shouldn't trigger it. We only check for these strings in user agent though: distribute setuptools pip pex, so if that doesn't match get_mirror_whitelist_info will be triggered.
I'm thinking about the implications of that, as it exposes internal package names to the mirror provider.
Regards, Florian Schulze
On 3 Jan 2020, at 14:48, Nicholas Williams wrote:
That request had already rolled off, so I had to search for a new one. Here's this:
$ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep req8603 2020-01-03 13:22:19,560 INFO: [req8603] GET /root/pypi/+simple/internal_project_name/ 2020-01-03 13:22:19,631 ERROR: [req8603] [Rtx14627] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... :
I'm trying to troubleshoot that further, so I grepped for the URI, which nothing should be requesting:
$ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep "/root/pypi/+simple/internal_project_name/" 2020-01-02 18:10:53,164 INFO: [req82] GET /root/pypi/+simple/internal_project_name/ 2020-01-02 18:10:53,170 ERROR: [req82] [Rtx14604] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... : "172.27.23.69" (172.27.23.69) - [02/Jan/2020:18:10:53 +0000] "GET /root/pypi/+simple/internal_project_name/ HTTP/1.1" 404 1037 "-" 0.013 "http/https/https" " primary-devpi-server.example.com/primary-devpi-server.example.com/primary-devpi-server.example.com" "80/443/443"
That IP address from which that request came is the IP address of one of the mirror servers. When I grep the mirror logs for that internal project name, there is nothing indicating that it is looking in root/pypi; only org_name/stable:
$ kubectl --namespace=devpi-server logs devpi-server-replica-0 | grep internal_project_name 2020-01-02 18:10:53,150 INFO: [req37] GET /org_name/stable/+simple/internal_project_name/ "172.27.36.103" (172.27.36.103) - [02/Jan/2020:18:10:53 +0000] "GET /org_name/stable/+simple/internal_project_name/ HTTP/1.1" 200 1878 "-" 0.101 "http/https/https" " devpi.example.com/devpi.example.com/devpi.example.com" "80/443/443"
So our builds (locally and on Jenkins) are requesting /org_name/stable/+simple/internal_project_name/ of the mirrors, but then the mirrors are requesting /root/pypi/+simple/internal_project_name/ of the master. But they must at some point be requesting the correct URL, because the mirrors have all of the proper projects in org_name/stable, and you can install them all just fine using the mirrors.
On the master, the indexes are configured as follows (org_name/stable extends org_name/upstream extends root/pypi):
$ devpi index -v org_name/stable http://127.0.0.1/org_name/stable: type=stage bases=org_name/upstream volatile=False acl_upload=org_name acl_toxresult_upload=org_name mirror_whitelist=
$ devpi index -v org_name/upstream http://127.0.0.1/org_name/upstream: type=stage bases=root/pypi volatile=False acl_upload=org_name acl_toxresult_upload=org_name mirror_whitelist=*
The mirror/replica servers are configured like this:
devpi-server: serverdir: /data/devpi-server/server-data secretfile: /data/devpi-server/server-data/.secret unix-socket: /tmp/devpi-server.sock unix-socket-perms: "0666" threads: 16 replica-max-retries: 2 request-timeout: 5 root-passwd: ****************** theme: semantic-ui role: replica master-url: https://primary-devpi-server.example.com
Hope that additional information helps.
Thanks,
Nick
On Fri, Jan 3, 2020 at 3:32 AM Florian Schulze <mail@florian-schulze.net> wrote:
On 2 Jan 2020, at 18:19, Nicholas Williams wrote:
We're starting to roll out our new Devpi installation to projects now that we've tested it extensively. We have one master and three mirrors. For the purposes of this email, the mirrors are load-balanced behind the domain name "devpi.example.com," and the master is available at " primary-devpi-server.example.com."
We configured our Pipenv and Pip installations to use this index URL: https://devpi.example.com/org_name/stable/+simple/
That URL will use the mirrors. The "org_name/stable" index extends root/pypi and also includes our internal projects. Everything appears to be working on the client. No errors, installs are going off without a hitch.
However, the Devpi server on the master machine (not the mirrors) logs a ton of errors like the following:
[req78237] [Rtx14602] while handling
https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_...
: The project internal_project_name does not exist.
What is the cause of these errors? Is this something that should concern us, or something that we can ignore? "internal_project_name" installs just fine, but for some reason yields those errors on the master server.
Is there more log output with the same request number ("req78237" in this case)? Without that it is hard to say. Looking at the code this should only happen if root/pypi is accessed directly. What is the setting of ``mirror_whitelist`` in "org_name/stable"? I tried with an empty whitelist and with * and in both cases I wasn't able to provoke the error for a private package.
Regards, Florian Schulze
participants (2)
-
Florian Schulze
-
Nicholas Williams