Strange errors: "While handling ... project does not exist"
We're starting to roll out our new Devpi installation to projects now that we've tested it extensively. We have one master and three mirrors. For the purposes of this email, the mirrors are load-balanced behind the domain name "devpi.example.com," and the master is available at " primary-devpi-server.example.com." We configured our Pipenv and Pip installations to use this index URL: https://devpi.example.com/org_name/stable/+simple/ That URL will use the mirrors. The "org_name/stable" index extends root/pypi and also includes our internal projects. Everything appears to be working on the client. No errors, installs are going off without a hitch. However, the Devpi server on the master machine (not the mirrors) logs a ton of errors like the following: [req78237] [Rtx14602] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... : The project internal_project_name does not exist. What is the cause of these errors? Is this something that should concern us, or something that we can ignore? "internal_project_name" installs just fine, but for some reason yields those errors on the master server. Thanks, Nick
On 2 Jan 2020, at 18:19, Nicholas Williams wrote:
We're starting to roll out our new Devpi installation to projects now that we've tested it extensively. We have one master and three mirrors. For the purposes of this email, the mirrors are load-balanced behind the domain name "devpi.example.com," and the master is available at " primary-devpi-server.example.com."
We configured our Pipenv and Pip installations to use this index URL: https://devpi.example.com/org_name/stable/+simple/
That URL will use the mirrors. The "org_name/stable" index extends root/pypi and also includes our internal projects. Everything appears to be working on the client. No errors, installs are going off without a hitch.
However, the Devpi server on the master machine (not the mirrors) logs a ton of errors like the following:
[req78237] [Rtx14602] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... : The project internal_project_name does not exist.
What is the cause of these errors? Is this something that should concern us, or something that we can ignore? "internal_project_name" installs just fine, but for some reason yields those errors on the master server.
Is there more log output with the same request number ("req78237" in this case)? Without that it is hard to say. Looking at the code this should only happen if root/pypi is accessed directly. What is the setting of ``mirror_whitelist`` in "org_name/stable"? I tried with an empty whitelist and with * and in both cases I wasn't able to provoke the error for a private package. Regards, Florian Schulze
That request had already rolled off, so I had to search for a new one.
Here's this:
$ kubectl --namespace=devpi-server logs
devpi-server-primary-85d6484cb7-hgqpp | grep req8603
2020-01-03 13:22:19,560 INFO: [req8603] GET
/root/pypi/+simple/internal_project_name/
2020-01-03 13:22:19,631 ERROR: [req8603] [Rtx14627] while handling
https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_...
:
I'm trying to troubleshoot that further, so I grepped for the URI, which
nothing should be requesting:
$ kubectl --namespace=devpi-server logs
devpi-server-primary-85d6484cb7-hgqpp | grep
"/root/pypi/+simple/internal_project_name/"
2020-01-02 18:10:53,164 INFO: [req82] GET
/root/pypi/+simple/internal_project_name/
2020-01-02 18:10:53,170 ERROR: [req82] [Rtx14604] while handling
https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_...
:
"172.27.23.69" (172.27.23.69) - [02/Jan/2020:18:10:53 +0000] "GET
/root/pypi/+simple/internal_project_name/ HTTP/1.1" 404 1037 "-" 0.013
"http/https/https" "
primary-devpi-server.example.com/primary-devpi-server.example.com/primary-devpi-server.example.com"
"80/443/443"
That IP address from which that request came is the IP address of one of
the mirror servers. When I grep the mirror logs for that internal project
name, there is nothing indicating that it is looking in root/pypi; only
org_name/stable:
$ kubectl --namespace=devpi-server logs devpi-server-replica-0 | grep
internal_project_name
2020-01-02 18:10:53,150 INFO: [req37] GET
/org_name/stable/+simple/internal_project_name/
"172.27.36.103" (172.27.36.103) - [02/Jan/2020:18:10:53 +0000] "GET
/org_name/stable/+simple/internal_project_name/ HTTP/1.1" 200 1878 "-"
0.101 "http/https/https" "
devpi.example.com/devpi.example.com/devpi.example.com" "80/443/443"
So our builds (locally and on Jenkins) are
requesting /org_name/stable/+simple/internal_project_name/ of the mirrors,
but then the mirrors are
requesting /root/pypi/+simple/internal_project_name/ of the master. But
they must at some point be requesting the correct URL, because the mirrors
have all of the proper projects in org_name/stable, and you can install
them all just fine using the mirrors.
On the master, the indexes are configured as follows (org_name/stable
extends org_name/upstream extends root/pypi):
$ devpi index -v org_name/stable
http://127.0.0.1/org_name/stable:
type=stage
bases=org_name/upstream
volatile=False
acl_upload=org_name
acl_toxresult_upload=org_name
mirror_whitelist=
$ devpi index -v org_name/upstream
http://127.0.0.1/org_name/upstream:
type=stage
bases=root/pypi
volatile=False
acl_upload=org_name
acl_toxresult_upload=org_name
mirror_whitelist=*
The mirror/replica servers are configured like this:
devpi-server:
serverdir: /data/devpi-server/server-data
secretfile: /data/devpi-server/server-data/.secret
unix-socket: /tmp/devpi-server.sock
unix-socket-perms: "0666"
threads: 16
replica-max-retries: 2
request-timeout: 5
root-passwd: ******************
theme: semantic-ui
role: replica
master-url: https://primary-devpi-server.example.com
Hope that additional information helps.
Thanks,
Nick
On Fri, Jan 3, 2020 at 3:32 AM Florian Schulze
On 2 Jan 2020, at 18:19, Nicholas Williams wrote:
We're starting to roll out our new Devpi installation to projects now that we've tested it extensively. We have one master and three mirrors. For the purposes of this email, the mirrors are load-balanced behind the domain name "devpi.example.com," and the master is available at " primary-devpi-server.example.com."
We configured our Pipenv and Pip installations to use this index URL: https://devpi.example.com/org_name/stable/+simple/
That URL will use the mirrors. The "org_name/stable" index extends root/pypi and also includes our internal projects. Everything appears to be working on the client. No errors, installs are going off without a hitch.
However, the Devpi server on the master machine (not the mirrors) logs a ton of errors like the following:
[req78237] [Rtx14602] while handling
https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_...
: The project internal_project_name does not exist.
What is the cause of these errors? Is this something that should concern us, or something that we can ignore? "internal_project_name" installs just fine, but for some reason yields those errors on the master server.
Is there more log output with the same request number ("req78237" in this case)? Without that it is hard to say. Looking at the code this should only happen if root/pypi is accessed directly. What is the setting of ``mirror_whitelist`` in "org_name/stable"? I tried with an empty whitelist and with * and in both cases I wasn't able to provoke the error for a private package.
Regards, Florian Schulze
This might help, too: Evidence that the mirrors ARE using org_name/stable to fetch new resources, in addition to (for some reason) looking in root/pypi: $ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep "org_name/stable" ... "172.27.23.69" (172.27.23.69) - [02/Jan/2020:18:13:16 +0000] "GET /org_name/stable/+f/bc8/552112d1152f5/a_different_project-9.6.2-py2-none-any.whl HTTP/1.1" 200 1006276 "-" 0.000 "http/https/https" " primary-devpi-server.example.com/primary-devpi-server.example.com/primary-devpi-server.example.com" "80/443/443" "172.27.32.145" (172.27.32.145) - [02/Jan/2020:18:13:17 +0000] "GET /org_name/stable/+f/bc8/552112d1152f5/a_different_project-9.6.2-py2-none-any.whl HTTP/1.1" 200 1006276 "-" 0.000 "http/https/https" " primary-devpi-server.example.com/primary-devpi-server.example.com/primary-devpi-server.example.com" "80/443/443" ... Nick On Fri, Jan 3, 2020 at 7:48 AM Nicholas Williams < nicholas@nicholaswilliams.net> wrote:
That request had already rolled off, so I had to search for a new one. Here's this:
$ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep req8603 2020-01-03 13:22:19,560 INFO: [req8603] GET /root/pypi/+simple/internal_project_name/ 2020-01-03 13:22:19,631 ERROR: [req8603] [Rtx14627] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... :
I'm trying to troubleshoot that further, so I grepped for the URI, which nothing should be requesting:
$ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep "/root/pypi/+simple/internal_project_name/" 2020-01-02 18:10:53,164 INFO: [req82] GET /root/pypi/+simple/internal_project_name/ 2020-01-02 18:10:53,170 ERROR: [req82] [Rtx14604] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... : "172.27.23.69" (172.27.23.69) - [02/Jan/2020:18:10:53 +0000] "GET /root/pypi/+simple/internal_project_name/ HTTP/1.1" 404 1037 "-" 0.013 "http/https/https" " primary-devpi-server.example.com/primary-devpi-server.example.com/primary-devpi-server.example.com" "80/443/443"
That IP address from which that request came is the IP address of one of the mirror servers. When I grep the mirror logs for that internal project name, there is nothing indicating that it is looking in root/pypi; only org_name/stable:
$ kubectl --namespace=devpi-server logs devpi-server-replica-0 | grep internal_project_name 2020-01-02 18:10:53,150 INFO: [req37] GET /org_name/stable/+simple/internal_project_name/ "172.27.36.103" (172.27.36.103) - [02/Jan/2020:18:10:53 +0000] "GET /org_name/stable/+simple/internal_project_name/ HTTP/1.1" 200 1878 "-" 0.101 "http/https/https" " devpi.example.com/devpi.example.com/devpi.example.com" "80/443/443"
So our builds (locally and on Jenkins) are requesting /org_name/stable/+simple/internal_project_name/ of the mirrors, but then the mirrors are requesting /root/pypi/+simple/internal_project_name/ of the master. But they must at some point be requesting the correct URL, because the mirrors have all of the proper projects in org_name/stable, and you can install them all just fine using the mirrors.
On the master, the indexes are configured as follows (org_name/stable extends org_name/upstream extends root/pypi):
$ devpi index -v org_name/stable http://127.0.0.1/org_name/stable: type=stage bases=org_name/upstream volatile=False acl_upload=org_name acl_toxresult_upload=org_name mirror_whitelist=
$ devpi index -v org_name/upstream http://127.0.0.1/org_name/upstream: type=stage bases=root/pypi volatile=False acl_upload=org_name acl_toxresult_upload=org_name mirror_whitelist=*
The mirror/replica servers are configured like this:
devpi-server: serverdir: /data/devpi-server/server-data secretfile: /data/devpi-server/server-data/.secret unix-socket: /tmp/devpi-server.sock unix-socket-perms: "0666" threads: 16 replica-max-retries: 2 request-timeout: 5 root-passwd: ****************** theme: semantic-ui role: replica master-url: https://primary-devpi-server.example.com
Hope that additional information helps.
Thanks,
Nick
On Fri, Jan 3, 2020 at 3:32 AM Florian Schulze
wrote: On 2 Jan 2020, at 18:19, Nicholas Williams wrote:
We're starting to roll out our new Devpi installation to projects now that we've tested it extensively. We have one master and three mirrors. For the purposes of this email, the mirrors are load-balanced behind the domain name "devpi.example.com," and the master is available at " primary-devpi-server.example.com."
We configured our Pipenv and Pip installations to use this index URL: https://devpi.example.com/org_name/stable/+simple/
That URL will use the mirrors. The "org_name/stable" index extends root/pypi and also includes our internal projects. Everything appears to be working on the client. No errors, installs are going off without a hitch.
However, the Devpi server on the master machine (not the mirrors) logs a ton of errors like the following:
[req78237] [Rtx14602] while handling
https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_...
: The project internal_project_name does not exist.
What is the cause of these errors? Is this something that should concern us, or something that we can ignore? "internal_project_name" installs just fine, but for some reason yields those errors on the master server.
Is there more log output with the same request number ("req78237" in this case)? Without that it is hard to say. Looking at the code this should only happen if root/pypi is accessed directly. What is the setting of ``mirror_whitelist`` in "org_name/stable"? I tried with an empty whitelist and with * and in both cases I wasn't able to provoke the error for a private package.
Regards, Florian Schulze
Hi! I dug a bit deeper. This happens in devpi_server/views.py simple_list_project. We check the user agent header to do speed ups for installers like pip. When that doesn't match, additional info is added to the page. One of that is generated by get_mirror_whitelist_info. It has to check the existence of the project in the mirror to show whether the project blocks releases from the mirror or not. This is useful info, but triggers a request to the mirror, which is what you see. You should only see this when going to the simple page in a browser, installers like pip shouldn't trigger it. We only check for these strings in user agent though: distribute setuptools pip pex, so if that doesn't match get_mirror_whitelist_info will be triggered. I'm thinking about the implications of that, as it exposes internal package names to the mirror provider. Regards, Florian Schulze On 3 Jan 2020, at 14:48, Nicholas Williams wrote:
That request had already rolled off, so I had to search for a new one. Here's this:
$ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep req8603 2020-01-03 13:22:19,560 INFO: [req8603] GET /root/pypi/+simple/internal_project_name/ 2020-01-03 13:22:19,631 ERROR: [req8603] [Rtx14627] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... :
I'm trying to troubleshoot that further, so I grepped for the URI, which nothing should be requesting:
$ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep "/root/pypi/+simple/internal_project_name/" 2020-01-02 18:10:53,164 INFO: [req82] GET /root/pypi/+simple/internal_project_name/ 2020-01-02 18:10:53,170 ERROR: [req82] [Rtx14604] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... : "172.27.23.69" (172.27.23.69) - [02/Jan/2020:18:10:53 +0000] "GET /root/pypi/+simple/internal_project_name/ HTTP/1.1" 404 1037 "-" 0.013 "http/https/https" " primary-devpi-server.example.com/primary-devpi-server.example.com/primary-devpi-server.example.com" "80/443/443"
That IP address from which that request came is the IP address of one of the mirror servers. When I grep the mirror logs for that internal project name, there is nothing indicating that it is looking in root/pypi; only org_name/stable:
$ kubectl --namespace=devpi-server logs devpi-server-replica-0 | grep internal_project_name 2020-01-02 18:10:53,150 INFO: [req37] GET /org_name/stable/+simple/internal_project_name/ "172.27.36.103" (172.27.36.103) - [02/Jan/2020:18:10:53 +0000] "GET /org_name/stable/+simple/internal_project_name/ HTTP/1.1" 200 1878 "-" 0.101 "http/https/https" " devpi.example.com/devpi.example.com/devpi.example.com" "80/443/443"
So our builds (locally and on Jenkins) are requesting /org_name/stable/+simple/internal_project_name/ of the mirrors, but then the mirrors are requesting /root/pypi/+simple/internal_project_name/ of the master. But they must at some point be requesting the correct URL, because the mirrors have all of the proper projects in org_name/stable, and you can install them all just fine using the mirrors.
On the master, the indexes are configured as follows (org_name/stable extends org_name/upstream extends root/pypi):
$ devpi index -v org_name/stable http://127.0.0.1/org_name/stable: type=stage bases=org_name/upstream volatile=False acl_upload=org_name acl_toxresult_upload=org_name mirror_whitelist=
$ devpi index -v org_name/upstream http://127.0.0.1/org_name/upstream: type=stage bases=root/pypi volatile=False acl_upload=org_name acl_toxresult_upload=org_name mirror_whitelist=*
The mirror/replica servers are configured like this:
devpi-server: serverdir: /data/devpi-server/server-data secretfile: /data/devpi-server/server-data/.secret unix-socket: /tmp/devpi-server.sock unix-socket-perms: "0666" threads: 16 replica-max-retries: 2 request-timeout: 5 root-passwd: ****************** theme: semantic-ui role: replica master-url: https://primary-devpi-server.example.com
Hope that additional information helps.
Thanks,
Nick
On Fri, Jan 3, 2020 at 3:32 AM Florian Schulze
wrote: On 2 Jan 2020, at 18:19, Nicholas Williams wrote:
We're starting to roll out our new Devpi installation to projects now that we've tested it extensively. We have one master and three mirrors. For the purposes of this email, the mirrors are load-balanced behind the domain name "devpi.example.com," and the master is available at " primary-devpi-server.example.com."
We configured our Pipenv and Pip installations to use this index URL: https://devpi.example.com/org_name/stable/+simple/
That URL will use the mirrors. The "org_name/stable" index extends root/pypi and also includes our internal projects. Everything appears to be working on the client. No errors, installs are going off without a hitch.
However, the Devpi server on the master machine (not the mirrors) logs a ton of errors like the following:
[req78237] [Rtx14602] while handling
https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_...
: The project internal_project_name does not exist.
What is the cause of these errors? Is this something that should concern us, or something that we can ignore? "internal_project_name" installs just fine, but for some reason yields those errors on the master server.
Is there more log output with the same request number ("req78237" in this case)? Without that it is hard to say. Looking at the code this should only happen if root/pypi is accessed directly. What is the setting of ``mirror_whitelist`` in "org_name/stable"? I tried with an empty whitelist and with * and in both cases I wasn't able to provoke the error for a private package.
Regards, Florian Schulze
I see a few options here, which aren't mutually exclusive:
- Fix the user agent matching, because it appears broken somehow. The user
agent header in the logs is as follows:
pip/19.3.1
{\x22ci\x22:null,\x22cpu\x22:\x22x86_64\x22,\x22distro\x22:{\x22name\x22:\x22macOS\x22,\x22version\x22:\x2210.13.6\x22},\x22implementation\x22:{\x22name\x22:\x22CPython\x22,\x22version\x22:\x223.7.4\x22},\x22installer\x22:{\x22name\x22:\x22pip\x22,\x22version\x22:\x2219.3.1\x22},\x22openssl_version\x22:\x22OpenSSL
1.0.2s 28 May
2019\x22,\x22python\x22:\x223.7.4\x22,\x22setuptools_version\x22:\x2244.0.0\x22,\x22system\x22:{\x22name\x22:\x22Darwin\x22,\x22release\x22:\x2217.7.0\x22}}
But Devpi is still exhibiting this behavior, so somehow it's not matching
on that user agent header properly.
- Provide a setting to disable this behavior (a strong preference of mine),
partly to prevent these errors, but also because (as you said) it exposes
internal package names to the mirror provider.
- Provide a way to disable logging an error when a request is made to a
project's simple index that does not exist. I would disable it on the
master (because I don't care if the mirrors make this request) but leave it
enabled on the mirrors (because such errors would indicate a client error).
On Wed, Jan 8, 2020 at 4:04 AM Florian Schulze
Hi!
I dug a bit deeper.
This happens in devpi_server/views.py simple_list_project. We check the user agent header to do speed ups for installers like pip. When that doesn't match, additional info is added to the page. One of that is generated by get_mirror_whitelist_info. It has to check the existence of the project in the mirror to show whether the project blocks releases from the mirror or not. This is useful info, but triggers a request to the mirror, which is what you see. You should only see this when going to the simple page in a browser, installers like pip shouldn't trigger it. We only check for these strings in user agent though: distribute setuptools pip pex, so if that doesn't match get_mirror_whitelist_info will be triggered.
I'm thinking about the implications of that, as it exposes internal package names to the mirror provider.
Regards, Florian Schulze
On 3 Jan 2020, at 14:48, Nicholas Williams wrote:
That request had already rolled off, so I had to search for a new one. Here's this:
$ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep req8603 2020-01-03 13:22:19,560 INFO: [req8603] GET /root/pypi/+simple/internal_project_name/ 2020-01-03 13:22:19,631 ERROR: [req8603] [Rtx14627] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... :
I'm trying to troubleshoot that further, so I grepped for the URI, which nothing should be requesting:
$ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep "/root/pypi/+simple/internal_project_name/" 2020-01-02 18:10:53,164 INFO: [req82] GET /root/pypi/+simple/internal_project_name/ 2020-01-02 18:10:53,170 ERROR: [req82] [Rtx14604] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... : "172.27.23.69" (172.27.23.69) - [02/Jan/2020:18:10:53 +0000] "GET /root/pypi/+simple/internal_project_name/ HTTP/1.1" 404 1037 "-" 0.013 "http/https/https" " primary-devpi-server.example.com/primary-devpi-server.example.com/primary-devpi-server.example.com" "80/443/443"
That IP address from which that request came is the IP address of one of the mirror servers. When I grep the mirror logs for that internal project name, there is nothing indicating that it is looking in root/pypi; only org_name/stable:
$ kubectl --namespace=devpi-server logs devpi-server-replica-0 | grep internal_project_name 2020-01-02 18:10:53,150 INFO: [req37] GET /org_name/stable/+simple/internal_project_name/ "172.27.36.103" (172.27.36.103) - [02/Jan/2020:18:10:53 +0000] "GET /org_name/stable/+simple/internal_project_name/ HTTP/1.1" 200 1878 "-" 0.101 "http/https/https" " devpi.example.com/devpi.example.com/devpi.example.com" "80/443/443"
So our builds (locally and on Jenkins) are requesting /org_name/stable/+simple/internal_project_name/ of the mirrors, but then the mirrors are requesting /root/pypi/+simple/internal_project_name/ of the master. But they must at some point be requesting the correct URL, because the mirrors have all of the proper projects in org_name/stable, and you can install them all just fine using the mirrors.
On the master, the indexes are configured as follows (org_name/stable extends org_name/upstream extends root/pypi):
$ devpi index -v org_name/stable http://127.0.0.1/org_name/stable: type=stage bases=org_name/upstream volatile=False acl_upload=org_name acl_toxresult_upload=org_name mirror_whitelist=
$ devpi index -v org_name/upstream http://127.0.0.1/org_name/upstream: type=stage bases=root/pypi volatile=False acl_upload=org_name acl_toxresult_upload=org_name mirror_whitelist=*
The mirror/replica servers are configured like this:
devpi-server: serverdir: /data/devpi-server/server-data secretfile: /data/devpi-server/server-data/.secret unix-socket: /tmp/devpi-server.sock unix-socket-perms: "0666" threads: 16 replica-max-retries: 2 request-timeout: 5 root-passwd: ****************** theme: semantic-ui role: replica master-url: https://primary-devpi-server.example.com
Hope that additional information helps.
Thanks,
Nick
On Fri, Jan 3, 2020 at 3:32 AM Florian Schulze
wrote: On 2 Jan 2020, at 18:19, Nicholas Williams wrote:
We're starting to roll out our new Devpi installation to projects now that we've tested it extensively. We have one master and three mirrors. For the purposes of this email, the mirrors are load-balanced behind the domain name "devpi.example.com," and the master is available at " primary-devpi-server.example.com."
We configured our Pipenv and Pip installations to use this index URL: https://devpi.example.com/org_name/stable/+simple/
That URL will use the mirrors. The "org_name/stable" index extends root/pypi and also includes our internal projects. Everything appears to be working on the client. No errors, installs are going off without a hitch.
However, the Devpi server on the master machine (not the mirrors) logs a ton of errors like the following:
[req78237] [Rtx14602] while handling
https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_...
: The project internal_project_name does not exist.
What is the cause of these errors? Is this something that should concern us, or something that we can ignore? "internal_project_name" installs just fine, but for some reason yields those errors on the master server.
Is there more log output with the same request number ("req78237" in this case)? Without that it is hard to say. Looking at the code this should only happen if root/pypi is accessed directly. What is the setting of ``mirror_whitelist`` in "org_name/stable"? I tried with an empty whitelist and with * and in both cases I wasn't able to provoke the error for a private package.
Regards, Florian Schulze
To be clear on that first point, that user agent was in the Nginx logs, and Nginx is configured to pass all headers on to Devpi, so that should also be the user agent that Devpi gets. Also, I just re-confirmed that if I do something as simple as `pip install -U internal_project`, I get that error logged on the master. Also, if I go to https://devpi.example.com/eventbrite/stable/+simple/another_internal_project... in my browser, I get that error logged again (for the different project) on the master. So, for whatever reason, the user agent matching is definitely not working. Thanks, Nick On Wed, Jan 8, 2020 at 6:08 AM Nicholas Williams < nicholas@nicholaswilliams.net> wrote:
I see a few options here, which aren't mutually exclusive:
- Fix the user agent matching, because it appears broken somehow. The user agent header in the logs is as follows:
pip/19.3.1 {\x22ci\x22:null,\x22cpu\x22:\x22x86_64\x22,\x22distro\x22:{\x22name\x22:\x22macOS\x22,\x22version\x22:\x2210.13.6\x22},\x22implementation\x22:{\x22name\x22:\x22CPython\x22,\x22version\x22:\x223.7.4\x22},\x22installer\x22:{\x22name\x22:\x22pip\x22,\x22version\x22:\x2219.3.1\x22},\x22openssl_version\x22:\x22OpenSSL 1.0.2s 28 May 2019\x22,\x22python\x22:\x223.7.4\x22,\x22setuptools_version\x22:\x2244.0.0\x22,\x22system\x22:{\x22name\x22:\x22Darwin\x22,\x22release\x22:\x2217.7.0\x22}}
But Devpi is still exhibiting this behavior, so somehow it's not matching on that user agent header properly.
- Provide a setting to disable this behavior (a strong preference of mine), partly to prevent these errors, but also because (as you said) it exposes internal package names to the mirror provider.
- Provide a way to disable logging an error when a request is made to a project's simple index that does not exist. I would disable it on the master (because I don't care if the mirrors make this request) but leave it enabled on the mirrors (because such errors would indicate a client error).
On Wed, Jan 8, 2020 at 4:04 AM Florian Schulze
wrote: Hi!
I dug a bit deeper.
This happens in devpi_server/views.py simple_list_project. We check the user agent header to do speed ups for installers like pip. When that doesn't match, additional info is added to the page. One of that is generated by get_mirror_whitelist_info. It has to check the existence of the project in the mirror to show whether the project blocks releases from the mirror or not. This is useful info, but triggers a request to the mirror, which is what you see. You should only see this when going to the simple page in a browser, installers like pip shouldn't trigger it. We only check for these strings in user agent though: distribute setuptools pip pex, so if that doesn't match get_mirror_whitelist_info will be triggered.
I'm thinking about the implications of that, as it exposes internal package names to the mirror provider.
Regards, Florian Schulze
On 3 Jan 2020, at 14:48, Nicholas Williams wrote:
That request had already rolled off, so I had to search for a new one. Here's this:
$ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep req8603 2020-01-03 13:22:19,560 INFO: [req8603] GET /root/pypi/+simple/internal_project_name/ 2020-01-03 13:22:19,631 ERROR: [req8603] [Rtx14627] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... :
I'm trying to troubleshoot that further, so I grepped for the URI, which nothing should be requesting:
$ kubectl --namespace=devpi-server logs devpi-server-primary-85d6484cb7-hgqpp | grep "/root/pypi/+simple/internal_project_name/" 2020-01-02 18:10:53,164 INFO: [req82] GET /root/pypi/+simple/internal_project_name/ 2020-01-02 18:10:53,170 ERROR: [req82] [Rtx14604] while handling https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_... : "172.27.23.69" (172.27.23.69) - [02/Jan/2020:18:10:53 +0000] "GET /root/pypi/+simple/internal_project_name/ HTTP/1.1" 404 1037 "-" 0.013 "http/https/https" " primary-devpi-server.example.com/primary-devpi-server.example.com/primary-devpi-server.example.com" "80/443/443"
That IP address from which that request came is the IP address of one of the mirror servers. When I grep the mirror logs for that internal project name, there is nothing indicating that it is looking in root/pypi; only org_name/stable:
$ kubectl --namespace=devpi-server logs devpi-server-replica-0 | grep internal_project_name 2020-01-02 18:10:53,150 INFO: [req37] GET /org_name/stable/+simple/internal_project_name/ "172.27.36.103" (172.27.36.103) - [02/Jan/2020:18:10:53 +0000] "GET /org_name/stable/+simple/internal_project_name/ HTTP/1.1" 200 1878 "-" 0.101 "http/https/https" " devpi.example.com/devpi.example.com/devpi.example.com" "80/443/443"
So our builds (locally and on Jenkins) are requesting /org_name/stable/+simple/internal_project_name/ of the mirrors, but then the mirrors are requesting /root/pypi/+simple/internal_project_name/ of the master. But they must at some point be requesting the correct URL, because the mirrors have all of the proper projects in org_name/stable, and you can install them all just fine using the mirrors.
On the master, the indexes are configured as follows (org_name/stable extends org_name/upstream extends root/pypi):
$ devpi index -v org_name/stable http://127.0.0.1/org_name/stable: type=stage bases=org_name/upstream volatile=False acl_upload=org_name acl_toxresult_upload=org_name mirror_whitelist=
$ devpi index -v org_name/upstream http://127.0.0.1/org_name/upstream: type=stage bases=root/pypi volatile=False acl_upload=org_name acl_toxresult_upload=org_name mirror_whitelist=*
The mirror/replica servers are configured like this:
devpi-server: serverdir: /data/devpi-server/server-data secretfile: /data/devpi-server/server-data/.secret unix-socket: /tmp/devpi-server.sock unix-socket-perms: "0666" threads: 16 replica-max-retries: 2 request-timeout: 5 root-passwd: ****************** theme: semantic-ui role: replica master-url: https://primary-devpi-server.example.com
Hope that additional information helps.
Thanks,
Nick
On Fri, Jan 3, 2020 at 3:32 AM Florian Schulze
wrote: On 2 Jan 2020, at 18:19, Nicholas Williams wrote:
We're starting to roll out our new Devpi installation to projects now that we've tested it extensively. We have one master and three mirrors. For the purposes of this email, the mirrors are load-balanced behind the domain name "devpi.example.com," and the master is available at " primary-devpi-server.example.com."
We configured our Pipenv and Pip installations to use this index URL: https://devpi.example.com/org_name/stable/+simple/
That URL will use the mirrors. The "org_name/stable" index extends root/pypi and also includes our internal projects. Everything appears to be working on the client. No errors, installs are going off without a hitch.
However, the Devpi server on the master machine (not the mirrors) logs a ton of errors like the following:
[req78237] [Rtx14602] while handling
https://primary-devpi-server.example.com/root/pypi/+simple/internal_project_...
: The project internal_project_name does not exist.
What is the cause of these errors? Is this something that should concern us, or something that we can ignore? "internal_project_name" installs just fine, but for some reason yields those errors on the master server.
Is there more log output with the same request number ("req78237" in this case)? Without that it is hard to say. Looking at the code this should only happen if root/pypi is accessed directly. What is the setting of ``mirror_whitelist`` in "org_name/stable"? I tried with an empty whitelist and with * and in both cases I wasn't able to provoke the error for a private package.
Regards, Florian Schulze
participants (2)
-
Florian Schulze
-
Nicholas Williams