[Catalog-sig] Mirror authenticity

"Martin v. Löwis" martin at v.loewis.de
Sat Mar 28 14:16:42 CET 2009

At the language summit, there was a request that PyPI mirrors
should get authenticated through some kind of digital signature
that is generated by the master server, and can be verified by
clients using the mirror. This addresses the threat of somebody
taking over a mirror and injecting false packages. Attacks
against the master are not addressed; authors should use the
existing PGP signing of packages to guarantee authenticity.

I propose the following structure so that clients (i.e. setuptools
and friends) can verify what they download from a mirror.
At the server, the following URLs are available:

/serverkey   Public DSA key of the server, in the PEM format
              as generated by "openssl dsa -pubout" (i.e. RFC 3280
              SubjectPublicKeyInfo, with the algorithm
              This URL must *not* be mirrored, and clients must fetch
              the official serverkey from PyPI directly. The serverkey
              will change roughly once every year. Clients should cache
              the serverkey, and refetch it if it is
              a) more than one month old, or
              b) a signature failed to verify (which might be because
                 the serverkey has changed)
/serversig/<package>
              DSA signature of the parallel URL /simple/<package>,
              in DER form, using SHA-1 with DSA (i.e. as an RFC 3279
              Dsa-Sig-Value, created by algorithm 1.2.840.10040.4.3)
              These URLs must be mirrored.
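
The serverkey caching rule above could be sketched roughly as follows.
This is only an illustration, not part of the proposal: the names
ServerKeyCache and verify_with_cache are made up, "one month" is
assumed to mean 30 days, and the fetch callable is assumed to talk to
the master directly (never to a mirror).

```python
import time

ONE_MONTH = 30 * 24 * 3600  # assumption: "one month" taken as 30 days


class ServerKeyCache:
    """Hypothetical client-side cache for the /serverkey data."""

    def __init__(self, fetch, now=time.time):
        self._fetch = fetch      # callable returning the key from the master
        self._now = now          # injectable clock, handy for testing
        self._key = None
        self._fetched_at = None

    def get(self):
        """Return the cached key, refetching if missing or over a month old."""
        if self._key is None or self._now() - self._fetched_at > ONE_MONTH:
            self.refresh()
        return self._key

    def refresh(self):
        """Unconditionally refetch the key from the master."""
        self._key = self._fetch()
        self._fetched_at = self._now()
        return self._key


def verify_with_cache(cache, verify):
    """Run verify(key); on failure, refetch the key once and retry.

    A failure may simply mean that the serverkey has changed, so the
    check is retried only if the refetched key differs from the old one.
    """
    key = cache.get()
    if verify(key):
        return True
    fresh = cache.refresh()
    return fresh != key and verify(fresh)
```

A client would pass its own signature check as the verify callable, so
that a key rollover at the master shows up as one extra fetch rather
than as a hard failure.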

Signing the individual package pages is necessary because an
attacker might inject an additional download URL into a package page,
tricking the client into downloading from a different location.
With the individual pages signed, signing the actual package
data is no longer necessary, since each page contains md5 checksums
of the individual files.
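
As a rough illustration of how a client might use those checksums,
the sketch below collects them from a page and checks a downloaded
file. It assumes the checksum is carried as a "#md5=<hex>" fragment on
each download link; that detail, the function names and the example
URLs are assumptions made for the sketch, not part of the proposal.

```python
import hashlib
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class _LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag on the page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def expected_md5s(page_html, base_url):
    """Map absolute file URL -> md5 hex digest taken from the page links.

    Links without an "md5=" fragment are ignored.
    """
    collector = _LinkCollector()
    collector.feed(page_html)
    result = {}
    for href in collector.hrefs:
        url, fragment = urldefrag(urljoin(base_url, href))
        if fragment.startswith("md5="):
            result[url] = fragment[len("md5="):].lower()
    return result


def file_matches(data, md5_hex):
    """Return True if the downloaded bytes have the expected md5 sum."""
    return hashlib.md5(data).hexdigest() == md5_hex
```

The checksums are only trustworthy if the page they were read from has
itself passed the signature check.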

Clients should only verify signatures when they download from a mirror
of their (respective) central repository. Signing and verification
cause overhead (both for the server and the client), which is
unnecessary when the master server is contacted directly. In addition,
the client might be pointed to a master server which doesn't provide
signatures (and consequently doesn't provide mirrors, either).

Clients which do verify need to
1. compute the SHA-1 hash of the /simple page
2. fetch the corresponding /serversig data
3. verify that DSA signature against the hash, using the serverkey
4. compute and verify md5 sums for all the files that they
    then download from the mirror. Verification of files downloaded
    from other URLs is not possible with this approach.
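
A minimal pure-Python sketch of the page check might look like this.
Because a client holds only the public serverkey, the sketch checks
the signature with the standard DSA verification equation instead of
recomputing it. It assumes the DSA parameters p, q, g and the public
value y have already been extracted from the PEM serverkey (that
parsing is not shown), and all function names are made up for
illustration.

```python
import hashlib


def parse_dsa_sig_value(der):
    """Parse a DER Dsa-Sig-Value (SEQUENCE of two INTEGERs) into (r, s)."""

    def read_len(buf, pos):
        first = buf[pos]
        pos += 1
        if first < 0x80:                      # short-form length
            return first, pos
        n = first & 0x7F                      # long form: n length bytes
        return int.from_bytes(buf[pos:pos + n], "big"), pos + n

    def read_int(buf, pos):
        if buf[pos] != 0x02:
            raise ValueError("expected INTEGER")
        length, pos = read_len(buf, pos + 1)
        return int.from_bytes(buf[pos:pos + length], "big"), pos + length

    try:
        if der[0] != 0x30:
            raise ValueError("expected SEQUENCE")
        seq_len, pos = read_len(der, 1)
        if pos + seq_len != len(der):
            raise ValueError("bad SEQUENCE length")
        r, pos = read_int(der, pos)
        s, pos = read_int(der, pos)
    except IndexError:
        raise ValueError("truncated signature")
    if pos != len(der):
        raise ValueError("trailing data")
    return r, s


def digest_to_int(page, q):
    """SHA-1 of the page as an integer, cut to the bit length of q."""
    h = int.from_bytes(hashlib.sha1(page).digest(), "big")
    shift = 160 - q.bit_length()
    return h >> shift if shift > 0 else h


def verify_page(page, der_sig, p, q, g, y):
    """Return True if der_sig is a valid DSA signature over the page."""
    r, s = parse_dsa_sig_value(der_sig)
    if not (0 < r < q and 0 < s < q):
        return False
    h = digest_to_int(page, q)
    w = pow(s, q - 2, q)                      # s^-1 mod q (q is prime)
    u1 = (h * w) % q
    u2 = (r * w) % q
    v = (pow(g, u1, p) * pow(y, u2, p)) % p % q
    return v == r
```

Here page is the raw bytes of /simple/<package> as served by the
mirror and der_sig is the raw bytes of the matching /serversig URL.
Only if verify_page returns True should the md5 sums on the page be
trusted for step 4.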

I will try to provide a pure-Python implementation of
the page verification, based on AMK's python-crypto code.

Comments on this proposal are appreciated.

