Currently, PyPI allows you to upload a GPG signature along with your package
file as well as associate a GPG Short ID with your user. Theoretically this
allows end users not to trust PyPI and instead validate end-to-end signatures
from the original author.
I've written previously about package signing, and about a number of common
suggestions for achieving it that don't actually do much, if anything, to
increase the security of things. The current implementation on PyPI falls
into such a trap. The main problem with GPG and package signing is that a GPG
key provides some guarantees (ignoring issues with the concept of a Web of
Trust) about the *identity* of the person possessing the key; however, it
provides no mechanism for making any guarantees about what *capabilities*
should be granted to that person. More concretely, while you can use GPG as is
to verify that yes, "Donald Stufft" signed a particular package, you cannot
use it to determine whether "Donald Stufft" is *allowed* to sign for that
package. A valid signature from me on the requests project should be treated
as just as invalid as an invalid signature from anyone on the requests
project. The only namespacing provided by GPG itself is "trusted key" vs
"not trusted key".
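To make the identity-vs-capability gap concrete, here is a minimal sketch of
the authorization layer that GPG itself doesn't provide; the
``TRUSTED_SIGNERS`` table and the fingerprint in it are hypothetical, not
anything PyPI actually maintains:

# Hypothetical policy table mapping each project to the key fingerprints
# allowed to sign for it. GPG can answer "who signed this?" (identity),
# but a table like this is needed to answer "may they sign this project?"
# (capability) -- and GPG provides no such mechanism.
TRUSTED_SIGNERS = {
    "requests": {"0123456789ABCDEF0123456789ABCDEF01234567"},  # hypothetical
}

def signature_acceptable(project, signer_fingerprint, signature_is_valid):
    # A cryptographically valid signature from an unauthorized key must be
    # rejected exactly like an invalid signature.
    allowed = TRUSTED_SIGNERS.get(project, set())
    return signature_is_valid and signer_fingerprint in allowed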
PyPI offers a workaround for this in the form of allowing users to associate
their GPG Short ID with their user profile. However, this is not actually very
useful in practice. The goal of signing a package on PyPI is generally to
allow you to safely download the file without trusting PyPI, but if you need
to trust PyPI to determine which key is allowed to sign a package, then you
haven't really added much in the way of additional assurance; you've just
added another possible point of failure.
Beyond the inherent issues with attempting to use GPG for anything useful on
PyPI, there are also a number of implementation-specific issues with the
current support. You *must* use a GPG Short ID with PyPI; however, GPG Short
IDs are not actually secure and can be fairly easily brute-forced into a
collision, which means that someone can create a key that comes out to the
same Short ID as the one an author has in their profile. In addition, a
signature uploaded to PyPI is not validated at upload time, allowing people
to upload a signature that doesn't actually validate, causing a persistent
failure mode (though since nobody validates these signatures, nobody ever
runs into it).
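To illustrate why Short IDs are cheap to collide: a Short ID is just the last
32 bits (8 hex digits) of the 160-bit key fingerprint, so an attacker needs
at most ~2**32 key generations to hit a chosen target. A sketch (the
fingerprint is an arbitrary example value):

# For a V4 OpenPGP key, the short key ID is simply the low 32 bits of
# the fingerprint -- far too small a space to resist brute force.
fingerprint = "0123456789ABCDEF0123456789ABCDEF01234567"  # example value
short_id = fingerprint[-8:]  # "01234567"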
On top of all of that, I believe there will never be a case where a tool
like pip supports these GPG signatures. Even if you wiped away all of the
above problems, GPG is still a complex standard without great support for
building tooling around it. The most reasonable way of implementing support
would be to ship a copy of the gpg binary for each platform and shell out to
it. However, GPG is GPL-licensed, which means that's not something we could
actually do, and even if it were, shipping binaries is not generally a
reasonable thing for pip to do; anything besides pure Python is a no-go.
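For reference, "shelling out" would amount to something like the sketch
below; it assumes a gpg binary available on PATH, which is exactly what pip
cannot ship or rely on:

import subprocess

def verify_detached_signature(signature_path, file_path):
    # gpg --verify <detached-sig> <signed-file>: exit status 0 means the
    # signature verified against some key in the local keyring. Note that
    # this still only answers the identity question, not authorization.
    result = subprocess.run(
        ["gpg", "--verify", signature_path, file_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode == 0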
I am aware of a single tool anywhere that actively supports verifying the
signatures people upload to PyPI, and that is Debian's uscan program. Even in
that case, the people writing the Debian watch file have to hardcode a
signing key into it, and in my experience, when faced with a validation error
it's not unusual for Debian to simply disable signature checking for that
project and/or just blindly update the key to whatever the new key is.
All in all, I think there is not a whole lot of point to having this feature
in PyPI: it is predicated on a bunch of invalid assumptions (as detailed
above), and I do not believe end users are actually even using the keys that
are being uploaded. Last time I looked, pip, easy_install, and bandersnatch
represented something like 99% of all download traffic on PyPI, and none of
those do anything with the .asc files uploaded to PyPI (other than
bandersnatch blindly mirroring them). Looking at the number of projects
actively using this feature on PyPI, I can see that 27931/591919 files on
PyPI have the ``has_signature`` database field set to true, or roughly 4% of
all files on PyPI, which roughly holds up when you look at the number of
distinct projects that have ever uploaded a signature as well (3559/80429).
Thus, I would like to remove this feature from PyPI (but not from PEP 503;
if other repositories want to continue to support it, they are free to).
Doing this would allow simplifying the code we have in Warehouse anywhere we
touch uploaded files (since we almost always end up needing to branch into
special behavior for files ending in .asc). It will allow us to reduce the
number of concepts in the UI (what is a PGP signature? what do I do with it?
etc.) without simply hiding a feature (which is likely to cause confusion:
why do you support it if you won't show it? etc.). I think it will also make
releasing slightly easier for developers, since I personally know a number of
authors on PyPI who don't really believe there is any value in signing their
packages on PyPI, but who do it anyway because of a vague notion that they
should.
If we do this, an open question is what we do with all of the *existing*
signatures on PyPI. We could leave them in place and stop accepting new
signatures, though that still means we need to branch on .asc anywhere we
handle files, because they'll still be a valid code path. Another option is
to simply get rid of them and act as if nobody had ever uploaded them in the
first place, which is my preferred option.
What do folks think? Would anyone be particularly against getting rid of the
GPG support in PyPI?
When we do implement package signing in pip, it will almost certainly be via
TUF (The Update Framework), most likely using ed25519 signatures but perhaps
using RSA.
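For contrast with the GPG tooling problem above, here is a sketch of what
ed25519 signing looks like with a library such as PyNaCl; this is purely
illustrative, since the actual TUF integration defines its own key and
metadata formats:

from nacl.signing import SigningKey

# Generate a key pair, sign some bytes, and verify the result.
# verify() raises nacl.exceptions.BadSignatureError on tampering.
signing_key = SigningKey.generate()
signed = signing_key.sign(b"package metadata")
signing_key.verify_key.verify(signed)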
This is fairly broken - it doesn't handle platlib vs purelib (see pip
PR 3450), doesn't handle data files, or any other layout. Donald says
pip uses it only to maintain the _vendor subtrees, which doesn't seem
like a particularly strong use case.
Certainly the help description for it is misleading, since what we're
actually doing is copying only a subset of what the package installed,
so at a minimum we need to be much clearer about what it does.
But I think it would be better to deprecate it and remove it... so
I'm pinging here to see if anyone can explain a sensible use case for
it in the context of pip :)
Robert Collins <rbtcollins@hpe.com>
HP Converged Cloud
I just received a bug report for enum34 about Python 3 code in a test file.
Further research revealed that that file should not be included in the
distribution, and it is not included in the .tar or .zip bundles, only in the
wheel.
Is this a known/fixed bug? If not, where do I report it?
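One way to double-check a report like this is to diff the member lists of
the sdist and the wheel; a rough sketch (the filenames and version below are
hypothetical):

import tarfile
import zipfile

# Any file present only in the wheel points at the step that built the
# wheel; strip the leading "<name>-<version>/" prefix from sdist members
# so the two listings are comparable.
sdist_members = {name.split("/", 1)[-1]
                 for name in tarfile.open("enum34-X.Y.tar.gz").getnames()}
wheel_members = set(zipfile.ZipFile("enum34-X.Y-py3-none-any.whl").namelist())
print(sorted(wheel_members - sdist_members))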
I'm developing an application that uses pyparsing, and after upgrading
setuptools to the newest version I noticed some tests failing. In my
main parser module I define an alias for ParseBaseException, which I
then use in other parts of the application to catch exceptions:
# definition of the ParseException
ParseException = pyparsing.ParseBaseException
# importing this alias in another module
from ...filterreader.parser import ParseException
Now my tests were failing because the ParseException was never actually
caught. Some investigation comparing the id() of the objects showed
that the ParseException alias was no longer the same object as
pyparsing.ParseBaseException. This was because the module "pyparsing" at
the time of the alias definition was not the same "pyparsing" module
that is later used for parsing. Looking at sys.modules I can see that I
have two pyparsing modules:
pyparsing: <module 'pyparsing' from
pkg_resources.extern.pyparsing: <module 'pyparsing' from
At the time of the alias definition, id(pyparsing) is equal to the id()
of pkg_resources.extern.pyparsing. When I later import pyparsing I get
the other module. This whole problem only happens when I use the
application packaged by cx_Freeze, so maybe some kind of race condition
happens when importing from a ZIP file. I'm using 64-bit Python 3.4.4 on
The first version of setuptools where I can see this problem is 20.2;
up to 20.1 everything is fine. Looking at the source I can see that
starting with 20.2 setuptools also includes its own copy of pyparsing, so
most likely that change is related to my problem.
Is there a simple way in which I can guarantee that there will only ever
be a single "pyparsing" module in my application? Of course I could just
stop using the alias and use the pyparsing exceptions directly, but I
feel a bit uneasy when a module just changes its identity at some point.
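One defensive workaround, short of un-vendoring anything, would be to make
the alias a tuple that covers both copies of the exception when both exist;
a sketch (module names as they appear in the sys.modules listing above):

import pyparsing

# Collect ParseBaseException from every live copy of pyparsing so the
# except clause catches the exception no matter which module raised it.
_parse_exceptions = [pyparsing.ParseBaseException]
try:
    from pkg_resources.extern import pyparsing as _vendored_pyparsing
    _parse_exceptions.append(_vendored_pyparsing.ParseBaseException)
except ImportError:
    pass
ParseException = tuple(_parse_exceptions)

Code that does "except ParseException:" then works against either copy.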
I don't know what happened recently. Usually I install setuptools via a
script that uses ez_setup.py.
Recently I get an error:
Downloading https://pypi.python.org/packages/source/s/setuptools/setuptools-21.0.0.zip
Traceback (most recent call last):
  File "downloads/ez_setup.py", line 415, in <module>
    sys.exit(main())
  File "downloads/ez_setup.py", line 411, in main
    archive = download_setuptools(**_download_args(options))
  File "downloads/ez_setup.py", line 336, in download_setuptools
    downloader(url, saveto)
  File "downloads/ez_setup.py", line 287, in download_file_insecure
    src = urlopen(url)
  File "/usr/lib/python3.4/urllib/request.py", line 161, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.4/urllib/request.py", line 469, in open
    response = meth(req, response)
  File "/usr/lib/python3.4/urllib/request.py", line 579, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.4/urllib/request.py", line 507, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.4/urllib/request.py", line 441, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.4/urllib/request.py", line 587, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
For now I can copy the package from an old virtualenv, but I'd appreciate a better solution/advice.
I've fallen seriously behind in trying to admin PyPI by myself, and I'm
calling for someone to help. Generally this means helping people reset
their email address for account recovery, or trying to contact owners of
packages to facilitate ownership changes. The *ahem* "tools" available
aren't the best, and will require privileged access to a system to do some
of this work, so you'll need to be someone I personally trust, or at least
vouched for by someone I personally trust in the Python community.
Just a heads up that, due to hitting query timeouts when attempting to look up serials, I changed the way serials work and are queried on PyPI. This should cause no visible changes for end users, but keep an eye out for serials that don't look correct, particularly via bandersnatch failures. The good news is that the new code is significantly faster (a 125x speed-up on my local machine with no lock contention -- but more like 1250x given that the old query was timing out at 30s under load on PyPI), so API calls like the XML-RPC list_packages_with_serial should be much faster now.
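For anyone who wants to sanity-check serials from the client side, the
XML-RPC call can be exercised directly; a quick sketch:

import xmlrpc.client

# Returns a {project_name: last_serial} mapping; bandersnatch compares
# these serials against its local state when mirroring.
client = xmlrpc.client.ServerProxy("https://pypi.python.org/pypi")
serials = client.list_packages_with_serial()
print(len(serials), "projects;", max(serials.values()), "highest serial")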
Anyways. Let me know if anyone notices any broken bandersnatches.
On Wed, May 4, 2016 at 4:02 PM, Paul R. Tagliamonte wrote:
> Hey all,
> For those who don't know, Trove classifiers are used by the Python
> world to talk about what is contained in the Python package. Stuff
> like saying "It's under the MIT/Expat license!" or "It's beta!".
> I was looking at the tags, and I saw one that made me "wat" a bit.
>> License :: OSI Approved :: GNU Free Documentation License (FDL)
> AFAIK the GFDL is *not* OSI approved, both due to it not being a
> software license, as well as, I'm sure, the invariant clauses being an
> issue.
> Has anyone come across this yet? Anyone have objections to me trying
> to clean up the Trove list?
Cleaning the list is going to be easy on the Python.org side,
especially since a new PyPI site is in the making.
The harder, or impossible, part would have been cleaning up the 1000+
packages using this faulty classifier....
But there are really only three of these, and all of them look
either pretty old or abandoned, and none has its packages actually
hosted or distributed on PyPI.
Markdown READMEs are becoming increasingly common for many projects.
GitHub, GitLab, and Bitbucket, among others, happily detect .md README files
and render them in their web interfaces. rST is nice, but is generally
overkill for single-page documents (as opposed to more intricate
documentation). To get something done sooner rather than later, I'd prefer
to come up with a two-phase solution: one narrow and "opt-in" (status quo
for all existing packages unless the maintainer does something) for quick
implementation with hopefully minimal pushback; the other, later,
not proposed here, could be more feature-rich/heuristic.
So, to get Markdown supported in some form, here are some talking points to
consider (a rough sketch of the rendering pipeline follows the list):
* Add a "long_description_filename" argument to setup() (suggested by
@msabramo/GH ), which does the usual boilerplate "[codecs.]open(x, 'r',
encoding=y).read()". To determine the format, look at an additional
"long_description_content_type" field (if provided); otherwise look at the
file extension and assume/require UTF-8.
* As an alternative, if there is no long_description and the fall-back
to README.rst fails, look for README.md and grab that. Such a strategy
wouldn't be fully opt-in, however.
* Markdown (just like reStructuredText) allows arbitrary HTML to be added.
The renderer must then be upstream of the (existing) clean (with bleach)
step.
* [Optional]: Use common extensions provided by the Python-Markdown library
to support GFM/SO stuff: fenced_code, smart_strong, nl2br.
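To make the last two bullets concrete, here is a rough sketch of the
render-then-clean pipeline; the extension names come from the
Python-Markdown library, and the allowed tags/attributes below are purely
illustrative:

import bleach
import markdown

# Render first, then sanitize: bleach must run on the renderer's output,
# since Markdown (like reST) passes raw HTML straight through.
readme_text = open("README.md", encoding="utf-8").read()
html = markdown.markdown(
    readme_text,
    extensions=["fenced_code", "smart_strong", "nl2br"],
)
safe_html = bleach.clean(
    html,
    tags=["p", "pre", "code", "em", "strong", "a", "ul", "ol", "li", "br"],
    attributes={"a": ["href"]},
)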