Currently in the packaging space, we have a number of avenues for communication, which are:
- Other project specific mailing lists
- Various issue trackers spread across multiple platforms.
- Probably more places I’m not remembering.
The result of this is that all discussion ends up being super fractured amongst the various places. Sometimes that is exactly what you want (for instance, someone who is working on the wheel specs probably doesn’t care about deprecation policy and internal module renaming in pip) and sometimes that ends up being the opposite of what you want (for instance, when you’re describing something that touches PyPI, setuptools, flit, pip, etc all at once).
Theoretically the idea is that distutils-sig is where cross-project stuff goes, IRC/gitter is where real-time discussion goes, and the various project mailing lists and issue trackers are where the project-specific bits go. The problem is that this often doesn’t actually happen in practice except for the largest and most obvious of changes.
I think our current “communications stack” kind of sucks, and I’d love to figure out a better way for us to handle this that solves the sort of weird “independent but related” set of projects we have here.
From my POV, our major problems are:
* Discussion gets fractured across a variety of platforms and locations, which can make it difficult to actually keep up with what’s going on but also to know how to loop in someone relevant if their input would be valuable. You have to more or less simultaneously know someone’s email, Github username, IRC nick, bitbucket username, etc to be able to bring threads of discussion to people’s attention.
* It’s not always clear to users where a discussion should go; often they’ll come to one location and need to get redirected to another. If any discussion did happen in the incorrect location, it tends to need to get restarted in the new location (and depending on the specific platform, it may be impossible to actually redirect everyone over to the proper location, so you again end up fractured, with the discussion happening in two places).
* A lot of the technology in this stack is particularly old, and lacks a lot of the modern day affordances that newer things have. An example is being able to edit a discussion post to fix typos that can hinder the ability of others to actually understand what’s being talked about. In your typical mailing list or IRC there’s no mechanism by which you can edit an already sent message, so your only option is to either let the problem ride and hope it doesn’t trip up too many people, or send an additional message to correct the error. However, these show up as additional, later messages which someone might not even see until they’ve already been thoroughly confused by the first message (since people tend to read email/IRC in a linear fashion).
- There are a lot of things in this one; another example is being able to control, in a more fine-grained manner, what email you’re going to get.
- Holy crap, formatting and typography to make things actually readable and not a big block of plaintext.
* We don’t have a clear way for users to get help, leaving users to treat random issues, discussion areas, etc as a support forum, rather than some place that’s actually optimized for that. Some of this ties back into some of the earlier things too, where it’s hard to actually redirect discussions.
These aren’t *new* problems, and often the existing members of a community are the least affected because they’ve already spent effort learning the ins and outs and curating a (typically custom) workflow that they’ve grown accustomed to. The problem is that new users are often left out, and the community gets smaller and smaller as time goes on, as people leave and aren’t replaced with new blood because they’re driven off by the issues with the stack.
A big part of the place this is coming from, is me sitting back and realizing that I tend to be drawn towards pulling discussions into Github issues rather than onto the varying mailing lists, not because that’s always the most appropriate place for it, but because it’s the least painful place in terms of features and functionality. I figure if I’m doing that, when I already have a significant investment in setting up tooling and being involved here, that others (and particularly new users) are likely feeling the same way.
PyPI does not allow duplicate file names -- this makes lots of sense,
because you really don't want people to go to PyPI one day and grab a file,
and then go there another day, grab a file with exactly the same name, and
have it be a different file.
We are all too human, and make mistakes when doing a release. All too often
someone pushes a broken file up to PyPI and realizes it pretty quickly
-- before anyone has a chance to even download it (or only the dev team as,
In fact, I was in a sprint last summer, and we decided to push our package
up to PyPI -- granted, we were all careless amateurish noobs, but we ended
up making I think 4! minor version bumps because we had done something
stupid in the sdist.
Also -- the latest numpy release got caught in this, too:
* We ran into a problem with PyPI not allowing reuse of filenames and a
resulting proliferation of *.*.*.postN releases. Not only were the names
getting out of hand, some packages were unable to work with the postN
So -- I propose that PyPI allow projects to replace existing files if they
REALLY REALLY want to.
You should have to jump through all sorts of hoops, and make it really
clear that it is a BAD IDEA in the general case, but it'd be good to have
it be possible.
After all -- PyPI does not take on responsibility for anything else about
what's in those packages, and Python itself is all about "we're all
consenting adults here"
I suppose we could even put in some heuristics about how long the file has
been there, how many times it's been downloaded, etc.
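
To make the heuristics idea concrete, here is a minimal sketch of such a check. Nothing like this exists in PyPI as far as this message says; the function name, the one-hour window, and the download threshold are all made-up values for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical thresholds, chosen only for illustration.
MAX_AGE = timedelta(hours=1)
MAX_DOWNLOADS = 10


def may_replace_file(uploaded_at, download_count, now=None,
                     max_age=MAX_AGE, max_downloads=MAX_DOWNLOADS):
    """Return True if a file is both recent and rarely downloaded.

    This is an assumed policy for discussion, not anything PyPI implements.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    age = now - uploaded_at
    return age <= max_age and download_count <= max_downloads
```

Both conditions must hold, so a file that has been up for days, or one that has already been fetched many times, could not be swapped out.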
Just a thought.....I really hate systems that don't let me roll back
mistakes, even when I discover them almost immediately...
Christopher Barker, Ph.D.
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
I've got two projects: mynamespace.myprojectA and mynamespace.myprojectB
myprojectB depends on myprojectA. I'm using setuptools 0.6c8 to manage both
Both projects are registered using 'setup develop'. Both projects are
accessible from an interactive interpreter:
PS C:\Users\me\projects> python
Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)]
Type "help", "copyright", "credits" or "license" for more information.
>>> import mynamespace.myprojectA
>>> import mynamespace.myprojectB
>>> from mynamespace.myprojectA import mymoduleZ
However, when I run 'setup test' in myprojectB, the tests fail with
File ".mymoduleZ.py", line NNN, in [some context]
from mynamespace.myprojectA.mymoduleZ import MyClassC
ImportError: No module named myprojectA.mymoduleZ
In setup.py, the test_suite is nose.collector.
I searched and couldn't find anyone else with this problem. Is this a
supported configuration? Is there something I can do to make tests work
with interdependent projects with the same root namespace?
If there's not something obvious I should be doing differently, I'm happy to
put together a minimal test case that reproduces the problem. Any
suggestions are appreciated.
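
One way to narrow a problem like this down is to compare which directories make up the namespace package in the interactive interpreter and inside the test run. The helper below is only a diagnostic sketch: `package_search_path` is a made-up name, and it is written for a current Python (`importlib` is not available on Python 2.5).

```python
import importlib


def package_search_path(package_name):
    """Return the directories Python will search for submodules of a package."""
    package = importlib.import_module(package_name)
    # Plain modules have no __path__; packages and namespace packages do.
    return list(getattr(package, "__path__", []))


if __name__ == "__main__":
    # "mynamespace" is the namespace package from the message above.
    try:
        for directory in package_search_path("mynamespace"):
            print(directory)
    except ImportError as exc:
        print("could not import mynamespace:", exc)
```

If the directory of myprojectA is missing from the list during the test run but present interactively, the two environments are resolving the namespace differently.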
Jason R. Coombs
I am a research programmer at the NYU School of Engineering. My colleagues
(Trishank Kuppusamy and Justin Cappos) and I are requesting community
feedback on our proposal, "Surviving a Compromise of PyPI." The two-stage
proposal can be reviewed online at:
Summary of the Proposal:
"Surviving a Compromise of PyPI" proposes how the Python Package Index
(PyPI) can be amended to better protect end users from altered or malicious
packages, and to minimize the extent of PyPI compromises against affected
users. The proposed integration allows package managers such as pip to be
more secure against various types of security attacks on PyPI and defend
end users from attackers responding to package requests. Specifically,
these PEPs describe how PyPI processes should be adapted to generate and
incorporate repository metadata, which are signed text files that describe
the packages and metadata available on PyPI. Package managers request
(along with the packages) the metadata on PyPI to verify the authenticity
of packages before they are installed. The changes to PyPI and tools will
be minimal by leveraging a library, The Update Framework
<https://github.com/theupdateframework/tuf>, that generates and
transparently validates the relevant metadata.
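
As a rough illustration of the verification step described in the summary, the sketch below checks downloaded bytes against a metadata entry. It is deliberately simplified and is not the API of The Update Framework: the layout of `expected` is an assumption, and verifying the signatures on the metadata itself is left out entirely.

```python
import hashlib


def matches_metadata(data, expected):
    """Check downloaded bytes against a metadata entry.

    `expected` is an assumed, simplified entry of the form
    {"length": int, "hashes": {"sha256": hex_digest}}.
    """
    # A cheap length check first, then the cryptographic hash.
    if len(data) != expected["length"]:
        return False
    digest = hashlib.sha256(data).hexdigest()
    return digest == expected["hashes"]["sha256"]
```

A package manager following this idea would refuse to install a file for which the function returns False.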
The first stage of the proposal (PEP 458
<http://legacy.python.org/dev/peps/pep-0458/>) uses a basic security model
that supports verification of PyPI packages signed with cryptographic keys
stored on PyPI, requires no action from developers and end users, and
protects against malicious CDNs and public mirrors. To support continuous
delivery of uploaded packages, PyPI administrators sign for uploaded
packages with an online key stored on PyPI infrastructure. This level of
security prevents packages from being accidentally or deliberately tampered
with by a mirror or a CDN because the mirror or CDN will not have any of
the keys required to sign for projects.
The second stage of the proposal (PEP 480
<http://legacy.python.org/dev/peps/pep-0480/>) is an extension to the basic
security model (discussed in PEP 458) that supports end-to-end verification
of signed packages. End-to-end signing allows both PyPI and developers to
sign for the packages that are downloaded by end users. If the PyPI
infrastructure were to be compromised, attackers would be unable to serve
malicious versions of these packages without access to the project's
developer key. As in PEP 458, no additional action is required by end
users. However, PyPI administrators will need to periodically (perhaps
every few months) sign metadata with an offline key. PEP 480 also proposes
an easy-to-use key management solution for developers, how to interface
with a potential build farm on PyPI infrastructure, and discusses the
security benefits of end-to-end signing. The second stage of the proposal
simultaneously supports real-time project registration and developer
signatures, and when configured to maximize security on PyPI, less than 1%
of end users will be at risk even if an attacker controls PyPI and goes
undetected for a month.
We thank Nick Coghlan and Donald Stufft for their valuable contributions,
and Giovanni Bajo and Anatoly Techtonik for their feedback.
PEP 458 & 480 authors.
As a new Twine maintainer I've been running into questions like:
* Now that Warehouse doesn't use "register" anymore, can we deprecate it from distutils, setuptools, and twine? Are any other package indexes or upload tools using it? https://github.com/pypa/twine/issues/311
* It would be nice if Twine could depend on a package index providing an HTTP 201 response in response to a successful upload, and fail on 200 (a response some non-package-index servers will give to an arbitrary POST request).
I do not see specifications to guide me here, e.g., in the official guidance on hosting one's own package index https://packaging.python.org/guides/hosting-your-own-index/ . PEP 301 was long enough ago that it's due an update, and PEP 503 only concerns browsing and download, not upload.
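
For the second question, the client-side check could be as small as the sketch below. It shows the stricter behaviour wished for above, not what Twine or any index is documented to do today; both function names are hypothetical.

```python
def upload_succeeded(status_code):
    """Treat only HTTP 201 Created as a successful upload (assumed policy)."""
    return status_code == 201


def describe_upload(status_code):
    """Return a short, human-readable verdict for an upload response."""
    if upload_succeeded(status_code):
        return "uploaded"
    if status_code == 200:
        # A generic server may answer 200 to any POST, so do not trust it.
        return "rejected: got 200, which does not confirm an upload"
    return "rejected: unexpected status {}".format(status_code)
```

Such a check is only safe once indexes are specified to answer 201, which is part of what an upload API specification would have to settle.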
I suggest that I write a PEP specifying an API for uploading to a Python package index. This PEP would partially supersede PEP 301 and would document the Warehouse reference implementation. I would write it in collaboration with the Warehouse maintainers who will develop the reference implementation per pypa/warehouse/issues/284 and maybe add a header referring to compliance with this new standard. And I would consult with the maintainers of packaging and distribution tools such as zest.releaser, flit, poetry, devpi, pypiserver, etc.
Per Nick Coghlan's formulation, my specific goal here would be close to:
> Documenting what the current upload API between twine & warehouse actually is, similar to the way PEP 503 focused on describing the status quo, without making any changes to it. That way, other servers (like devpi) and other upload clients have the info they need to help ensure interoperability.
Since Warehouse is trying to redo its various APIs in the next several months, I think it might be more useful to document and work with the new upload API, but I'm open to feedback on this.
After a little conversation here on distutils-sig, I believe my steps would be:
1. start a very early PEP draft with lots of To Be Determined blanks, submit as a PR to the python/peps repo, and share it with distutils-sig
2. ping maintainers of related tools
3. discuss with others at the packaging sprints https://wiki.python.org/psf/PackagingSprints next week
4. revise and get consensus, preferably mostly on this list
5. finalize PEP and get PEP accepted by BDFL-Delegate
6. coordinate with PyPA, maintainers of `distutils`, maintainers of packaging and distribution tools, and documentation maintainers to implement PEP compliance
Thoughts are welcome. I originally posted this at https://github.com/pypa/packaging-problems/issues/128 .
I am fairly sure that if you give the PyPA that suggestion, they will just deflate at the thought of the workload. Besides, we already offer private repos for free, in several ways ranging from devpi to python -m SimpleHTTPServer in a specially created directory.
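
A minimal sketch of what such a "specially created directory" could contain, assuming the PEP 503 layout of one directory per normalized project name with an index.html of links. The helper names are made up, and the distribution files are assumed to be copied next to the generated page. Note that SimpleHTTPServer is the Python 2 module name; on Python 3 the equivalent is python -m http.server.

```python
import html
import os
import re


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def write_project_page(index_root, project, filenames):
    """Write <index_root>/<normalized project>/index.html listing the files.

    The hrefs are relative, so the files are assumed to sit in the same
    directory as the generated page.
    """
    project_dir = os.path.join(index_root, normalize(project))
    os.makedirs(project_dir, exist_ok=True)
    links = "\n".join(
        '    <a href="{0}">{0}</a><br/>'.format(html.escape(name, quote=True))
        for name in sorted(filenames)
    )
    page = "<!DOCTYPE html>\n<html>\n  <body>\n{}\n  </body>\n</html>\n".format(links)
    path = os.path.join(project_dir, "index.html")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(page)
    return path
```

Serving the parent of index_root with the built-in HTTP server would then give installers something to point an index URL at, with no access control of any kind.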
From: Python-ideas <python-ideas-bounces+tritium-list=sdamon.com(a)python.org> On Behalf Of Nick Humrich
Sent: Wednesday, April 4, 2018 12:26 PM
Subject: [Python-ideas] Pypi private repo's
I am sure this has been discussed before, and this might not even be the best place for this discussion, but I just wanted to make sure this has been thought about.
What if pypi.org <http://pypi.org> supported private repos at a cost, similar to npm?
This would be able to help support the cost of pypi, and hopefully make it better/more reliable, thus in turn improving the python community.
If this discussion should happen somewhere else, let me know.
The weekend of October 27-28, simultaneously in London, UK and New York
City, USA, Bloomberg will host a Python packaging and distribution tools
event. Please mark your calendars!
If you live in North America or Europe and would need assistance to
attend this as a mentor/helper, watch for more details in July.
If you live outside of the US or UK and would need an invitation letter
to get a visa to travel to one of these sprints, please write to Kevin
P. Fleming at Bloomberg, kpfleming AT bloomberg DOT net, and he'll start
setting you up.
Thanks to Bloomberg for their generosity. They're already a Platinum PSF
sponsor, and they'll host this, pay for a maintainers'/mentors' dinner
the night before, provide clusters of cloud virtual machines for the
attendees to use, and book and pay for some contributors' lodging and
This'll be an opportunity to advance Python packaging/distro tools,
teach new contributors (including many Bloomberg employees), and yeah,
if you want to get to know Bloomberg for career reasons, that too. :)
We hope mentors can arrive Thursday night 25 Oct, do prep, setup, and
dinner on Friday, then participate Sat-Sun, then leave Sunday evening or
We'll be putting more details on these lists (distutils-sig and
pypa-dev) and at https://wiki.python.org/psf/PackagingSprints .
Thanks to Bloomberg folks Mario Corchero and Henry Kleynhans in London
and Kevin P. Fleming in New York City for coordinating this, and thanks
especially to Mario and to Paul Ganssle for suggesting it!
Today, LWN published my new article "A new package index for Python".
https://lwn.net/Articles/751458/ In it, I discuss security, policy, UX
and developer experience changes in the 15+ years since PyPI's founding,
new features (and deprecated old features) in Warehouse, and future
plans. Plus: screenshots!
If you aren't already an LWN subscriber, you can use this subscriber
link for the next week to read the article despite the LWN paywall.
This summary should help occasional Python programmers -- and frequent
Pythonists who don't follow packaging/distro discussions closely --
understand why a new application is necessary, what's new, what features
are going away, and what to expect in the near future. I also hope it
catches the attention of downstreams that ought to migrate.
Warehouse project manager
pip is currently not well integrated on Linux: it conflicts with the
system package manager like apt or rpm. When pip writes files into
/usr, it can replace files written by the system package manager and
so create different kinds of issues. For example, if you check the
system integrity, you will likely see that some Python files have been
I would like to open a discussion to see how each Linux vendor handles
the issue, and see if a common solution can be designed.
Debian uses /usr for apt-get install and /usr/local for distutils and
Fedora decided to change pip to install files into /usr/local by
default, instead of /usr, so "sudo pip install" doesn't replace files
installed by dnf (Fedora package manager):
It gives you 3 main places to install Python code: /usr (managed by
dnf), /usr/local (managed by sudo pip), $HOME/.local (managed by pip
Would it make sense to make the Fedora/Debian change upstream? At
least, give an opt-in option for Linux vendors to use /usr/local?
I propose to make the change upstream because there are still issues,
and I don't want to be alone to have to fix them :-) It should be
easier if we agree on a filesystem layout and an implementation, so
we can collaborate on issues!
Issues with the current Fedora implementation:
(1) When Python is embedded in an application, there is an issue with
the current heuristic to decide if /usr/local should be added to
(2) On Fedora, "sudo pip install -U" currently removes old code from
/usr and installs the new code in /usr/local. We should leave /usr
unchanged, since only dnf should touch /usr.
The implementation is made of a single patch on the Python site module:
There are two issues related to the "sudo pip" change, but they
already exist when pip is installed in $HOME/.local:
(3) Priority issue between PATH and PYTHONPATH directories.
When the user runs "pip", the pip binary may come from /usr,
/usr/local or $HOME/.local/bin, but the Python pip module ("import
pip") may come from a different path. Which binary and which module
should be used?
Obviously, users can replace these two environment variables...
(4) Related to (3). Running "pip" may run pip binary of one pip
version, but pick the "pip" Python module of another pip version.
For example, pip9 binary from /usr/bin/pip, but pip10 module from /usr/local.
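
A quick way to see whether a given machine is affected by (3) and (4) is to print where the "pip" executable and the "pip" module come from. This is only a diagnostic sketch with a made-up function name; it reports locations and does not compare versions.

```python
import importlib.util
import shutil


def locate_pip():
    """Report which pip executable and which pip module would be used."""
    spec = importlib.util.find_spec("pip")
    return {
        # First "pip" found on PATH, or None if there is none.
        "executable": shutil.which("pip"),
        # File backing "import pip" for this interpreter, or None.
        "module": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for kind, location in locate_pip().items():
        print("{:<10} {}".format(kind, location))
```

If the executable lives under /usr/bin while the module resolves under /usr/local or $HOME/.local, the two may well belong to different pip versions.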
Fedora works around issue (4) with a downstream patch on pip:
I don't know exactly how Linux distributions handle the issue with "sudo
pip". So don't hesitate to correct me if I'm wrong :-) My goal is
just to start a discussion about a common "upstream" solution.
I've been looking for a way to ensure that certain modules don't end
up in a wheel, while the rest of the package they reside in does. If I
only cared about sdist, I could add a MANIFEST.in, in which I'd
exclude those specific files, however, unfortunately, MANIFEST.in has
no effect on bdists (at least of the wheel kind).
The use case is that our application auto-generates a parser and lexer
with ply, and that parser might not work with different versions of
ply. Since we don't have a whole lot of control over what version
users have installed in their environments, we'd like to generate
those modules in the target environment.
I took a deep dive into distutils and setuptools, and as far as I can
see, any Python modules residing inside a package listed in the
packages argument to setup() are included in the distribution
unconditionally. Searching this mailing list only reveals a short
thread from nine years ago without any solution...
For now, the easiest hacky solution for me is to add a couple of
os.remove calls to setup.py, but I'm not a big fan of setup.py messing
with the source tree.
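
One alternative that leaves the source tree alone is to filter the module list inside a custom build_py command. The sketch below relies on the find_package_modules method that build_py inherits from distutils; the helper names and the excluded module names are hypothetical, and it has not been tried against the setup described here.

```python
import fnmatch

# Hypothetical names for the generated parser and lexer modules.
EXCLUDED_MODULES = ["myapp.parsetab", "myapp.lextab"]


def keep_module(package, module, patterns):
    """Return False for modules whose dotted name matches a pattern."""
    qualified = "{}.{}".format(package, module)
    return not any(fnmatch.fnmatchcase(qualified, p) for p in patterns)


def excluding_build_py(base, patterns):
    """Build a build_py subclass that skips the modules matching patterns."""
    class build_py(base):
        def find_package_modules(self, package, package_dir):
            modules = super().find_package_modules(package, package_dir)
            # Each entry is a (package, module, filename) tuple.
            return [entry for entry in modules
                    if keep_module(entry[0], entry[1], patterns)]
    return build_py


# Assumed usage in setup.py:
#   from setuptools.command.build_py import build_py as _build_py
#   setup(..., cmdclass={"build_py": excluding_build_py(_build_py, EXCLUDED_MODULES)})
```

Because wheels are built from the output of build_py, modules dropped here should not reach the wheel, while an sdist would still be governed by MANIFEST.in.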
And as a follow-up question, is there any post-installation hook that
we could use to trigger regeneration of those files?