Currently in the packaging space, we have a number of avenues for communication, which are:
- distutils-sig
- pypa-dev
- virtualenv-users
- Other project specific mailing lists
- IRC
- gitter
- Various issue trackers spread across multiple platforms.
- Probably more places I’m not remembering.
The result of this is that all discussion ends up being super fractured amongst the various places. Sometimes that is exactly what you want (for instance, someone who is working on the wheel specs probably doesn’t care about deprecation policy and internal module renaming in pip) and sometimes that ends up being the opposite of what you want (for instance, when you’re describing something that touches PyPI, setuptools, flit, pip, etc all at once).
Theoretically the idea is that distutils-sig is where cross-project stuff goes, IRC/gitter is where real-time discussion goes, and the various project mailing lists and issue trackers are where the project-specific bits go. The problem is that often times that doesn’t actually happen in practice, except for the largest and most obvious of changes.
I think our current “communications stack” kind of sucks, and I’d love to figure out a better way for us to handle this that solves the sort of weird “independent but related” set of projects we have here.
From my POV, a list of our major problems are:
* Discussion gets fractured across a variety of platforms and locations, which can make it difficult to actually keep up with what’s going on but also to know how to loop in someone relevant if their input would be valuable. You have to more or less simultaneously know someone’s email, Github username, IRC nick, bitbucket username, etc to be able to bring threads of discussion to people’s attention.
* It’s not always clear to users where a discussion should go; often times they’ll come to one location and need to get redirected to another. If any discussion did happen in the incorrect location, it tends to need to be restarted in the new location (and depending on the specific platform, it may be impossible to actually redirect everyone over to the proper location, so you, again, end up with the discussion fractured across two places).
* A lot of the technology in this stack is particularly old, and lacks a lot of the modern affordances that newer things have. An example is being able to edit a discussion post to fix typos that would otherwise hinder the ability of others to actually understand what’s being talked about. In your typical mailing list or IRC there’s no mechanism by which you can edit an already-sent message, so your only option is to either let the problem ride and hope it doesn’t trip up too many people, or send an additional message to correct the error. However, these show up as additional, later messages which someone might not even see until they’ve already been thoroughly confused by the first message (since people tend to read email/IRC in a linear fashion).
- There are a lot of things in this bucket; another example is being able to control, in a more fine-grained manner, what email you’re going to get.
- Holy crap, formatting and typography to make things actually readable and not a big block of plaintext.
* We don’t have a clear way for users to get help, leaving users to treat random issues, discussion areas, etc. as a support forum, rather than some place that’s actually optimized for that. Some of this ties back into some of the earlier things too, where it’s hard to actually redirect discussions.
These aren’t *new* problems, and often times the existing members of a community are the least affected, because they’ve already spent effort learning the ins and outs and curating a (typically custom) workflow that they’ve grown accustomed to. The problem is that this often means that new users are left out, and the community gets smaller and smaller as time goes on, as people leave and aren’t replaced with new blood because they’re driven off by the issues with the stack.
A big part of the place this is coming from, is me sitting back and realizing that I tend to be drawn towards pulling discussions into Github issues rather than onto the varying mailing lists, not because that’s always the most appropriate place for it, but because it’s the least painful place in terms of features and functionality. I figure if I’m doing that, when I already have a significant investment in setting up tooling and being involved here, that others (and particularly new users) are likely feeling the same way.
- Donald
Hi all,
PyPI does not allow duplicate file names -- this makes lots of sense,
because you really don't want people to go to PyPI one day and grab a file,
then go there another day, grab a file with exactly the same name, and
have it be a different file.
However....
We are all too human, and make mistakes when doing a release. All too often
someone pushes a broken file up to PyPI and often realizes it pretty quickly
-- before anyone has had a chance to even download it (or only the dev team
has, for testing...).
In fact, I was in a sprint last summer, and we decided to push our package
up to PyPI -- granted, we were all careless amateurish noobs, but we ended
up making, I think, 4 (!) minor version bumps because we had done something
stupid in the sdist.
Also -- the latest numpy release got caught in this, too:
"""
* We ran into a problem with pipy not allowing reuse of filenames and a
resulting proliferation of *.*.*.postN releases. Not only were the names
getting out of hand, some packages were unable to work with the postN
suffix.
"""
So -- I propose that PyPI allow projects to replace existing files if they
REALLY REALLY want to.
You should have to jump through all sorts of hoops, and make it really
clear that it is a BAD IDEA in the general case, but it'd be good to have
it be possible.
After all -- PyPI does not take on responsibility for anything else about
what's in those packages, and Python itself is all about "we're all
consenting adults here".
I suppose we could even put in some heuristics about how long the file has
been there, how many times it's been downloaded, etc.
Just a thought.....I really hate systems that don't let me roll back
mistakes, even when I discover them almost immediately...
-CHB
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker(a)noaa.gov
I've got two projects: mynamespace.myprojectA and mynamespace.myprojectB
myprojectB depends on myprojectA. I'm using setuptools 0.6c8 to manage both
projects.
Both projects are registered using 'setup.py develop'. Both projects are
accessible from an interactive interpreter:
PS C:\Users\me\projects> python
Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)]
on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import mynamespace.myprojectA
>>> import mynamespace.myprojectB
>>> from mynamespace.myprojectA import mymoduleZ
However, when I run 'setup.py test' in myprojectB, the tests fail with
File ".mymoduleZ.py", line NNN, in [some context]
from mynamespace.myprojectA.mymoduleZ import MyClassC
ImportError: No module named myprojectA.mymoduleZ
In setup.py, the test_suite is nose.collector.
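For completeness, both projects use what I believe is the standard setuptools
namespace-package layout -- roughly the following (a paraphrase, not the exact files):

    # setup.py for mynamespace.myprojectA (myprojectB is analogous) -- sketch only
    from setuptools import setup, find_packages

    setup(
        name='mynamespace.myprojectA',
        version='1.0dev',
        packages=find_packages(),
        namespace_packages=['mynamespace'],
        test_suite='nose.collector',
    )

    # mynamespace/__init__.py in each project contains only the namespace declaration:
    #     __import__('pkg_resources').declare_namespace(__name__)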
I searched and couldn't find anyone else with this problem. Is this a
supported configuration? Is there something I can do to make tests work
with interdependent projects with the same root namespace?
If there's not something obvious I should be doing differently, I'm happy to
put together a minimal test case that reproduces the problem. Any
suggestions are appreciated.
Sincerely,
Jason R. Coombs
Hello everyone,
I am a research programmer at the NYU School of Engineering. My colleagues
(Trishank Kuppusamy and Justin Cappos) and I are requesting community
feedback on our proposal, "Surviving a Compromise of PyPI." The two-stage
proposal can be reviewed online at:
PEP 458: http://legacy.python.org/dev/peps/pep-0458/
PEP 480: http://legacy.python.org/dev/peps/pep-0480/
Summary of the Proposal:
"Surviving a Compromise of PyPI" proposes how the Python Package Index
(PyPI) can be amended to better protect end users from altered or malicious
packages, and to minimize the extent of PyPI compromises against affected
users. The proposed integration allows package managers such as pip to be
more secure against various types of security attacks on PyPI and defend
end users from attackers responding to package requests. Specifically,
these PEPs describe how PyPI processes should be adapted to generate and
incorporate repository metadata, which are signed text files that describe
the packages and metadata available on PyPI. Package managers request
(along with the packages) the metadata on PyPI to verify the authenticity
of packages before they are installed. The changes to PyPI and tools will
be minimal by leveraging a library, The Update Framework
<https://github.com/theupdateframework/tuf>, that generates and
transparently validates the relevant metadata.
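To give a rough feel for the mechanics, the shape of the check a package manager
performs is sketched below; this is an illustration only, not TUF's actual API or
metadata format:

    # Illustration only -- not TUF's API; the metadata layout here is made up.
    import hashlib
    import json

    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

    def verify_download(metadata_bytes, signature, trusted_key_bytes,
                        package_bytes, package_filename):
        # 1. Check that the repository metadata was signed by a key we already trust.
        key = Ed25519PublicKey.from_public_bytes(trusted_key_bytes)
        try:
            key.verify(signature, metadata_bytes)  # raises InvalidSignature on failure
        except InvalidSignature:
            raise RuntimeError("repository metadata has an invalid signature")

        # 2. Only then trust the hashes listed in that metadata.
        metadata = json.loads(metadata_bytes)
        expected_sha256 = metadata["targets"][package_filename]["sha256"]

        # 3. Refuse to install a package that does not match the signed metadata.
        if hashlib.sha256(package_bytes).hexdigest() != expected_sha256:
            raise RuntimeError("downloaded package does not match the signed metadata")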
The first stage of the proposal (PEP 458
<http://legacy.python.org/dev/peps/pep-0458/>) uses a basic security model
that supports verification of PyPI packages signed with cryptographic keys
stored on PyPI, requires no action from developers and end users, and
protects against malicious CDNs and public mirrors. To support continuous
delivery of uploaded packages, PyPI administrators sign for uploaded
packages with an online key stored on PyPI infrastructure. This level of
security prevents packages from being accidentally or deliberately tampered
with by a mirror or a CDN because the mirror or CDN will not have any of
the keys required to sign for projects.
The second stage of the proposal (PEP 480
<http://legacy.python.org/dev/peps/pep-0480/>) is an extension to the basic
security model (discussed in PEP 458) that supports end-to-end verification
of signed packages. End-to-end signing allows both PyPI and developers to
sign for the packages that are downloaded by end users. If the PyPI
infrastructure were to be compromised, attackers would be unable to serve
malicious versions of these packages without access to the project's
developer key. As in PEP 458, no additional action is required by end
users. However, PyPI administrators will need to periodically (perhaps
every few months) sign metadata with an offline key. PEP 480 also proposes
an easy-to-use key management solution for developers, how to interface
with a potential build farm on PyPI infrastructure, and discusses the
security benefits of end-to-end signing. The second stage of the proposal
simultaneously supports real-time project registration and developer
signatures, and when configured to maximize security on PyPI, less than 1%
of end users will be at risk even if an attacker controls PyPI and goes
undetected for a month.
We thank Nick Coghlan and Donald Stufft for their valuable contributions,
and Giovanni Bajo and Anatoly Techtonik for their feedback.
Thanks,
PEP 458 & 480 authors.
As a new Twine maintainer I've been running into questions like:
* Now that Warehouse doesn't use "register" anymore, can we deprecate it from distutils, setuptools, and twine? Are any other package indexes or upload tools using it? https://github.com/pypa/twine/issues/311
* It would be nice if Twine could depend on a package index providing an HTTP 201 response in response to a successful upload, and fail on 200 (a response some non-package-index servers will give to an arbitrary POST request).
I do not see specifications to guide me here, e.g., in the official guidance on hosting one's own package index, https://packaging.python.org/guides/hosting-your-own-index/. PEP 301 was long enough ago that it's due for an update, and PEP 503 only concerns browsing and download, not upload.
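For concreteness, what an upload client does today looks roughly like the sketch
below; the endpoint and form fields are my reading of current twine/Warehouse
behaviour rather than a documented contract, which is exactly the gap I would like
the PEP to close:

    # Rough sketch of the de-facto upload protocol -- undocumented today, hence this proposal.
    import requests

    def upload_file(path, name, version, username, password):
        with open(path, "rb") as f:
            response = requests.post(
                "https://upload.pypi.org/legacy/",   # Warehouse's upload endpoint
                data={
                    ":action": "file_upload",        # legacy distutils-era action field
                    "protocol_version": "1",
                    "name": name,
                    "version": version,
                    "filetype": "sdist",
                },
                files={"content": (path, f)},
                auth=(username, password),
            )
        # Today a client cannot assume whether a real index answers 200 or 201,
        # which is the ambiguity described in the second bullet above.
        response.raise_for_status()
        return response.status_code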
I suggest that I write a PEP specifying an API for uploading to a Python package index. This PEP would partially supersede PEP 301 and would document the Warehouse reference implementation. I would write it in collaboration with the Warehouse maintainers who will develop the reference implementation per pypa/warehouse/issues/284 and maybe add a header referring to compliance with this new standard. And I would consult with the maintainers of packaging and distribution tools such as zest.releaser, flit, poetry, devpi, pypiserver, etc.
Per Nick Coghlan's formulation, my specific goal here would be close to:
> Documenting what the current upload API between twine & warehouse actually is, similar to the way PEP 503 focused on describing the status quo, without making any changes to it. That way, other servers (like devpi) and other upload clients have the info they need to help ensure interoperability.
Since Warehouse is trying to redo its various APIs in the next several months, I think it might be more useful to document and work with the new upload API, but I'm open to feedback on this.
After a little conversation here on distutils-sig, I believe my steps would be:
1. start a very early PEP draft with lots of To Be Determined blanks, submit as a PR to the python/peps repo, and share it with distutils-sig
2. ping maintainers of related tools
3. discuss with others at the packaging sprints https://wiki.python.org/psf/PackagingSprints next week
4. revise and get consensus, preferably mostly on this list
5. finalize PEP and get PEP accepted by BDFL-Delegate
6. coordinate with PyPA, maintainers of `distutils`, maintainers of packaging and distribution tools, and documentation maintainers to implement PEP compliance
Thoughts are welcome. I originally posted this at https://github.com/pypa/packaging-problems/issues/128 .
--
Sumana Harihareswara
Changeset Consulting
https://changeset.nyc
I am fairly sure that if you give the PyPA that suggestion, they will just deflate at the thought of the workload. Besides, there are already several free ways to run a private repo, ranging from devpi to python -m SimpleHTTPServer in a specially created directory.
From: Python-ideas <python-ideas-bounces+tritium-list=sdamon.com(a)python.org> On Behalf Of Nick Humrich
Sent: Wednesday, April 4, 2018 12:26 PM
To: python-ideas(a)python.org
Subject: [Python-ideas] Pypi private repo's
I am sure this has been discussed before, and this might not even be the best place for this discussion, but I just wanted to make sure this has been thought about.
What if pypi.org supported private repos at a cost, similar to npm?
This would be able to help support the cost of pypi, and hopefully make it better/more reliable, thus in turn improving the python community.
If this discussion should happen somewhere else, let me know.
Nick
Hello,
Surprisingly, the manylinux1 spec doesn't seem to include the zlib in the list of known-to-be-available libraries (are there GNU/Linux systems out there without a zlib installed?).
Since I'm assuming several packages already had a need for that, is there a recommended way to link in the zlib as part of a manylinux1 wheel? Would you recommend static linking with a private version, or dynamic linking?
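(For reference, by "dynamic linking" I mean the ordinary extension setup, roughly as
below; the package and file names are placeholders. My understanding is that, since
libz is not on the manylinux1 whitelist, auditwheel repair would then graft a copy of
it into the wheel, which ends up close to static linking in effect anyway.)

    # Sketch of the dynamic-linking option: link against zlib the usual way.
    from setuptools import setup, Extension

    setup(
        name="mypackage",                      # placeholder name
        version="0.1",
        ext_modules=[
            Extension(
                "mypackage._compress",
                sources=["src/_compress.c"],
                libraries=["z"],               # -lz: link against the system zlib
            )
        ],
    )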
Regards
Antoine.
On 2018-09-14 12:55, Alex Grönholm wrote:
> I'm curious: what data does it attempt to install and where? Have you
> created a ticket for this somewhere?
The OP mentioned absolute paths. However, it really sounds like a bad
idea to hard-code an absolute installation path. Let's consider it a
feature that wheel doesn't support that.
See https://github.com/pypa/wheel/issues/92
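For what it's worth, the pattern under discussion is presumably something along
these lines (a hypothetical example; the original poster's actual setup isn't shown
in this thread):

    # Hypothetical example of the pattern being discussed: data_files with an absolute target.
    from setuptools import setup

    setup(
        name="someapp",                               # hypothetical project
        version="1.0",
        py_modules=["someapp"],
        data_files=[
            # An absolute path like this may be honoured by a direct sdist install,
            # but a wheel has no sanctioned place to record it.
            ("/etc/someapp", ["conf/someapp.conf"]),
        ],
    )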
Now that the basic wheels/pip/PyPI infrastructure is mostly
functional, there's been a lot of interest in improving higher-level
project workflow. We have a lot of powerful tools for this –
virtualenv, pyenv, conda, tox, pipenv, poetry, ... – and more in
development, like PEP 582 [1], which adds support for project-local
package directories (`__pypackages__/`) directly to the interpreter.
But to me it feels like right now, Python workflow tools are like the
blind men and the elephant [2]. Each group sees one part of the
problem, and so we end up with one set of people building legs,
another a trunk, a third some ears... and there's no overall plan for
how they can fit together.
For example, PEP 582 is trying to solve the problem that virtualenv is
really hard to use for beginners just starting out [3]. This is a
serious problem! But I don't want a solution that *only* works for
beginners starting out, so that once they get a little more
sophisticated they have to throw it out and learn something new from
scratch.
So I think now might be a time for a bit of top-down design. **I want
a picture of the elephant.** If we had that, maybe we could see how
all these different ideas could be put together into a coherent whole.
So at the Python core sprint a few weeks ago, I dragged some
interested parties [4] into a room with a whiteboard [5], and we made
a start at it. And now I'm writing it up to share with you all.
This is very much a draft, intended as a seed for discussion, not a conclusion.
[1] https://www.python.org/dev/peps/pep-0582/
[2] https://en.wikipedia.org/wiki/Blind_men_and_an_elephant
[3] https://www.python.org/dev/peps/pep-0582/#motivation
[4] I won't try to list names, because I know I'll forget someone, and
I don't know if everyone would agree with everything I wrote there.
But thank you all!
[5] https://photos.app.goo.gl/4HfY8P3ESPNi9oLMA, including special
guest appearance by Kushal's elbow
# The idealized lifecycle of a Python project
## 1. Beginner
Everyone starts out as a rank beginner. This may be the first time
they have programmed at all. At this stage, users want to:
- install *one* thing to get started (e.g. python itself)
- write and run simple scripts (standalone .py files)
- run a REPL
- install and use PyPI packages like requests or numpy
- install and use tools like jupyter
- their IDE should also be able to find these packages/tools
Over time, they'll probably end up with multiple scripts, and maybe
want to organize them into subdirectories. The above should all work
from subdirectories.
## 2. Sharing with others
Now we have a neat little script. Or maybe we've made a pretty jupyter
notebook that computes some crucial business analytics. We want to
share it with our friends or coworkers. We still need the features
above; and now we also care about:
- version control
- some way for our friend to reconstruct, on their computer:
- the same PyPI packages that we were using
- the same tools that we were using
- the ways we invoked those tools
This last point is important: as projects grow in complexity, and are
used by a wider audience, they often end up with fairly complex tool
specifications that have to be shared among a team. For example:
- to run tests: in an environment that has pytest, pytest-cov, and
pytest-trio installed, and with our project working directory on
PYTHONPATH, run `pytest -Werror --cov ...`
- to format code: in an environment using python 3.6 or later, that
has black installed, run `black -l 79 *.py my-util-directory/*.py`
This kind of tool specification also puts us in a good position to set
up CI when we reach that point.
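For concreteness, today the closest thing to a standard home for that kind of
shared specification is probably a tox.ini along these lines -- an illustration of
the idea, not a recommendation:

    # Illustration: one existing way to write down the shared tool specification above.
    [tox]
    envlist = py36

    [testenv]
    deps =
        pytest
        pytest-cov
        pytest-trio
    setenv =
        PYTHONPATH = {toxinidir}
    commands =
        pytest -Werror --cov {posargs}

    [testenv:format]
    basepython = python3.6
    deps = black
    # tox doesn't expand shell globs, so point black at paths rather than *.py patterns
    commands =
        black -l 79 .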
At this point our project can grow in a few different directions.
## 3a. Deployable webapp
This adds the requirement to "deploy". I think this is mostly covered
by the set-up-an-environment-to-run-a-command functionality already
described? I'm not super familiar with this, but it's pipenv's core
target, and pipenv doesn't have much more than that, so I assume
that's about right...
## 3b. Reusable library
For this we also need to:
- Build sdists and wheels
- Which means: pyproject.toml, and some way to invoke it
- Install our library into our environments
- Including dependency locking (best practice is to not pin
dependencies in wheel metadata, but to pin all dependencies in CI; so
there needs to be some way to track those separately, but integrated
enough that it's not a huge ceremony to add or change a dependency; a
quick sketch follows this list)
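To illustrate the locking point just above: the library's own metadata keeps the
loose constraint, while the fully pinned set lives in a separately tracked file,
something like:

    # Loose constraint in the library's own metadata (what ends up in the wheel)...
    from setuptools import setup

    setup(
        name="mylib",                        # hypothetical library
        version="1.0",
        install_requires=["requests >= 2.18"],
    )

    # ...versus the fully pinned set used in CI, kept in a separate lock file,
    # e.g. a requirements-lock.txt containing exact versions such as:
    #     requests==2.19.1
    #     idna==2.7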
## 3c. Reusable standalone app
I think this is pretty much like the "Reusable library", except that
it'd be nice to have better tools to build/distribute standalone
applications. But if we had them, we could invoke them the same way as
we invoke other build systems?
# How do existing tools/proposals fit into this picture?
pyenv, virtualenv, and conda all solve parts of the "create an
environment" problem, but consider the other aspects out-of-scope.
tox solves the problem of keeping a shared record of how to run a
bunch of different tools in the appropriate environments, but doesn't
handle pinning or procuring appropriate python versions, and requires
a separate bootstrapping step to install tox.
`__pypackages__` (if implemented) makes it very easy for beginners to
use PyPI packages in their own scripts and from the REPL; in
particular, it would be part of python, so it meets the "install *one*
thing" criterion. But, it doesn't provide any way to run tools.
(There's no way to put `__pypackages__/bin` on PATH.) It doesn't allow
scripts to be organized into subdirectories. (For security reasons, we
can't have the python interpreter going off walking the filesystem
looking for `__pypackages__/`, so the PEP specifies that
`__pypackages__/` has to be in the same directory as the script that
uses it.) There's no way to share your `__pypackages__` environment
with a friend. So... it seems like something that people would
outgrow very quickly.
pipenv and poetry are interesting. Their basic strategy is to say,
there is a top-level command that acts as your entry point to
performing workflow actions on a python project (`pipenv` or
`poetry`, respectively). And this strategy at least in principle can
solve the problems that `__pypackages__/` runs into. In particular, it
doesn't rely on `$PATH`, so it can run tools; and because it's a
dedicated project management tool, it can go looking for the project
marker file.
# A fantastic elephant
So if our idealized user had an idealized tool, what would that look like?
They'll be interacting with Python through a dedicated tool, similar
to pipenv or poetry. In my little fantasy here I'll call it `pyp`,
because (a) I want to be neutral, (b) 6 characters is too long.
To get this tool, either they install Python (via python.org download,
apt, homebrew, whatever), and the tool is automatically included. Or
else, they install the tool directly, and it has the ability to
install Python interpreters when needed.
Once they have the tool, they start by making a new directory for
their project (this way they're ready to switch to version control
later).
Then they somehow mark this directory as being a "python project
root". I guess the UI would be something like `pyp new <name>` and it
just does it for you, but we have to figure out what this creates on
disk. We need some sort of marker file. Files that currently serve
this kind of role include tox.ini, Pipfile, pyproject.toml,
__pypackages__, ... But only one of these is a standard thing we're
already committed to sticking with, so, pyproject.toml it is. Let's
make it the marker for any python project, not just redistributable
libraries. (And if we do grow up into a redistributable library, then
we're already prepared.)
In the initial default configuration, there's a single default
environment. You can install things with `pyp install ...` or `pyp
uninstall ...`, and it tracks the requested packages in some
standardized way in pyproject.toml, and also pins specific versions
somewhere (could be pyproject.toml again I guess, or poetry's
pyproject.lock would work too). This way when we decide to share our
project later, our friends can recreate our environment on their
system.
However, there's also the capability to configure multiple custom
execution environments, including python version and installed
packages. And the capability to configure new aliases like `pyp test`
or `pyp reformat`, which run some specified command in a specified
environment.
Since the install/locking metadata is all standardized, you can even
switch between competing tools, and integrate with third-party tools
like pyup.io.
For redistributable libraries, we also need some way to get the wheel
metadata and the workflow metadata to play nicely together. Maybe this
means that we need a standardized install-requires field in
pyproject.toml, so that build backends and workflow tools have a
shared source of truth?
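To make that concrete, a pyproject.toml for this fantasy tool might look something
like the sketch below; every table and key in it is hypothetical, nothing here is
standardized today:

    # Entirely hypothetical sketch -- none of these tables or keys is a standard today.
    [project]
    name = "mylib"
    requires = ["requests >= 2.18"]      # the hypothetical standardized install-requires

    [tool.pyp.environments]
    default = { python = "3.7" }
    test = { python = "3.7", extra-packages = ["pytest", "pytest-cov", "pytest-trio"] }

    [tool.pyp.commands]
    test = { environment = "test", run = "pytest -Werror --cov" }
    reformat = { environment = "default", run = "black -l 79 ." }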
# What's wrong with pipenv?
Since pipenv is the tool that those of us in the room were most
familiar with, that comes closest to matching this vision, we
brainstormed a list of complaints about it. Some of these are more
reasonable than others.
- Not ambitious enough. This is a fuzzy sort of thing, but perception
matters, and it's right there in the name: it's a tool to use pip, to
manage an environment. If we're reconceiving this as the grand unified
entryway to all of Python, then the name starts to feel pretty weird.
The whole thing where it's only intended to work for webapp-style
projects would have to change.
- Uses Pipfile as a project marker instead of pyproject.toml.
- Not shipped with Python. (Obviously not pipenv's fault, but nonetheless.)
- Environments should be stored in project directory, not off in $HOME
somewhere. (Not sure what this is about, but some of the folks present
were quite insistent.)
- Environments should be relocatable.
- Hardcoded to only support "default" and "dev" environments, which is
insufficient.
- No mechanism for sharing prespecified commands like "run tests" or "reformat".
- Can't install Python. (There's... really no reason we *couldn't*
distribute pre-built Python interpreters on PyPI? Between the
python.org installers and the manylinux image, we're already building
redistributable run-anywhere binaries for the most popular platforms
on every Python release; we just aren't zipping them up and putting
them on PyPI.)
-n
--
Nathaniel J. Smith -- https://vorpus.org