I wrote down a thought about server-side dependency resolution and a virtualenv build server.
What do you think?
Latest version: https://github.com/guettli/virtualenv-build-server
virtualenv-build-server
#######################
A rough roadmap for how a server that builds virtualenvs for the Python programming language could be implemented.
High-level goal
---------------
Make creating new virtual environments for the Python programming language easy and fast.
Input: fuzzy requirements like this: django>=1.8, requests>=2.7
Output: virtualenv with packages installed.
Two APIs
------------
#. Resolve fuzzy requirements to a fixed set of packages with exactly pinned versions.
#. Read the fixed set of packages. Build a virtualenv for the given platform.
Steps
-----
#. Client sends a list of fuzzy requirements to the server:
* I need: django>=1.8, requests>=2.7, ...
#. Server resolves the fuzzy requirements to a fixed set of requirements: django==1.8.2, requests==2.8.1, ...
#. Client reads the fixed set of requirements.
#. Optional: Client sends the fixed set of requirements to the server, telling it the platform:
* My platform: sys.version==2.7.6 and sys.platform=linux2
#. Server builds a virtualenv according to the fixed set of requirements.
#. Server sends the environment to the client
#. Client unpacks the data and has a usable virtualenv
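For illustration, the client side of this exchange might look roughly like the
sketch below; the endpoint names, payload shapes, and server URL are invented
for the example and are not part of the proposal::

    # Illustrative client sketch; endpoints and payloads are hypothetical.
    import requests

    SERVER = "https://build-server.example"

    # Steps 1-3: send fuzzy requirements, read back pinned requirements.
    pinned = requests.post(SERVER + "/resolve",
                           json={"requirements": ["django>=1.8", "requests>=2.7"]}).json()
    # e.g. {"requirements": ["django==1.8.2", "requests==2.8.1"]}

    # Steps 4-7: send pinned requirements plus platform, receive a packed virtualenv.
    resp = requests.post(SERVER + "/build",
                         json={"requirements": pinned["requirements"],
                               "platform": {"sys.version": "2.7.6",
                                            "sys.platform": "linux2"}})
    with open("venv.tar.gz", "wb") as f:
        f.write(resp.content)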
Benefits
--------
Speed:
* There is only one round trip from client to server. If the dependencies were resolved on the client, the client would first need to download the available version information.
* Caching: If the server gets input parameters (fuzzy requirements and platform information) which it has seen before, it can return the cached result from the previous request.
Possible Implementations
------------------------
APIs
====
Both APIs could be implemented as a web service/REST interface passing JSON or YAML.
Serverside
==========
Implementation Strategy "PostgreSQL"
.....................................
Since the API is decoupled from the internals, the implementation could be exchanged without requiring changes on the client side.
I suggest using PostgreSQL and resolving the dependency graph with SQL (WITH RECURSIVE).
The package and version data gets stored in PostgreSQL via an ORM (Django or SQLAlchemy).
The version numbers need to be normalized to ASCII to allow fast comparison.
Related: https://www.python.org/dev/peps/pep-0440/
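As a very rough sketch of the recursive part, the transitive dependency
closure could be walked like this, assuming a hypothetical table
``dependency(parent_name, child_name)`` and ignoring version constraints,
which a real resolver would of course have to handle::

    # Sketch only: walk the dependency graph with WITH RECURSIVE (psycopg2).
    import psycopg2

    CLOSURE_SQL = """
    WITH RECURSIVE closure(name) AS (
        SELECT %(root)s::text
      UNION
        SELECT d.child_name
        FROM dependency d
        JOIN closure c ON d.parent_name = c.name
    )
    SELECT name FROM closure;
    """

    conn = psycopg2.connect("dbname=packages")
    with conn.cursor() as cur:
        cur.execute(CLOSURE_SQL, {"root": "django"})
        print([row[0] for row in cur.fetchall()])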
Implementation Strategie "Node.js"
..................................
I like Python, but I am not married to it. Why not use a different tool that is already working? Maybe the Node package manager: https://www.npmjs.com/
Questions
---------
Are virtualenvs relocatable? AFAIK they are not.
General Thoughts
----------------
* Ignore updates; focus on creating new virtualenvs. The server can do caching, which is why I prefer virtualenvs that never get updated: they get created and removed (immutable).
I won't implement it
--------------------
This idea is in the public domain. If you are young and brave or old and wise: go ahead, try to implement it. Please communicate early and often. Ask on mailing lists or ask me for feedback. Good luck :-)
I love feedback
---------------
Please tell me what you like or dislike:
* typos and spelling stuff (I am not a native speaker)
* alternative implementation strategies.
* existing software which does this (even if implemented in a different programming language).
* ...
--
http://www.thomas-guettler.de/
On Sun, Nov 8, 2015 at 5:28 PM, Robert Collins
<robertc(a)robertcollins.net> wrote:
> +The use of a command line API rather than a Python API is a little
> +contentious. Fundamentally anything can be made to work, and Robert wants to
> +pick something thats sufficiently lowest common denominator that
> +implementation is straight forward on all sides. Picking a CLI for that makes
> +sense because all build systems will need a CLI for end users to use anyway.
I agree that this is not terribly important, and anything can be made
to work. Having pondered it all for a few more weeks though I think
that the "entrypoints-style" interface actually is unambiguously
better, so let me see about making that case.
What's at stake?
----------------------
Option 1, as in Robert's PEP:
The build configuration file contains a string like "flit
--dump-build-description" (or whatever), which names a command to run,
and then a protocol for running this command to get information on the
actual build system interface. Build operations are performed by
executing these commands as subprocesses.
Option 2, my preference:
The build configuration file contains a string like
"flit:build_system_api" (or whatever) which names a Python object
accessed like
import flit
flit.build_system_api
(This is the same syntax used for naming entry points.) This object would
then have attributes and methods describing the actual build system
interface. Build operations are performed by calling these methods.
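For illustration only, a backend exposing such an object might look roughly
like this; the attribute and method names are made up (this is not flit's
real code), since the interface itself is exactly what's being designed:

    # flit/__init__.py -- purely hypothetical sketch of an "entrypoints-style" backend
    class _BuildSystemAPI:
        api_version = 1

        def get_build_requires(self):
            # declare what must be installed before building
            return ["flit"]

        def build_wheel(self, output_dir):
            # invoke the real build machinery here
            raise NotImplementedError

    build_system_api = _BuildSystemAPI()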
Why does it matter?
----------------------------
First, to be clear: I think that no matter which choice we make here,
the final actual execution path is going to end up looking very
similar. Because even if we go with the entry-point-style Python
hooks, the build frontends like pip will still want to spawn a child
to do the actual calls -- this is important for isolating pip from the
build backend and the build backend from pip, it's important because
the build backend needs to execute in a different environment than pip
itself, etc. So no matter what, we're going to have some subprocess
calls and some IPC.
The difference is that in the subprocess approach, the IPC machinery
is all written into the spec, and build frontends like pip implement
one half while build backends implement the other half. In the Python
API approach, the spec just specifies the Python calling conventions,
and both halves of the IPC code live inside each build frontend.
Concretely, the way I imagine this would work is that pip would set up
the build environment, and then it would run
build-environment/bin/python path/to/pip-worker-script.py <args>
where pip-worker-script.py is distributed as part of pip. (In simple
cases it could simply be a file inside pip's package directory; if we
want to support execution from pip-inside-a-zip-file then we need a
bit of code to unpack it to a tempfile before executing it. Creating a
tempfile is not a huge additional burden given that by the time we
call build hooks we will have already created a whole temporary python
environment...)
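A minimal sketch of what such a worker script might look like; the argument
convention here (backend spec, hook name, JSON file of keyword arguments) is
purely an assumption for illustration, not anything pip actually defines:

    # pip-worker-script.py -- illustrative sketch, not pip's real code
    import importlib
    import json
    import sys

    def main():
        backend_spec, hook_name, args_file = sys.argv[1:4]
        module_name, _, attribute = backend_spec.partition(":")
        backend = importlib.import_module(module_name)
        if attribute:
            backend = getattr(backend, attribute)
        with open(args_file) as f:
            kwargs = json.load(f)
        result = getattr(backend, hook_name)(**kwargs)
        json.dump(result, sys.stdout)

    if __name__ == "__main__":
        main()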
In the subprocess approach, we have to write a ton of text describing
all the intricacies of IPC. We have to specify how the command line
gets split (or is it passed to the shell?), and specify a JSON-based
protocol, and what happens to stdin/stdout/stderr, and etc. etc. In
the Python API approach, we still have to do all the work of figuring
these things out, but they would live inside pip's code, instead of in
a PEP. The actual PEP text would be much smaller.
It's not clear which approach leads to smaller code overall. If there
are F frontends and B backends, then in the subprocess approach we
collectively have to write F+B pieces of IPC code, and in the Python
API approach we collectively have to write 2*F pieces of IPC code. So
on this metric the Python API is a win if F < B, which would happen if
e.g. everyone ends up using pip for their frontend but with lots of
different backends, which seems plausible? But who knows.
But now suppose that there's some bug in that complicated IPC protocol
(which I would rate as about a 99.3% likelihood in our first attempt,
because cross-platform compatible cross-process IPC is super annoying
and fiddly). In the subprocess approach, fixing this means that we
need to (a) write a PEP, and then (b) fix F+B pieces of code
simultaneously on some flag day, and possibly test F*B combinations
for correct interoperation. In the Python API approach, fixing this
means patching whichever frontend has the bug, no PEPs or flag days
necessary.
In addition, the ability to evolve the two halves of the IPC channel
together allows for better efficiency. For example, in Robert's
current PEP there's some machinery added that hopes to let pip cache
the result of the "--dump-build-description" call. This is needed
because in the subprocess approach, the minimum number of subprocess
calls you need to do something is two: one to ask what command to
call, and a second to actually execute the command. In the python API
approach, you can just go ahead and spawn a subprocess that knows what
method it wants to call, and it can locate that method and then call
it in a single shot, thus avoiding the need for an error-prone caching
scheme.
And the flexibility helps in the face of future changes, too.
Like, suppose that we start out with a do_build hook, and then later
add a do_build2 hook that takes an extra argument or something, and
pip wants to call do_build2 if it exists, and fall back on do_build
otherwise. In the subprocess approach, you have to get the build
description, check which hooks are provided, and then once you've
decided which one you want to call you can spawn a second subprocess
to do that. In the python API approach, pip can move this fallback
logic directly into its hook-calling worker. (If it wants to.) So it
still avoids the extra subprocess call.
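Concretely, that fallback could be a few lines inside pip's worker; do_build
and do_build2 are just the hypothetical hook names from above:

    def call_build_hook(backend, build_dir, extra_argument=None):
        # Prefer the newer hook if the backend provides it, else fall back.
        if hasattr(backend, "do_build2"):
            return backend.do_build2(build_dir, extra_argument)
        return backend.do_build(build_dir)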
Finally, I think that it probably is nicer for pip to bite the bullet
and take on more of the complexity budget here in order to make things
simpler for build backends, because pip is already a highly complex
project that undergoes lots of scrutiny from experts, which is almost
certainly not going to be as true for all build backends. And the
Python API approach is dead simple to explain and implement for the
build backend side. I understand that the pip devs who are reading
this might disagree, which is why I also wrote down the (IMO) much
more compelling arguments above :-). But hey, still worth
mentioning...
-n
--
Nathaniel J. Smith -- http://vorpus.org
We just pushed devpi-{server,web,client,common} release files out to pypi.
Most notably, the private pypi package server allows much faster installs
due to much improved simple-page serving speed. See the changelog
below for a host of other changes and fixes as well as for compatibility
warnings.
Docs about the devpi system are to be found here:
http://doc.devpi.net
Many thanks to my co-maintainer Florian Schulze and particularly
to Stephan Erb and Chad Wagner for their contributions.
cheers,
holger
--
about me: http://holgerkrekel.net/about-me/
contracting: http://merlinux.eu
devpi-server 2.4.0 (2015-11-11)
-------------------------------
- NOTE: devpi-server-2.4 is compatible with data from devpi-server-2.3 but
not the other way round. Once you run devpi-server-2.4 you cannot go
back. It's always a good idea to make a backup before trying a new version :)
- NOTE: if you use "--logger-cfg" with .yaml files you will need to
install pyyaml yourself, as devpi-server-2.4 dropped it as a direct
dependency because it does not install on win32/python3.5 and is
not needed for devpi-server operation except for logging configuration.
Specifying a *.json file always works.
- add timeout to replica requests
- fix issue275: improve error message when a serverdir exists but has no
version
- improve testing mechanics and name normalization related to storing doczips
- refine keyfs to provide lazy deep readonly-views for
dict/set/list/tuple types by default. This introduces safety because
users (including plugins) of keyfs-values can only write/modify a value
by explicitly getting it with readonly=False (thereby deep copying it)
and setting it with the transaction. It also allows avoiding unnecessary
copy operations when just reading values.
- fix issue283: pypi cache didn't work for replicas.
- performance improvements for simple pages with lots of releases.
this also changed the db layout of the caching from pypi.python.org mirrors
but will seamlessly work on older data, see NOTE at top.
- add "--profile-requests=NUM" option which turns on per-request
profiling and will print out after NUM requests are executed
and then restart profiling.
- fix tests for pypy. We officially support pypy now.
devpi-client-2.3.2 (2015-11-11)
-------------------------------
- fix git submodules for devpi upload. ``.git`` is a file, not a folder, for
submodules. Before this fix, the repository containing the submodule was
found instead, which caused a failure because the files aren't tracked there.
- new option "devpi upload --setupdir-only" which will only
vcs-export the directory containing setup.py. You can also
set "setupdirs-only = 1" in the "[devpi:upload]" section
of setup.cfg for the same effect. Thanks Chad Wagner for the PR.
devpi-web 2.4.2 (2015-11-11)
----------------------------
- log exceptions during search index updates.
- adapted tests/code to work with devpi-server-2.4
devpi-common 2.0.8 (2015-11-11)
-------------------------------
- fix URL.joinpath to not add double slashes
We've got to a point where the original standing delegations to myself
and Richard Jones to act as BDFL-Delegates for metadata
interoperability and pypi.python.org related PEPs aren't scaling
adequately, so given Paul's recent delegation for PEP 470, and Donald
handling PEP 503 directly, it seems like an opportune time to put
something in writing about that.
For PyPA/distutils-sig specific PEPs, we've effectively adopted the
following approach to assigning BDFL-Delegates in resolving PEPs 470
and 503:
=================================
Whenever a new PEP is put forward on distutils-sig, any PyPA core
reviewer that believes they are suitably experienced to make the final
decision on that PEP may offer to serve as the BDFL's delegate (or
"PEP czar") for that PEP. If their self-nomination is accepted by the
other PyPA core reviewers, the lead PyPI maintainer, and the lead
CPython representative on distutils-sig, then they will have the
authority to approve (or reject) that PEP.
=================================
And translating the nominated roles to the folks currently filling
them: "lead PyPI maintainer" = Donald Stufft; "lead CPython
representative on distutils-sig" = me.
"PyPA core reviewer" isn't a term we've previously used, but I'm
aiming to capture "has approval rights for pull requests to one or
more of the PyPA maintained source code or documentation repos".
Some further details for the benefit of folks not aware of the relevant history:
* a couple of years ago, we amended PEP 1 to give the "Discussions-To"
header some additional force for PEPs which don't directly affect
CPython: """PEP review and resolution may also occur on a list other
than python-dev (for example, distutils-sig for packaging related PEPs
that don't immediately affect the standard library). In this case, the
"Discussions-To" heading in the PEP will identify the appropriate
alternative list where discussion, review and pronouncement on the PEP
will occur."""
* we *didn't* update the section about assignment of BDFL-Delegates.
Instead, I received a general delegation for packaging metadata
interoperability PEPs, and Richard Jones received one for
pypi.python.org related PEPs
* Richard subsequently passed the latter delegation on to Donald,
since Donald had taken over as the lead maintainer for PyPI
The section in PEP 1 for CPython BDFL-Delegates reads as follows:
=================================
However, whenever a new PEP is put forward, any core developer that
believes they are suitably experienced to make the final decision on
that PEP may offer to serve as the BDFL's delegate (or "PEP czar") for
that PEP. If their self-nomination is accepted by the other core
developers and the BDFL, then they will have the authority to approve
(or reject) that PEP.
=================================
This process can be appropriately described as "volunteering to be
blamed" - if a PEP from a less experienced contributor subsequently
proves to be a mistake, that's on the BDFL-Delegate for saying "Yes",
not on the PEP author for proposing it. Mostly though, it's so there's
someone to have the final say on the fiddly little details that go
into getting from a general concept to an actual implementation,
without getting mired down in design-by-committee on every incidental
detail.
As PEP authors, we'll also often ask someone else specifically to
volunteer as BDFL-Delegate, because we trust their judgement in
relation to the topic at hand (e.g. I asked Martin von Löwis to be
BDFL-Delegate for the original ensurepip PEP because I knew he was
skeptical of the idea, so a design that passed muster with him was
likely to have suitably addressed the ongoing maintainability
concerns. Guido did something similar when he asked Mark Shannon to be
BDFL-Delegate for PEP 484's gradual typing).
Regards,
Nick.
P.S. It's becoming clear to me that I should probably write a
companion PEP to PEP 1 that specifically describes distutils-sig's
usage of the PEP process (and how that differs from the normal CPython
processes), but hopefully this post provides sufficient clarification
for now.
--
Nick Coghlan | ncoghlan(a)gmail.com | Brisbane, Australia
Hi all,
Following the strategy of trying to break out the different
controversial parts of the new build system interface, here's some
proposed text defining the environment that a build frontend like pip
provides to a project-specific build backend.
Robert's PEP currently disclaims all of this as out-of-scope, but I
think it's good to get something down, since in practice we'll have to
figure something out before any implementations can exist. And I think
the text below pretty much hits the right points.
What might be controversial about this nonetheless is that I'm not
sure that pip *can* reasonably implement all the requirements as
written without adding a dependency on virtualenv (at least for older
pythons -- obviously this is no big deal for new pythons since venv is
now part of the stdlib). I think the requirements are correct, so...
Donald, what do you think?
-n
----
The build environment
---------------------
One of the responsibilities of a build frontend is to set up the
environment in which the build backend will run.
We do not require that any particular "virtual environment" mechanism
be used; a build frontend might use virtualenv, or venv, or no special
mechanism at all. But whatever mechanism is used MUST meet the
following criteria:
- All requirements specified by the project's build-requirements must
be available for import from Python.
- This must remain true even for new Python subprocesses spawned by
the build environment, e.g. code like::
import sys, subprocess
subprocess.check_call([sys.executable, ...])
must spawn a Python process which has access to all the project's
build-requirements. This is necessary e.g. for build backends that
want to run legacy ``setup.py`` scripts in a subprocess.
[TBD: the exact wording here will probably need some tweaking
depending on whether we end up using an entrypoint-like mechanism for
specifying build backend hooks (in which case we can assume that hooks
automatically have access to sys.executable), or a subprocess-based
mechanism (in which case we'll need some other way to communicate the
path to the python interpreter to the build backend, e.g. a PYTHON=
envvar). But the basic requirement is pretty much the same either
way.]
- All command-line scripts provided by the build-required packages
must be present in the build environment's PATH. For example, if a
project declares a build-requirement on `flit
<https://flit.readthedocs.org/en/latest/>`_, then the following must
work as a mechanism for running the flit command-line tool::
import subprocess
subprocess.check_call(["flit", ...])
A build backend MUST be prepared to function in any environment which
meets the above criteria. In particular, it MUST NOT assume that it
has access to any packages except those that are present in the
stdlib, or that are explicitly declared as build-requirements.
Recommendations for build frontends (non-normative)
...................................................
A build frontend MAY use any mechanism for setting up a build
environment that meets the above criteria. For example, simply
installing all build-requirements into the global environment would be
sufficient to build any compliant package -- but this would be
sub-optimal for a number of reasons. This section contains
non-normative advice to frontend implementors.
A build frontend SHOULD, by default, create an isolated environment
for each build, containing only the standard library and any
explicitly requested build-dependencies. This has two benefits:
- It allows for a single installation run to build multiple packages
that have contradictory build-requirements. E.g. if package1
build-requires pbr==1.8.1, and package2 build-requires pbr==1.7.2,
then these cannot both be installed simultaneously into the global
environment -- which is a problem when the user requests ``pip install
package1 package2``. Or if the user already has pbr==1.8.1 installed
in their global environment, and a package build-requires pbr==1.7.2,
then downgrading the user's version would be rather rude.
- It acts as a kind of public health measure to maximize the number of
packages that actually do declare accurate build-dependencies. We can
write all the strongly worded admonitions to package authors we want,
but if build frontends don't enforce isolation by default, then we'll
inevitably end up with lots of packages on PyPI that build fine on the
original author's machine and nowhere else, which is a headache that
no-one needs.
However, there will also be situations where build-requirements are
problematic in various ways. For example, a package author might
accidentally leave off some crucial requirement despite our best
efforts; or, a package might declare a build-requirement on `foo >=
1.0` which worked great when 1.0 was the latest version, but now 1.1
is out and it has a showstopper bug; or, the user might decide to
build a package against numpy==1.7 -- overriding the package's
preferred numpy==1.8 -- to guarantee that the resulting build will be
compatible at the C ABI level with an older version of numpy (even if
this means the resulting build is unsupported upstream). Therefore,
build frontends SHOULD provide some mechanism for users to override
the above defaults. For example, a build frontend could have a
``--build-with-system-site-packages`` option that causes the
``--system-site-packages`` option to be passed to
virtualenv-or-equivalent when creating build environments, or a
``--build-requirements-override=my-requirements.txt`` option that
overrides the project's normal build-requirements.
The general principle here is that we want to enforce hygiene on
package *authors*, while still allowing *end-users* to open up the
hood and apply duct tape when necessary.
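As a non-normative sketch, a frontend might set up such an isolated
environment on Python 3 with nothing but the standard library; the package
list and paths below are of course just placeholders::

    # Sketch: create an isolated build environment and install build-requirements.
    import subprocess, sys, tempfile, venv

    build_requires = ["flit"]  # taken from the project's declared build-requirements
    env_dir = tempfile.mkdtemp(prefix="build-env-")
    venv.EnvBuilder(with_pip=True).create(env_dir)
    python = env_dir + ("/Scripts/python.exe" if sys.platform == "win32"
                        else "/bin/python")
    subprocess.check_call([python, "-m", "pip", "install"] + build_requires)
    # The build backend's hooks would then be run using this interpreter,
    # with the environment's script directory prepended to PATH.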
--
Nathaniel J. Smith -- http://vorpus.org
Based on discussions in another thread [1], I've posted a PR to pypa.io for
a "PyPA Roadmap"
PR: https://github.com/pypa/pypa.io/pull/7
built version: http://pypaio.readthedocs.org/en/roadmap/roadmap/
To be clear, I'm not trying to dictate anything here, but rather just
trying to mirror what I think is going on for the sake of new (or old)
people, who don't have a full picture of the major todo items.
I'm asking for help to make this as accurate as possible and to keep it
accurate as our plans change.
thanks,
Marcus
[1]
https://mail.python.org/pipermail/distutils-sig/2015-October/027346.html ,
although it seems a number of emails in this thread never made it to the
archive due to the python mail server failure.
Hi all,
Here's a quick update to my draft PEP for a new build system
interface, last seen here:
https://mail.python.org/pipermail/distutils-sig/2015-October/027360.html
There isn't terribly much here, and Robert and I should really figure
out how to reconcile what we have, but since I was rearranging some
stuff anyway and prepping possible new sections, I figured I'd at
least post this. I addressed all the previous comments so hopefully it
is boring and non-controversial :-).
Changes:
- It wasn't clear that the sdist metadata stuff was really helping, so
I took it out for now. So it doesn't get lost, I split it out as a
standalone deferred-status PEP to the pypa repository:
https://github.com/pypa/interoperability-peps/pull/57
- Rewrote stuff to deal with Paul's comments
- Added new terminology: "build frontend" for something like pip, and
"build backend" for the project specific hooks that pip calls. Seems
helpful.
-n
----
PEP: ??
Title: A build-system independent format for source trees
Version: $Revision$
Last-Modified: $Date$
Author: Nathaniel J. Smith <njs(a)pobox.com>
Status: Draft
Type: Standards-Track
Content-Type: text/x-rst
Created: 30-Sep-2015
Post-History: 1 Oct 2015, 25 Oct 2015
Discussions-To: <distutils-sig(a)python.org>
Abstract
========
While ``distutils`` / ``setuptools`` have taken us a long way, they
suffer from three serious problems: (a) they're missing important
features like usable build-time dependency declaration,
autoconfiguration, and even basic ergonomic niceties like `DRY
<https://en.wikipedia.org/wiki/Don%27t_repeat_yourself>`_-compliant
version number management, and (b) extending them is difficult, so
while there do exist various solutions to the above problems, they're
often quirky, fragile, and expensive to maintain, and yet (c) it's
very difficult to use anything else, because distutils/setuptools
provide the standard interface for installing packages expected by
both users and installation tools like ``pip``.
Previous efforts (e.g. distutils2 or setuptools itself) have attempted
to solve problems (a) and/or (b). This proposal aims to solve (c).
The goal of this PEP is to get distutils-sig out of the business of being
a gatekeeper for Python build systems. If you want to use distutils,
great; if you want to use something else, then that should be easy to
do using standardized methods. The difficulty of interfacing with
distutils means that there aren't many such systems right now, but to
give a sense of what we're thinking about see `flit
<https://github.com/takluyver/flit>`_ or `bento
<https://cournape.github.io/Bento/>`_. Fortunately, wheels have now
solved many of the hard problems here -- e.g. it's no longer necessary
that a build system also know about every possible installation
configuration -- so pretty much all we really need from a build system
is that it have some way to spit out standard-compliant wheels and
sdists.
We therefore propose a new, relatively minimal interface for
installation tools like ``pip`` to interact with package source trees
and source distributions.
Terminology and goals
=====================
A *source tree* is something like a VCS checkout. We need a standard
interface for installing from this format, to support usages like
``pip install some-directory/``.
A *source distribution* is a static snapshot representing a particular
release of some source code, like ``lxml-3.4.4.zip``. Source
distributions serve many purposes: they form an archival record of
releases, they provide a stupid-simple de facto standard for tools
that want to ingest and process large corpora of code, possibly
written in many languages (e.g. code search), they act as the input to
downstream packaging systems like Debian/Fedora/Conda/..., and so
forth. In the Python ecosystem they additionally have a particularly
important role to play, because packaging tools like ``pip`` are able
to use source distributions to fulfill binary dependencies, e.g. if
there is a distribution ``foo.whl`` which declares a dependency on
``bar``, then we need to support the case where ``pip install bar`` or
``pip install foo`` automatically locates the sdist for ``bar``,
downloads it, builds it, and installs the resulting package.
Source distributions are also known as *sdists* for short.
A *build frontend* is a tool that users might run that takes arbitrary
source trees or source distributions and builds wheels from them. The
actual building is done by each source tree's *build backend*. In a
command like ``pip wheel some-directory/``, pip is acting as a build
frontend.
An *integration frontend* is a tool that users might run that takes a
set of package requirements (e.g. a requirements.txt file) and
attempts to update a working environment to satisfy those
requirements. This may require locating, building, and installing a
combination of wheels and sdists. In a command like ``pip install
lxml==2.4.0``, pip is acting as an integration frontend.
Source trees
============
We retroactively declare the legacy source tree format involving
``setup.py`` to be "version 0". We don't try to specify it further;
its de facto specification is encoded in the source code and
documentation of ``distutils``, ``setuptools``, ``pip``, and other
tools.
A "version 1" (or greater) source tree is any directory which contains
a file named ``pypackage.cfg``, which will -- in some manner whose
details are TBD -- describe the package's build dependencies and how
to invoke the project-specific build backend. This mechanism:
- Will allow for both static and dynamic specification of build dependencies
- Will have some degree of isolation of different builds from each
other, so that it will be possible for a single run of pip to install
one package that build-depends on ``foo == 1.1`` and another package
that build-depends on ``foo == 1.2``.
- Will leave the actual installation of the package in the hands of a
specialized installation tool like pip (i.e. individual package build
systems will not need to know about things like --user versus --global
or make decisions about when and how to modify .pth files)
[TBD: the exact set of operations to be supported and their detailed semantics]
[TBD: should builds be performed in a fully isolated environment, or
should they get access to packages that are already installed in the
target install environment? The former simplifies a number of things,
but Robert was skeptical it would be possible.]
[TBD: the form of the communication channel between an installation
tool like ``pip`` and the build system, over which these operations
are requested]
[TBD: the syntactic details of the configuration file format itself.
We can change the name too if we want, I just think it's useful to
have a single name to refer to it for now, and this is the last and
least interesting thing to figure out.]
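Purely as an illustration of the shape this could take (every detail below,
including the section and key names, is among the TBDs above and is not being
proposed here), such a file might eventually look something like::

    # pypackage.cfg -- hypothetical example only; syntax and names are TBD
    [build]
    requires = flit
    build-backend = flit:build_system_api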
Source distributions
====================
For now, we continue with the legacy sdist format which is mostly
undefined, but basically comes down to: a file named
{NAME}-{VERSION}.{EXT}, which unpacks into a buildable source tree.
Traditionally these have always contained "version 0" source trees; we
now allow them to also contain version 1+ source trees.
Integration frontends require that an sdist named
{NAME}-{VERSION}.{EXT} will generate a wheel named
{NAME}-{VERSION}-{COMPAT-INFO}.whl.
[TBD: whether we want to adopt a new sdist format along with this --
my read of the room is that it's sounding like people are leaning
towards deferring that for a separate round of standardization, but
we'll see what we think once some of the important details above have
been hammered out]
Evolutionary notes
==================
A goal here is to make it as simple as possible to convert old-style
sdists to new-style sdists. (E.g., this is one motivation for
supporting dynamic build requirements.) The ideal would be that there
would be a single static pypackage.cfg that could be dropped into any
"version 0" VCS checkout to convert it to the new shiny. This is
probably not 100% possible, but we can get close, and it's important
to keep track of how close we are... hence this section.
A rough plan would be: Create a build system package
(``setuptools_pypackage`` or whatever) that knows how to speak
whatever hook language we come up with, and converts those hooks into calls to
``setup.py``. This will probably require some sort of hooking or
monkeypatching to setuptools to provide a way to extract the
``setup_requires=`` argument when needed, and to provide a new version
of the sdist command that generates the new-style format. This all
seems doable and sufficient for a large proportion of packages (though
obviously we'll want to prototype such a system before we finalize
anything here). (Alternatively, these changes could be made to
setuptools itself rather than going into a separate package.)
But there remain two obstacles that mean we probably won't be able to
automatically upgrade packages to the new format:
1) There currently exist packages which insist on particular packages
being available in their environment before setup.py is executed. This
means that if we decide to execute build scripts in an isolated
virtualenv-like environment, then projects will need to check whether
they do this, and if so then when upgrading to the new system they
will have to start explicitly declaring these dependencies (either via
``setup_requires=`` or via static declaration in ``pypackage.cfg``).
2) There currently exist packages which do not declare consistent
metadata (e.g. ``egg_info`` and ``bdist_wheel`` might get different
``install_requires=``). When upgrading to the new system, projects
will have to evaluate whether this applies to them, and if so they
will need to stop doing that.
Copyright
=========
This document has been placed in the public domain.
--
Nathaniel J. Smith -- http://vorpus.org
Sometimes it might be desirable to do wheel-like installs without actually
creating an archive. Instead, a whim (wheel internal manifest) file could
communicate the idea of a wheel, just a bunch of files in categories, without
the zip file.
The format would be no more than a mapping of category names 'purelib',
'platlib', 'headers', 'scripts', 'data', to a list of tuples with the
file's path on the disk and its path relative to the category.
{ "category" : [ ('path on disk', 'path relative to category'), ... ] }
The dist-info directory could be segregated into its own category
'metadata' with each target path as "distname-1.0.dist-info/FILE" i.e. its
full path relative to the root of a wheel file.
An installer could consume a whim file directly, bypassing the zip. A wheel
archiver could consume a whim file and produce the archive with a correct MANIFEST.
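A small sketch of what such a manifest could look like in practice; the file
name, the JSON serialization, and the concrete paths are all just illustrative
choices, not part of the proposal:

    # Illustrative whim mapping for a hypothetical package "mypkg 1.0"
    import json

    whim = {
        "purelib": [("build/lib/mypkg/__init__.py", "mypkg/__init__.py")],
        "scripts": [("build/scripts/mytool", "mytool")],
        "metadata": [("build/METADATA", "mypkg-1.0.dist-info/METADATA")],
    }

    with open("mypkg-1.0.whim.json", "w") as f:
        json.dump(whim, f, indent=2)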
Since Nathaniel seems busy, I've taken the liberty of drafting a
narrow PEP based on the conversations that arose from the prior
discussion.
It (naturally) has my unique flavor, but builds on the work Nathaniel
had put together, so I've put his name as a co-author even though he
hasn't seen a word of it until now :) - all errors and mistakes are
therefore mine...
Current draft text in rendered form at:
https://gist.github.com/rbtcollins/666c12aec869237f7cf7
I've run it past Donald and he has a number of concerns - I think
we'll need to discuss them here, and possibly in another hangout, to
get a path forward.
Cheers,
Rob
--
Robert Collins <rbtcollins(a)hp.com>
Distinguished Technologist
HP Converged Cloud
Hi,
I just tried to run `pip install numpy` on my OS X 10.10.3 box, and it
proceeds to download and compile the tarball from PyPI from source (very
slow). I see, however, that pre-compiled OS X wheel files are available on
PyPI for OS X 10.6 and later.
Checking the code, it looks like pip is picking up the platform tag through
`distutils.util.get_platform()`, which returns 'macosx-10.5-x86_64' on this
machine. At root, I think this comes from the MACOSX_DEPLOYMENT_TARGET=10.5
entry in the Makefile at `python3.5/config-3.5m/Makefile`. I know that this
value is used by distutils compiling python extension modules -- presumably
so that they can be distributed to any target machine with OS X >=10.5 --
so that's good. But is this the right thing for pip to be using when
checking whether a binary wheel is compatible? I see it mentioned
<https://www.python.org/dev/peps/pep-0425/#id13> in PEP 425, so perhaps
this was already hashed out on the list.
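For reference, the values in question can be inspected directly; the outputs
shown in the comments are the ones reported above for this machine:

    import distutils.util
    import sysconfig

    print(distutils.util.get_platform())                          # 'macosx-10.5-x86_64'
    print(sysconfig.get_config_var('MACOSX_DEPLOYMENT_TARGET'))   # '10.5'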
Best,
Robert