This is PEP 427: Wheel. A binary package format for Python. Because newegg was taken.

Since the last submission, the signature specification has been made clearly optional / informative and does not attempt to specify any specific signing algorithm or how the signatures would fit into a security scheme that would necessarily exist outside of a single archive. JWS signatures are used in their current form by OpenID Connect and Mozilla Persona and are a useful way to implement basically raw public or secret key signatures. The embedded signature scheme in wheel should also not affect the current effort to define end-to-end security for PyPI in any way; it might be a useful complement for packages that are not hosted on PyPI at all.

This version also does not depend on the in-process Metadata PEP 426. It has been cleaned up in several places regarding which PEPs it references to describe its contents.

The WHEEL metadata file should contain all the information in the wheel filename itself, and the non-alphanumeric/unicode filename escaping rules are made official.

Thanks. For your consideration,

PEP: 427
Title: The Wheel Binary Package Format 0.1
Version: $Revision$
Last-Modified: $Date$
Author: Daniel Holth <dholth@fastmail.fm>
BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com>
Discussions-To: <distutils-sig@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 20-Sep-2012
Post-History: 18-Oct-2012, 15-Feb-2013

Abstract
========

This PEP describes a built-package format for Python called "wheel".

A wheel is a ZIP-format archive with a specially formatted file name and the ``.whl`` extension. It contains a single distribution nearly as it would be installed according to PEP 376 with a particular installation scheme.
Although a specialized installer is recommended, a wheel file may be installed by simply unpacking into site-packages with the standard 'unzip' tool while preserving enough information to spread its contents out onto their final paths at any later time.

Note
====

This draft PEP describes version 0.1 of the "wheel" format. When the PEP is accepted, the version will be changed to 1.0. (The major version is used to indicate potentially backwards-incompatible changes to the format.)

Rationale
=========

Python needs a package format that is easier to install than sdist. Python's sdist packages are defined by and require the distutils and setuptools build systems, running arbitrary code to build-and-install, and re-compile, code just so it can be installed into a new virtualenv. This system of conflating build-install is slow, hard to maintain, and hinders innovation in both build systems and installers.

Wheel attempts to remedy these problems by providing a simpler interface between the build system and the installer. The wheel binary package format frees installers from having to know about the build system, saves time by amortizing compile time over many installations, and removes the need to install a build system in the target environment.

Details
=======

Installing a wheel 'distribution-1.0-py32-none-any.whl'
-------------------------------------------------------

Wheel installation notionally consists of two phases:

- Unpack.

  a. Parse ``distribution-1.0.dist-info/WHEEL``.
  b. Check that installer is compatible with Wheel-Version. Warn if minor version is greater, abort if major version is greater.
  c. If Root-Is-Purelib == 'true', unpack archive into purelib (site-packages).
  d. Else unpack archive into platlib (site-packages).

- Spread.

  a. Unpacked archive includes ``distribution-1.0.dist-info/`` and (if there is data) ``distribution-1.0.data/``.
  b. Move each subtree of ``distribution-1.0.data/`` onto its destination path.
     Each subdirectory of ``distribution-1.0.data/`` is a key into a dict of destination directories, such as ``distribution-1.0.data/(purelib|platlib|headers|scripts|data)``. The initially supported paths are taken from ``distutils.command.install``.
  c. If applicable, update scripts starting with ``#!python`` to point to the correct interpreter.
  d. Update ``distribution-1.0.dist-info/RECORD`` with the installed paths.
  e. Remove empty ``distribution-1.0.data`` directory.
  f. Compile any installed .py to .pyc. (Uninstallers should be smart enough to remove .pyc even if it is not mentioned in RECORD.)

Recommended installer features
''''''''''''''''''''''''''''''

Rewrite ``#!python``.
    In wheel, scripts are packaged in ``{distribution}-{version}.data/scripts/``. If the first line of a file in ``scripts/`` starts with exactly ``b'#!python'``, rewrite to point to the correct interpreter. Unix installers may need to add the +x bit to these files if the archive was created on Windows.

Generate script wrappers.
    In wheel, scripts packaged on Unix systems will certainly not have accompanying .exe wrappers. Windows installers may want to add them during install.

File Format
-----------

File name convention
''''''''''''''''''''

The wheel filename is ``{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl``.

distribution
    Distribution name, e.g. 'django', 'pyramid'.

version
    Distribution version, e.g. 1.0.

build tag
    Optional build number. Must start with a digit. A tie breaker if two wheels have the same version. Sort as the empty string if unspecified, else sort the initial digits as a number, and the remainder lexicographically.

language implementation and version tag
    E.g. 'py27', 'py2', 'py3'.

abi tag
    E.g. 'cp33m', 'abi3', 'none'.

platform tag
    E.g. 'linux_x86_64', 'any'.
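As an illustration only, the file name convention above could be parsed roughly as follows. This is a sketch and not part of the specification: ``parse_wheel_filename`` and ``build_sort_key`` are hypothetical helper names, the regular expression assumes that escaped components never contain ``-``, and the empty tuple stands in for "sort as the empty string".

```python
import re

# Hypothetical pattern for the six filename components; the build tag is
# optional and, per the convention above, must start with a digit.
WHEEL_NAME_RE = re.compile(
    r"^(?P<distribution>[^-]+)-(?P<version>[^-]+)"
    r"(-(?P<build>\d[^-]*))?"
    r"-(?P<python>[^-]+)-(?P<abi>[^-]+)-(?P<platform>[^-]+)\.whl$"
)


def parse_wheel_filename(filename):
    """Split a wheel file name into its named components."""
    match = WHEEL_NAME_RE.match(filename)
    if match is None:
        raise ValueError("not a valid wheel file name: %r" % (filename,))
    return match.groupdict()


def build_sort_key(build):
    """Sort key for a build tag: leading digits numerically, rest as text.

    A missing build tag yields an empty tuple, which sorts before any
    tagged build.
    """
    if not build:
        return ()
    digits = re.match(r"\d+", build).group()
    return (int(digits), build[len(digits):])


parts = parse_wheel_filename("distribution-1.0-1-py27-none-any.whl")
print(parts["distribution"], parts["version"], parts["build"])
```

Splitting on ``-`` alone would be ambiguous because the build tag is optional, which is why the sketch relies on the leading-digit rule to tell a build tag from a python tag.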
For example, ``distribution-1.0-1-py27-none-any.whl`` is the first build of a package called 'distribution', and is compatible with Python 2.7 (any Python 2.7 implementation), with no ABI (pure Python), on any CPU architecture.

The last three components of the filename before the extension are called "compatibility tags." The compatibility tags express the package's basic interpreter requirements and are detailed in PEP 425.

Escaping and Unicode
''''''''''''''''''''

Each component of the filename is escaped by replacing runs of non-alphanumeric characters with an underscore ``_``::

    re.sub(r"[^\w\d.]+", "_", distribution, flags=re.UNICODE)

The filename is Unicode. It will be some time before the tools are updated to support non-ASCII filenames, but they are supported in this specification.

File contents
'''''''''''''

The contents of a wheel file, where {distribution} is replaced with the name of the package, e.g. ``beaglevote`` and {version} is replaced with its version, e.g. ``1.0.0``, consist of:

#. ``/``, the root of the archive, contains all files to be installed in ``purelib`` or ``platlib`` as specified in ``WHEEL``. ``purelib`` and ``platlib`` are usually both ``site-packages``.

#. ``{distribution}-{version}.dist-info/`` contains metadata.

#. ``{distribution}-{version}.data/`` contains one subdirectory for each non-empty install scheme key not already covered, where the subdirectory name is an index into a dictionary of install paths (e.g. ``data``, ``scripts``, ``headers``, ``purelib``, ``platlib``).

#. Python scripts must appear in ``scripts`` and begin with exactly ``b'#!python'`` in order to enjoy script wrapper generation and ``#!python`` rewriting at install time. They may have any or no extension.

#. ``{distribution}-{version}.dist-info/METADATA`` is Metadata version 1.1 or greater (PEP 314, PEP 345, PEP 426) format metadata.

#. ``{distribution}-{version}.dist-info/WHEEL`` is metadata about the archive itself in the same basic key: value format::

       Wheel-Version: 0.1
       Generator: bdist_wheel 0.7
       Root-Is-Purelib: true
       Tag: py2-none-any
       Tag: py3-none-any
       Build: 1

#. ``Wheel-Version`` is the version number of the Wheel specification. ``Generator`` is the name and optionally the version of the software that produced the archive. ``Root-Is-Purelib`` is true if the top level directory of the archive should be installed into purelib; otherwise the root should be installed into platlib. ``Tag`` is the wheel's expanded compatibility tags; in the example the filename would contain ``py2.py3-none-any``. ``Build`` is the build number and is omitted if there is no build number.

#. A wheel installer should warn if Wheel-Version is greater than the version it supports, and must fail if Wheel-Version has a greater major version than the version it supports.

#. Wheel, being an installation format that is intended to work across multiple versions of Python, does not generally include .pyc files.

#. Wheel does not contain setup.py or setup.cfg.

This version of the wheel specification is based on the distutils install schemes and does not define how to install files to other locations. The layout offers a superset of the functionality provided by the existing wininst and egg binary formats.

The .dist-info directory
^^^^^^^^^^^^^^^^^^^^^^^^

#. Wheel .dist-info directories include at a minimum METADATA, WHEEL, and RECORD.

#. METADATA is the package metadata, the same format as PKG-INFO as found at the root of sdists.

#. WHEEL is the wheel metadata specific to a build of the package.

#. RECORD is a list of (almost) all the files in the wheel and their secure hashes. Unlike PEP 376, every file except RECORD, which cannot contain a hash of itself, must include its hash.
   The hash algorithm must be sha256 or better; specifically, md5 and sha1 are not permitted, as signed wheel files rely on the strong hashes in RECORD to validate the integrity of the archive.

#. PEP 376's INSTALLER and REQUESTED are not included in the archive.

#. RECORD.jws is used for digital signatures. It is not mentioned in RECORD.

#. RECORD.p7s is allowed as a courtesy to anyone who would prefer to use S/MIME signatures to secure their wheel files. It is not mentioned in RECORD.

#. During extraction, wheel installers verify all the hashes in RECORD against the file contents. Apart from RECORD and its signatures, installation will fail if any file in the archive is not both mentioned and correctly hashed in RECORD.

The .data directory
^^^^^^^^^^^^^^^^^^^

Any file that is not normally installed inside site-packages goes into the .data directory, named as the .dist-info directory but with the .data/ extension::

    distribution-1.0.dist-info/

    distribution-1.0.data/

The .data directory contains subdirectories with the scripts, headers, documentation and so forth from the distribution. During installation the contents of these subdirectories are moved onto their destination paths.

Signed wheel files
------------------

Wheel files include an extended RECORD that enables digital signatures. PEP 376's RECORD is altered to include a secure hash ``digestname=urlsafe_b64encode_nopad(digest)`` (urlsafe base64 encoding with no trailing = characters) as the second column instead of an md5sum. All possible entries are hashed, including any generated files such as .pyc files, but not RECORD which cannot contain its own hash. For example::

    file.py,sha256=AVTFPZpEKzuHr7OvQZmhaU3LvwKz06AJw8mT_pNh2yI,3144
    distribution-1.0.dist-info/RECORD,,

The signature file(s) RECORD.jws and RECORD.p7s are not mentioned in RECORD at all since they can only be added after RECORD is generated. Every other file in the archive must have a correct hash in RECORD or the installation will fail.
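The RECORD format described above might be produced and checked along these lines. This is a minimal sketch under stated assumptions: ``record_line`` and ``verify_record_line`` are hypothetical names, "sha256 or better" is read here as the set ``sha256``/``sha384``/``sha512``, and paths are assumed not to need CSV quoting when a line is written.

```python
import base64
import csv
import hashlib

# Assumption: one possible reading of "sha256 or better".
ALLOWED_ALGORITHMS = {"sha256", "sha384", "sha512"}


def urlsafe_b64encode_nopad(data):
    """urlsafe base64 with the trailing '=' padding stripped."""
    return base64.urlsafe_b64encode(data).rstrip(b"=")


def record_line(path, content):
    """Build one RECORD row: path, digestname=digest, size in bytes."""
    digest = hashlib.sha256(content).digest()
    encoded = urlsafe_b64encode_nopad(digest).decode("ascii")
    return "{0},sha256={1},{2}".format(path, encoded, len(content))


def verify_record_line(line, content):
    """Return True if content matches the hash and size in a RECORD row.

    Rows with an empty hash (RECORD's own entry) or with md5/sha1 are
    rejected here; a caller would skip RECORD itself before verifying.
    """
    path, hash_field, size = next(csv.reader([line]))
    algorithm, _, expected = hash_field.partition("=")
    if algorithm not in ALLOWED_ALGORITHMS:
        return False
    digest = hashlib.new(algorithm, content).digest()
    actual = urlsafe_b64encode_nopad(digest).decode("ascii")
    return actual == expected and int(size) == len(content)


content = b"print('hi')\n"
line = record_line("file.py", content)
print(line)
print(verify_record_line(line, content))
```

Because the check only needs the file bytes and one RECORD row, individual files can be verified without reading the rest of the archive.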
If JSON web signatures are used, one or more JSON Web Signature JSON Serialization (JWS-JS) signatures are stored in a file RECORD.jws adjacent to RECORD. JWS is used to sign RECORD by including the SHA-256 hash of RECORD as the signature's JSON payload::

    { "hash": "sha256=ADD-r2urObZHcxBW3Cr-vDCu5RJwT4CaRTHiFmbcIYY" }

If RECORD.p7s is used, it must contain a detached S/MIME format signature of RECORD.

A wheel installer is not required to understand digital signatures but MUST verify the hashes in RECORD against the extracted file contents. When the installer checks file hashes against RECORD, a separate signature checker only needs to establish that RECORD matches the signature.

See

- http://self-issued.info/docs/draft-ietf-jose-json-web-signature.html
- http://self-issued.info/docs/draft-jones-jose-jws-json-serialization.html
- http://self-issued.info/docs/draft-ietf-jose-json-web-key.html
- http://self-issued.info/docs/draft-jones-jose-json-private-key.html

Comparison to .egg
------------------

#. Wheel is an installation format; egg is importable. Wheel archives do not need to include .pyc and are less tied to a specific Python version or implementation. Wheel can install (pure Python) packages built with previous versions of Python so you don't always have to wait for the packager to catch up.

#. Wheel uses .dist-info directories; egg uses .egg-info. Wheel is compatible with the new world of Python packaging and the new concepts it brings.

#. Wheel has a richer file naming convention for today's multi-implementation world. A single wheel archive can indicate its compatibility with a number of Python language versions and implementations, ABIs, and system architectures. Historically the ABI has been specific to a CPython release; wheel is ready for the stable ABI.

#. Wheel is lossless. The first wheel implementation bdist_wheel always generates egg-info, and then converts it to a .whl. It is also possible to convert existing eggs and bdist_wininst distributions.
#. Wheel is versioned. Every wheel file contains the version of the wheel specification and the implementation that packaged it. Hopefully the next migration can simply be to Wheel 2.0.

#. Wheel is a reference to the other Python.

FAQ
===

Wheel defines a .data directory. Should I put all my data there?
    This specification does not have an opinion on how you should organize your code. The .data directory is just a place for any files that are not normally installed inside ``site-packages`` or on the PYTHONPATH. In other words, you may continue to use ``pkgutil.get_data(package, resource)`` even though *those* files will usually not be distributed in *wheel's* ``.data`` directory.

Why does wheel include attached signatures?
    Attached signatures are more convenient than detached signatures because they travel with the archive. Since only the individual files are signed, the archive can be recompressed without invalidating the signature or individual files can be verified without having to download the whole archive.

Why does wheel allow JWS signatures?
    The JOSE specifications of which JWS is a part are designed to be easy to implement, a feature that is also one of wheel's primary design goals. JWS yields a useful, concise pure-Python implementation.

Why does wheel also allow S/MIME signatures?
    S/MIME signatures are allowed for users who need or want to use existing public key infrastructure with wheel.

Signed packages are only a basic building block in a secure package update system. Wheel only provides the building block.

Appendix
========

Example urlsafe-base64-nopad implementation::

    # urlsafe-base64-nopad for Python 3
    import base64

    def urlsafe_b64encode_nopad(data):
        # Strip the trailing '=' padding after encoding.
        return base64.urlsafe_b64encode(data).rstrip(b'=')

    def urlsafe_b64decode_nopad(data):
        # Restore only as much '=' padding as the length requires (0-3).
        pad = b'=' * (-len(data) % 4)
        return base64.urlsafe_b64decode(data + pad)

Copyright
=========

This document has been placed into the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:
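As a rough illustration of the WHEEL file and the Wheel-Version rule described in the Details section (warn on a greater version, abort on a greater major version), an installer might do something like the following. This is only a sketch: ``parse_wheel_metadata``, ``expand_tags`` and ``check_wheel_version`` are hypothetical names, and using the standard library's ``email.parser`` for the key: value format is an assumption made for brevity, not something the draft prescribes.

```python
import itertools
from email.parser import Parser

# The example WHEEL file from the draft, used as sample input.
WHEEL_TEXT = (
    "Wheel-Version: 0.1\n"
    "Generator: bdist_wheel 0.7\n"
    "Root-Is-Purelib: true\n"
    "Tag: py2-none-any\n"
    "Tag: py3-none-any\n"
    "Build: 1\n"
)


def parse_wheel_metadata(text):
    """Parse the key: value WHEEL file; Tag may appear more than once."""
    message = Parser().parsestr(text)
    version = tuple(int(p) for p in message["Wheel-Version"].split("."))
    return {
        "version": version,
        "generator": message["Generator"],
        "root_is_purelib": message["Root-Is-Purelib"] == "true",
        "tags": message.get_all("Tag", []),
        "build": message["Build"],  # None when there is no build number
    }


def expand_tags(compressed):
    """Expand a filename tag set such as 'py2.py3-none-any'."""
    pythons, abis, platforms = (part.split(".") for part in compressed.split("-"))
    return ["-".join(t) for t in itertools.product(pythons, abis, platforms)]


def check_wheel_version(wheel_version, supported):
    """Return 'ok' or 'warn'; raise if the major version is too new."""
    if wheel_version[0] > supported[0]:
        raise ValueError("unsupported Wheel-Version %r" % (wheel_version,))
    return "warn" if wheel_version > supported else "ok"


meta = parse_wheel_metadata(WHEEL_TEXT)
print(meta["tags"] == expand_tags("py2.py3-none-any"))
print(check_wheel_version(meta["version"], supported=(1, 0)))
```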
On Sat, Feb 16, 2013 at 2:21 PM, Daniel Holth <dholth@gmail.com> wrote:
#. Python scripts must appear in ``scripts`` and begin with exactly ``b'#!python'`` in order to enjoy script wrapper generation and ``#!python`` rewriting at install time. They may have any or no extension.
For compatibility with file encoding declarations, I believe this needs to be relaxed to starting with '#!python' in the source file encoding, rather than strictly b'#!python' (which will only be the case for ASCII compatible encodings). My rationale is that installers are going to need to read the source file encoding for the scripts anyway, otherwise they may write the shebang line back out with the wrong encoding, potentially leading to decoding errors when attempting to run the script.
#. ``{distribution}-{version}.dist-info/METADATA`` is Metadata version 1.1 or greater (PEP 314, PEP 345, PEP 426) format metadata.
I suggest removing the PEP references here and simply saying "is Metadata version 1.1 or greater format metadata"
#. ``Wheel-Version`` is the version number of the Wheel specification. ``Generator`` is the name and optionally the version of the software that produced the archive. ``Root-Is-Purelib`` is true if the top level directory of the archive should be installed into purelib; otherwise the root should be installed into platlib. ``Tag`` is the wheel's expanded compatibility tags; in the example the filename would contain ``py2.py3-none-any``. ``Build`` is the build number and is omitted if there is no build number.
I suggest breaking these out into separate bullet points (they're a bit hard to read as they stand).

Aside from those minor issues, the current version of the spec looks fine to me - upload those fixes and I will accept it. If we later need to define wheel 1.1 or 2.0 to handle additional situations, well, that's why it's a versioned format :)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sat, 16 Feb 2013 19:18:22 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
On Sat, Feb 16, 2013 at 2:21 PM, Daniel Holth <dholth@gmail.com> wrote:
#. Python scripts must appear in ``scripts`` and begin with exactly ``b'#!python'`` in order to enjoy script wrapper generation and ``#!python`` rewriting at install time. They may have any or no extension.
For compatibility with file encoding declarations, I believe this needs to be relaxed to starting with '#!python' in the source file encoding, rather than strictly b'#!python' (which will only be the case for ASCII compatible encodings).
I may be wrong, but I am not aware that Python is able to read encoding declarations in a non-ASCII compatible encoding :-)

Also, given the shebang line is for use by the OS, the appropriate encoding should be decided by the OS, not Python conventions. But I would be surprised if a shebang-compatible OS used a non-ASCII encoding by default.

Regards

Antoine.
On Sat, 16 Feb 2013 11:12:49 +0100 Antoine Pitrou <solipsis@pitrou.net> wrote:
On Sat, 16 Feb 2013 19:18:22 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
On Sat, Feb 16, 2013 at 2:21 PM, Daniel Holth <dholth@gmail.com> wrote:
#. Python scripts must appear in ``scripts`` and begin with exactly ``b'#!python'`` in order to enjoy script wrapper generation and ``#!python`` rewriting at install time. They may have any or no extension.
For compatibility with file encoding declarations, I believe this needs to be relaxed to starting with '#!python' in the source file encoding, rather than strictly b'#!python' (which will only be the case for ASCII compatible encodings).
I may be wrong, but I am not aware that Python is able to read encoding declarations in a non-ASCII compatible encoding :-)
Also, given the shebang line is for use by the OS, the appropriate encoding should be decided by the OS, not Python conventions. But I would be surprised if a shebang-compatible OS used a non-ASCII encoding by default.
I mean non-ASCII compatible. Regards Antoine.
On Sat, Feb 16, 2013 at 8:12 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
I may be wrong, but I am not aware that Python is able to read encoding declarations in a non-ASCII compatible encoding :-)
Also, given the shebang line is for use by the OS, the appropriate encoding should be decided by the OS, not Python conventions. But I would be surprised if a shebang-compatible OS used a non-ASCII encoding by default.
Oh, good point - which means the only comments are cosmetic ones, which I can fix while marking it accepted. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Nick Coghlan writes:
For compatibility with file encoding declarations, I believe this needs to be relaxed to starting with '#!python' in the source file encoding, rather than strictly b'#!python' (which will only be the case for ASCII compatible encodings).
In any PEP-263-compatible encoding it will be b'#!python'. Relaxing this is excessive generality for a new feature. I'm not sure what you mean by file encoding declarations if not PEP 263, which requires approximate[1] ASCII compatibility. PEP 3120 simply builds on PEP 263 by making UTF-8, rather than ISO 8859/1, the default encoding.
My rationale is that installers are going to need to read the source file encoding for the scripts anyway, otherwise they may write the shebang line back out with the wrong encoding, potentially leading to decoding errors when attempting to run the script.
Too bad if there's no PEP 263 declaration and the file is not in ASCII. I.e., the intersection of Python 2 and Python 3 default encodings.

Footnotes:
[1] I.e., Shift JIS and Big 5, or any encoding in which a pure ASCII string can be interpreted as a string in that encoding, are OK, but UTF-16 is not.
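To make the point about ASCII-compatible encodings concrete, here is a small sketch. It checks that ``'#!python'`` encodes to exactly ``b'#!python'`` in a few encodings and rewrites the first line at the byte level. The helper names and the interpreter path are hypothetical, and keeping whatever follows ``#!python`` on the first line is an assumption rather than something the draft specifies.

```python
import os

SHEBANG_MARKER = b"#!python"


def is_ascii_compatible(encoding):
    """True if '#!python' in this encoding is exactly b'#!python'."""
    return "#!python".encode(encoding) == SHEBANG_MARKER


def rewrite_shebang(script, interpreter):
    """Replace a leading b'#!python' with '#!' plus the interpreter path.

    Works on bytes only, so the rest of the script is never decoded or
    re-encoded. Scripts without the marker are returned unchanged.
    """
    if not script.startswith(SHEBANG_MARKER):
        return script
    first_line, newline, rest = script.partition(b"\n")
    trailing = first_line[len(SHEBANG_MARKER):]  # assumption: keep any remainder
    return b"#!" + os.fsencode(interpreter) + trailing + newline + rest


for name in ("utf-8", "latin-1", "shift_jis", "big5", "utf-16"):
    print(name, is_ascii_compatible(name))

# "/usr/bin/python3" is only an example path.
print(rewrite_shebang(b"#!python\nprint(1)\n", "/usr/bin/python3"))
```

Since the marker is matched and replaced as bytes, the installer does not need to know the script's source encoding to rewrite the first line.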
Since Antoine and Stephen have pointed out my only non-cosmetic concern was an error on my part, I am accepting the PEP. I'll update the peps repo (including the cosmetic fixes) in a moment. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sat, Feb 16, 2013 at 9:17 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Since Antoine and Stephen have pointed out my only non-cosmetic concern was an error on my part, I am accepting the PEP. I'll update the peps repo (including the cosmetic fixes) in a moment.
And done: http://hg.python.org/peps/rev/d272d7a97e0c Thank you to Daniel for his hard work on getting this through to completion. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sat, 16 Feb 2013 21:27:28 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
On Sat, Feb 16, 2013 at 9:17 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Since Antoine and Stephen have pointed out my only non-cosmetic concern was an error on my part, I am accepting the PEP. I'll update the peps repo (including the cosmetic fixes) in a moment.
And done: http://hg.python.org/peps/rev/d272d7a97e0c
Thank you to Daniel for his hard work on getting this through to completion.
Great! Here's hoping for an improved Python 3.4 distutils experience :-)

Regards

Antoine.
Participants (4): Antoine Pitrou, Daniel Holth, Nick Coghlan, Stephen J. Turnbull