[Python-Dev] PEP 441 - Improving Python ZIP Application Support

Paul Moore p.f.moore at gmail.com
Tue Feb 17 00:21:55 CET 2015


On 15 February 2015 at 17:46, Daniel Holth <dholth at gmail.com> wrote:
> Go ahead, make my pep.
>
> I will appreciate seeing it happen.

Here is a draft update for PEP 441. It's still a work in progress - in
particular I want to wait for consensus on the issue of the default
interpreter before finalising it. But I thought it would be worth
having a full spec available for people.

PEP: 441
Title: Improving Python ZIP Application Support
Version: $Revision$
Last-Modified: $Date$
Author: Daniel Holth <dholth at gmail.com>,
        Paul Moore <p.f.moore at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 30 March 2013
Post-History: 30 March 2013, 1 April 2013, 16 February 2015

Improving Python ZIP Application Support
========================================

Python has had the ability to execute directories or ZIP-format
archives as scripts since version 2.6 [1]_.  When invoked with a zip
file or directory as its first argument, the interpreter adds that
file or directory to sys.path and executes the __main__ module.  These
archives provide a great way to publish software that needs to be
distributed as a single-file script but is complex enough to need to
be written as a collection of modules.

This feature is not as popular as it should be, for two reasons.  It
was not promoted as part of Python 2.6 [2]_, so it is relatively
unknown, and the Windows installer does not register a file extension
(other than .py) for this format of file, to associate with the
launcher.

This PEP proposes to fix these problems by re-publicising the feature,
defining the .pyz and .pyzw extensions as “Python ZIP Applications”
and “Windowed Python ZIP Applications”, and providing some simple
tooling to manage the format.

A New Python ZIP Application Extension
======================================

The Python 3.5 installer will associate .pyz and .pyzw “Python ZIP
Applications” with the platform launcher so they can be executed.  A
.pyz archive is a console application and a .pyzw archive is a
windowed application, indicating whether the console should appear
when running the app.

For UNIX users, .pyz applications should typically be prefixed with a
#! line pointing to the correct Python interpreter and an optional
explanation::

    #!/usr/bin/env python3
    #  Python application packed with zipapp module
    (binary contents of archive)

However, it is always possible to execute a .pyz application by
supplying the filename to the Python interpreter directly.
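
As a minimal sketch of running an archive by handing its filename to
the interpreter, the snippet below builds a throwaway archive with the
standard zipfile module and runs it in a subprocess.  The helper name
``run_archive`` and the file name ``hello.pyz`` are illustrative
assumptions, not part of this proposal.

```python
import os
import subprocess
import sys
import tempfile
import zipfile


def run_archive(path):
    """Run a zip application by passing its filename to the interpreter."""
    result = subprocess.run(
        [sys.executable, path], capture_output=True, text=True, check=True
    )
    return result.stdout


# Build a throwaway archive containing only a __main__.py (illustrative).
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "hello.pyz")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("__main__.py", "print('hello from a zip application')\n")

# Equivalent to typing "python hello.pyz" at a shell prompt.
output = run_archive(archive)
```

No shebang line or file association is involved here, so the same call
should behave the same way on Windows and Unix.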

As background, ZIP archives are defined with a footer containing
relative offsets from the end of the file.  They remain valid when
concatenated to the end of any other file.  This feature is completely
standard and is how self-extracting ZIP archives and the bdist_wininst
installer format work.
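
The effect of prepended data can be checked with the standard zipfile
module.  The following sketch builds an archive in memory, prepends a
shebang line, and reads the result back; the variable names are
illustrative only.

```python
import io
import zipfile

# Build a small ZIP archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("__main__.py", "print('hello')\n")

# Prepend a shebang line to the raw archive bytes.
shebang = b"#!/usr/bin/env python3\n"
data = shebang + buf.getvalue()

# zipfile locates the directory from the end of the file, so the
# prepended line does not stop it from reading the members.
with zipfile.ZipFile(io.BytesIO(data)) as zf:
    names = zf.namelist()
    source = zf.read("__main__.py")
```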

Minimal Tooling: The zipapp Module
==================================

This PEP also proposes including a module for working with these
archives.  The module will contain functions for working with Python
zip application archives, and a command line interface (via ``python
-m zipapp``) for their creation and manipulation.

Module Interface
----------------

The zipapp module will provide the following functions:

``pack(target, directory, interpreter=None, main=None)``

Writes an application archive called ``target``, containing the
contents of ``directory``.  If ``interpreter`` is specified, it will
be written to the start of the archive as a shebang line and the file
will be made executable (if no interpreter is specified, the shebang
line will be omitted).  If the directory contains no ``__main__.py``
file, the function will construct a ``__main__.py`` which calls the
function specified in the ``main`` argument (which should be in the
form ``"pkg.mod:fn"``).

It is an error to specify ``main`` if the directory contains a
``__main__.py``, or to omit ``main`` when there is no ``__main__.py``
(as that will result in an archive which has no main function and so
cannot be executed).
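
To make the description concrete, here is a rough sketch of how
``pack`` could be written on top of zipfile.  It is an illustration of
the behaviour described above under simplifying assumptions (UTF-8
shebang, no compression, a two-line generated ``__main__.py``), not
the proposed implementation; ``MAIN_TEMPLATE`` is a made-up name.

```python
import os
import stat
import zipfile

# Hypothetical template for the generated __main__.py.
MAIN_TEMPLATE = "import {module}\n{module}.{fn}()\n"


def pack(target, directory, interpreter=None, main=None):
    """Sketch of the draft pack() behaviour described in the text."""
    has_main = os.path.isfile(os.path.join(directory, "__main__.py"))
    if main and has_main:
        raise ValueError("cannot specify main when __main__.py exists")
    if not main and not has_main:
        raise ValueError("archive would have no __main__.py")

    with open(target, "wb") as f:
        # The optional shebang line goes before the ZIP data.
        if interpreter:
            f.write(b"#!" + interpreter.encode("utf-8") + b"\n")
        with zipfile.ZipFile(f, "w") as zf:
            for root, _dirs, files in os.walk(directory):
                for name in files:
                    path = os.path.join(root, name)
                    zf.write(path, os.path.relpath(path, directory))
            if main:
                module, _, fn = main.partition(":")
                zf.writestr(
                    "__main__.py",
                    MAIN_TEMPLATE.format(module=module, fn=fn),
                )

    # Mark the file executable only when it carries a shebang line.
    if interpreter:
        os.chmod(target, os.stat(target).st_mode | stat.S_IEXEC)
```

The sketch assumes ``target`` lies outside ``directory``; otherwise
the partially written archive would be picked up by the directory
walk.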

``get_interpreter(archive)``

Returns the interpreter specified in the shebang line of the archive.
If there is no shebang, the function returns None.

``set_interpreter(archive, new_archive, interpreter=None)``

Modifies the archive's shebang line to contain the specified
interpreter, and writes the updated archive to new_archive.  If the
interpreter is None, removes the shebang line.
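
The two shebang functions can be sketched in a few lines.  This is
only an illustration of the behaviour described above, assuming a
UTF-8 shebang line; it relies on ZIP readers finding the archive
directory from the end of the file, which Python's zipfile does.

```python
import shutil


def get_interpreter(archive):
    """Return the interpreter named in the shebang line, or None."""
    with open(archive, "rb") as f:
        if f.read(2) == b"#!":
            return f.readline().strip().decode("utf-8")
    return None


def set_interpreter(archive, new_archive, interpreter=None):
    """Copy archive to new_archive with a new (or no) shebang line."""
    with open(archive, "rb") as src:
        if src.read(2) == b"#!":
            src.readline()  # skip the rest of the old shebang line
        else:
            src.seek(0)  # no shebang: copy from the very start
        with open(new_archive, "wb") as dst:
            if interpreter:
                dst.write(b"#!" + interpreter.encode("utf-8") + b"\n")
            # Everything after the old shebang is copied unchanged.
            shutil.copyfileobj(src, dst)
```

Writing to a separate ``new_archive`` rather than editing in place
means a failure part-way through leaves the original file untouched.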

Command Line Usage
------------------

The zipapp module can be run with the python -m flag.  The command
line interface is as follows::

    python -m zipapp [options] directory

        Create an archive from the contents of the given directory. By
        default, an archive will be created with the same name as the
        source directory, with a .pyz extension.

        The following options can be specified:

        -o archive

            The destination archive will have the specified name.

        -p interpreter

            The given interpreter will be written to the shebang line
            of the archive. If this option is not given, the archive
            will have no shebang line.

        -m pkg.mod:fn

            The source directory must not have a __main__.py file. The
            archiver will write a __main__.py file into the target
            which calls fn from the module pkg.mod.

The behaviour of the command line interface matches that of
``zipapp.pack()``.
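
One possible way to parse the options listed above is with argparse.
The sketch below only handles argument parsing and the default archive
name; the function name ``parse_args`` and the attribute names are
assumptions made for illustration.

```python
import argparse
import os


def parse_args(argv):
    """Parse the draft command line: [-o archive] [-p interp] [-m main] dir."""
    parser = argparse.ArgumentParser(prog="python -m zipapp")
    parser.add_argument("-o", dest="output", metavar="archive", default=None)
    parser.add_argument("-p", dest="interpreter", metavar="interpreter",
                        default=None)
    parser.add_argument("-m", dest="main", metavar="pkg.mod:fn", default=None)
    parser.add_argument("directory")
    args = parser.parse_args(argv)
    # Default target: the source directory name with a .pyz extension.
    if args.output is None:
        args.output = os.path.normpath(args.directory) + ".pyz"
    return args


args = parse_args(["-p", "/usr/bin/env python3", "myapp"])
```

The parsed values map directly onto the ``target``, ``directory``,
``interpreter`` and ``main`` arguments described for ``pack``.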

As noted, the archives are standard zip files, and so can be unpacked
using any standard ZIP utility or Python’s zipfile module.

FAQ
---

Are you sure a standard ZIP utility can handle #! at the beginning?
    Absolutely.  The zipfile specification allows for arbitrary data to
    be prepended to a zipfile.  This feature is commonly used by
    "self-extracting zip" programs.  If your archive program can't
    handle this, it is a bug in your archive program.

Isn’t zipapp just a very thin wrapper over the zipfile module?
    Yes.  If you prefer to build your own Python zip application
    archives using other tools, they will work just as well.  The
    zipapp module is a convenience, nothing more.

Why not just use a .zip or .py extension?
    Users expect a .zip file to be opened with an archive tool, and
    expect a .py file to contain readable text.  Both would be
    confusing for this use case.

How does this compete with existing package formats?
    The sdist, bdist and wheel formats are designed for packaging of
    modules to be installed into an existing Python installation.
    They are not intended to be used without installing.  The
    executable zip format is specifically designed for standalone use,
    without needing to be installed.  Such archives are in effect a
    multi-file version of a standalone Python script.

Rejected Proposals
==================

Convenience Values for Shebang Lines
------------------------------------

Is it worth having "convenience" forms for any of the common
interpreter values? For example, ``-p 3`` meaning the same as ``-p
"/usr/bin/env python3"``.  It would save a lot of typing for the
common cases, as well as giving cross-platform options for people who
don't want or need to understand the intricacies of shebang handling
on "other" platforms.

The downside is that it's not obvious how to translate the
abbreviations.  For example, should "3" mean "/usr/bin/env python3",
"/usr/bin/python3", "python3", or something else?  Also, there is no
obvious short form for the key case of "/usr/bin/env python" (any
available version of Python), which could easily result in scripts
being written with overly-restrictive shebang lines.

Overall, this proposal seems to have more problems than benefits, and
as a result it has been dropped from consideration.

Open Questions
==============

Default Interpreter
-------------------

The initial draft of this PEP proposed using ``/usr/bin/env python``
as the default interpreter.  Unix users have problems with this
behaviour, as the default for the python command on many distributions
is Python 2, and it is felt that this PEP should prefer Python 3 by
default.  However, using a command of ``python3`` can result in
unexpected behaviour for Windows users, where the default behaviour of
the launcher for the command "python" is commonly customised by users,
but the behaviour of "python3" may not be modified to match.

Currently, the principle "in the face of ambiguity, refuse to guess"
has been invoked, and archives have no shebang line unless explicitly
requested.  On Windows, the archives will still be run (with the
default Python) by the launcher, and on Unix, the archives can be run
by explicitly invoking the desired Python interpreter.

This issue is currently under active discussion on python-dev, and the
results will be reflected here when consensus has been reached.

Command Line Tool to Manage Shebang Lines
-----------------------------------------

It is conceivable that users would want to modify the shebang line for
an existing archive, or even just display the current shebang line.
This is tricky to do with existing tools (zip programs typically
ignore prepended data totally, and text editors can have trouble
editing files containing binary data).

The zipapp module provides functions to handle the shebang line, but
should these be exposed via the command line interface?

At the moment, the PEP proposes *not* to provide a command line
interface for these functions, as it is not clear how to provide one
without the resulting interface being over-complex and potentially
confusing.

References
==========

.. [1]  “Allow interpreter to execute a zip file”
   (http://bugs.python.org/issue1739468)

.. [2] “Feature is not documented”
   (http://bugs.python.org/issue17359)

Copyright
=========

This document has been placed into the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

