[Distutils] conventions or best practice to choose package names?

Benoît Bryon benoit at marmelune.net
Thu May 31 00:10:44 CEST 2012


On 14/05/2012 13:12, Jim Fulton wrote:
> +1 for an official document (or addition to an existing document)
>     providing a rationale for namespace packages and their naming.
I opened a ticket on CPython issue tracker:

Then started to work in a fork:

.. but it looks like a PEP. So, I followed PEP 1
and posted the proposal to python-list at python.org...
where I've been told to post it to distutils-sig :)
So, I'm back here with a proposal...

The document below is the same as current version of

Thanks to Martin Aspeli for his article.
Thanks to early reviewers: Alexis Métaireau, Éric Bréhault,
Jean-Philippe Camguilhem and Mathieu Leplâtre.


Names in packaging: conventions and recipes

This document deals with:

* names of Python projects,
* names of distributions in projects,
* names of Python packages or modules being distributed,
* namespace packages.

It provides conventions and recipes for distribution authors.

Main use case is:

* as a developer, I want to create a project in order to distribute a package.
   So I have to choose names.  Which names are "good"?

* `The Zen of Python`_ says:

     In the face of ambiguity, refuse the temptation to guess.
     There should be one-- and preferably only one --obvious way to do it.

* So I need clear and official (i.e. obvious) guidelines or conventions that I
   can follow.

* Here are conventions, guidelines and recipes.

Guidelines for existing projects are also given.


First of all, let's make sure there is no confusion...

Distribution name
   Distribution name is used by packaging utilities:

   * in :doc:`setup script</packaging/setupscript>`, it is the value passed as
     ``name`` to ``packaging.core.setup()``
   * it appears on `PyPI`_ if the package is registered on it
   * it can be used in `pip` requirements files
   * it can be used in `buildout` configuration files.

   Distribution term is introduced in :doc:`packaging docs</packaging/index>`.

Egg name
   It is the same concept as distribution name. Technically, the egg is not the
   distribution. But they use the same name: it is declared as
   ``packaging.core.setup()`` name argument.

   "Egg" term is the legacy counterpart to "distribution". It was used by
   distutils and setuptools. It becomes deprecated with the new packaging

   This document focuses on distributions, not eggs.

Package and module names
   Package and module names are used in Python code. It is the string used in
   :ref:`import statements<import>`.

   Remember that, from a file system perspective, packages are directories and
   modules are files.

   :ref:`Python packaging allows distributions to distribute several packages
   and/or modules<setupcfg-section-files>`.

Project name
   Usually the name of the repository or folder in which distribution authors put
   their code. It generally is the directory name of the "distribution root",
   as defined in :ref:`packaging-term`.

Namespace packages
   It is common practice to use namespaces in package names. `PEP 420`_ brings
   this concept to core Python. Once PEP 420 is accepted, Python will officially
   support namespace packages.

As an example, consider `django-debug-toolbar`_:

* ``django-debug-toolbar`` is the distribution name. It is declared in the
   ``setup.py`` file.

* ``debug_toolbar`` is the package name. It is what would appear in Django's
   INSTALLED_APPS setting or be used as ``import debug_toolbar``.

Technically, all those names can be different.
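As a minimal sketch of how these names can diverge, the keyword arguments below mirror what a setup script for this example might pass to ``setup()``. The ``SETUP_KWARGS`` name and the ``version`` value are placeholders chosen for illustration; only the distribution and package names come from the example above.

```python
# Keyword arguments a setup script might pass to setup().
# The version is a placeholder; only the two names come from the text above.
SETUP_KWARGS = {
    "name": "django-debug-toolbar",   # distribution name: what you install
    "version": "0.0.0",               # placeholder value
    "packages": ["debug_toolbar"],    # package name: what you import
}

distribution_name = SETUP_KWARGS["name"]
package_names = SETUP_KWARGS["packages"]

# Nothing forces the distribution name to match any package name.
names_match = distribution_name in package_names
print(names_match)
```

Here the distribution name and the package name differ, which is exactly the situation the conventions below try to make rare.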


Relationship with other PEPs

* `PEP 8`_ deals with code style guide, including names of Python packages and
   modules. It covers syntax of package/modules names.

* `PEP 345`_ deals with packaging metadata, and defines distribution name.

* `PEP 420`_ deals with namespace packages. It brings support of namespace
   packages to Python core. Before, namespaces packages were implemented by
   external libraries.

* `PEP 3108`_ deals with transition between Python 2.x and Python 3.x applied
   to standard library: some modules to be deleted, some to be renamed.

Other sources of inspiration

* `Martin Aspeli's article about names`_. Some parts of this proposal are
   quotes from this article.

* `The Hitchhiker's Guide to Packaging`_, which has an empty placeholder
   for "naming specification".

* and, of course, `in development official packaging documentation`_.


Before Python version 3.3, there are no official guidelines for naming projects,
distributions or packages/modules.
Current PEPs (see `Relationship with other PEPs`_) are very open on this topic.

Distribution authors have to follow their own intuition.

Several standards emerged from communities. As examples:

* `Plone`_ community uses "plone.*" namespace for official Plone products, and
   "collective.*" for community products. This is a convention explicitly
   promoted by Plone community.
   `Martin Aspeli's article about names`_ is about conventions and usages in
   Plone community.

   .. note:: Is there an official document about these conventions? A PLIP?

* most `Django`_ applications from community use "django-\*" pattern as
   distribution name. This is a de facto standard.

* many `Pyramid`_ applications from community use "pyramid_*" pattern as
   distribution name.

Thus, as `PyPI`_ testifies, distribution names and package/module names are
really heterogeneous.


Here are points this document tries to resolve.


When distribution authors come to choose a name, they can't find a unique
official guideline.
In such a situation, "Now is better than never" wins over "Refuse temptation
to guess".
So distribution authors follow one of the conventions they discovered (usually
the one from their community), or follow their own intuition.


As explained in `terminology`_ above, project, distribution and package names
can be assigned distinct values. That is a big source of confusion, especially
for Python developers who are not used to packaging.

Time loss

As a direct consequence of `ambiguity`_ and `confusion`_, new Python developers
spend too much time trying to understand the names of Python projects,
distributions and packages:

* they can't find obvious (i.e. official) naming conventions (or at least
   guidelines) even if they search for it. They have to ask community to resolve
   the `ambiguity`_.

* it's hard to resolve the `confusion`_ between names. It's even harder because
   community itself is a bit confused. Their best chance is to find one of the
   `Other sources of inspiration`_ listed above or ask a well informed person.

* developers from some other languages assume Python has official naming
   conventions for distributions and packages. So they search for them, and feel
   worried when they figure out that none exist.

Experienced Python users are less affected: they built their opinion in the
past and keep on following their habits.

Community partitioning

The global Python community is partitioned into opposed sub-communities:

* most Python developers are linked to at least one community (i.e. Zope,
   Plone, Pyramid, Django...).

* communities usually resolved naming conventions with official documents
   or with de facto usage.

* developers usually follow their community's standards.

* developers usually believe their community made the best choice. They usually
   adhere to community arguments.

* choices and reasons differ from one community to another.

* when Python communities meet, package names are a never-ending topic of

* people argue about package names when they should work together on more
   valuable stories.

* they can't settle the issue, because:

   * arguments have historical reasons. In history, these reasons were enough.

   * accepting someone else's arguments means changing habits, and maybe
     re-packaging existing projects, i.e. efforts and time.

   * there are no guidelines from a higher authority (i.e. python.org). There
     is no comparison standard. Both choices are legitimate.

An additional note about developers who belong to several communities:

* they usually adhere to the naming conventions from one community,
* it's hard to adopt another convention when contributing in another community.


As `The Zen of Python`_ says: "There should be one-- and preferably only one
--obvious way to do it."

So the proposal is:

* adopt strict conventions where Python community finds a consensus,
* provide guidelines or recipes for what cannot be covered by conventions.

What about existing projects?

It's impossible to **require** a change for every existing project, for obvious

But it is possible to first **document** existing naming conventions, then
**promote** a change.

This document proposes two things:

* a status on current existing naming conventions, inside each project or
   community. So that custom naming conventions are at least self-documented.
   See `Organize community contributions`_ for details.

* a `Transition plan`_ for those who are ready to migrate.

.. _`packagenames-opportunity`:


As of Python 3.3 being developed:

* many projects are not Python 3.x compatible. It includes "big" products or
   frameworks. It means that many projects will have to do a migration to
   support Python 3.x.

* packaging (aka distutils2) is on the starting blocks. When it is released,
   projects will be invited to migrate and use new packaging.

* `PEP 420`_ brings official support of namespace packages to Python.

It means that most active projects should be about to migrate in the next
year(s) to support Python 3.x, new packaging or new namespace packages.

Such an opportunity is unique and won't come again soon!
So let's introduce and promote naming conventions as soon as possible (i.e.

Transition plan

New distributions

In order of priority:

1. If the project belongs to a community (i.e. product/framework), **and** the
    community has official conventions, then follow community conventions.

    .. note::

       :ref:`Communities SHOULD organize contributions

    As an example, a new community project related to Plone should be distributed
    as "collective.*", because it is an explicit standard of the Plone

2. New projects SHOULD follow `Conventions`_ described in this document.

Existing projects

**There is no obligation for existing distributions to be renamed**. The choice
is left to distribution authors and maintainers for obvious reasons.

However, distribution authors are invited to `promote migrations`_.

In order to rename an existing distribution, follow `Renaming howto`_
guidelines below.

Promote migrations

Every Python developer should migrate whenever possible, or promote the
migrations in their respective communities.

Apply this convention on your projects, then the community will see it is

In particular, "leaders" such as authors of popular projects are influential:
they have power and, thus, responsibility over communities.

Apply these conventions on popular projects, then communities will adopt the
conventions too.

**Popular projects SHOULD promote migrations when they release a new (major)
version**, particularly :ref:`if this version introduces support for Python
3.x, new standard library's packaging or namespace packages

.. note::

    On the contrary, if popular projects refuse the conventions, communities
    may not adopt the conventions.

Improved handling of renamed distributions on PyPI

If many projects follow `Renaming howto`_, many legacy distributions will have
the following characteristics:

* ``Development Status :: 7 - Inactive`` classifier.
* latest version is empty, except packaging stuff.
* latest version "redirects" to another distribution. E.g. it has a single
   dependency on the renamed distribution.
* referenced as ``Obsoletes-Dist`` in a newer distribution.

So it will be possible to detect renamed distributions and improve readability
on PyPI, so that users can focus on active distributions. But this feature is
not required now. There is no urgency. It won't be covered in this document.


Rules that you SHOULD follow.

If in doubt, ask

If you feel unsure after reading the following conventions, ask `Python
community`_ on IRC or on a mailing list.

Use a single name

Distribute only one package (or only one module) in a distribution, and use
package (or module) name as project name and distribution name.

* It avoids possible confusion between all those names.
* It makes the name consistent.
* It is explicit: anyone who sees the distribution name can guess the package
   name, and vice versa.
* It also limits implicit clashes between package/module names.
   By using a single name, when you register a name to PyPI, you also perform a
   basic package/module name availability verification.

   As an example, `pipeline`_, `python-pipeline`_ and `django-pipeline`_ all
   distribute a package or module called "pipeline". So installing two of them
   leads to errors.
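The clash described above can be sketched as a simple inventory check. This is only an illustration: ``find_package_clashes`` is a hypothetical helper, and the inventory is written by hand from the example rather than read from PyPI.

```python
from collections import defaultdict


def find_package_clashes(provided):
    """Return the top-level names shipped by more than one distribution.

    ``provided`` maps a distribution name to the packages/modules it ships.
    """
    owners = defaultdict(list)
    for distribution, packages in provided.items():
        for package in packages:
            owners[package].append(distribution)
    return {pkg: sorted(dists) for pkg, dists in owners.items() if len(dists) > 1}


# Hand-written inventory based on the example above.
provided = {
    "pipeline": ["pipeline"],
    "python-pipeline": ["pipeline"],
    "django-pipeline": ["pipeline"],
    "kheops.pyramid": ["kheops.pyramid"],
}

clashes = find_package_clashes(provided)
print(clashes)
```

A distribution that uses a single name, such as the last entry, cannot clash with another registered distribution name by construction.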


Yes:

* Package name: "kheops.pyramid",
   i.e. ``import kheops.pyramid``

* Distribution name: "kheops.pyramid",
   i.e. ``pip install kheops.pyramid``

* Project name: "kheops.pyramid",
   i.e. ``git clone git@github.com/pharaohs/kheops.pyramid.git``


No:

* Package name: "kheops"
* Distribution name: "kheops-pyramid"
* Project name: "KheopsPyramid"

.. note::

    For historical reasons, on `PyPI`_, you can find many distributions using
    different values for project, distribution and package/module name.

Multiple packages/modules should be rare

Technically, Python distributions can provide multiple packages and/or modules.
See :ref:`setup script reference<packaging-setup-script>` for details.

Some distributions actually do.
As an example, `setuptools`_ and `distribute`_ both declare
"pkg_resources", "easy_install" and "site" modules in addition to their
respective "setuptools" and "distribute" packages.

Consider this use case as exceptional. In most cases, you don't need this
feature. So a distribution should provide only one package or module at a time.

Explicit distinct names should be rare

A notable exception to the "Use a single name" rule is when you explicitly
need distinct names.

As an example, the `Pillow`_ distribution is an alternative to the original
`PIL`_ distribution. They both provide a "PIL" package.

Consider this use case as exceptional. In most cases, you don't need this
feature. So a distributed package name should be equal to distribution name.

Follow PEP 8 for package names syntax

`PEP 8`_ applies to Python package and module names.

If you `Use a single name`_, `PEP 8`_ also applies to project and distribution
names. The exceptions are namespace packages, where dots are required in the

Pick meaningful names

Ask yourself "how would I describe in one sentence what this name is for?", and
then "could anyone have guessed that by looking at the name?".

When you are using namespaces, make sure each part is meaningful.

.. _`packagenames-ownership`:

Top level namespace relates to code ownership

This helps avoid clashes between distribution names.

Ownership could be:

* an individual.
   Example: `gp.fileupload`_ is owned and maintained by Gael Pasgrimaud.

* an organization.

   * `zest.releaser`_ is owned and maintained by Zest Software.
   * `Django`_ is owned and maintained by the Django Software Foundation.

* a group or community.
   Example: `sphinx`_ is maintained by developers of the Sphinx project, not
   only by its author, Georg Brandl.

* a group or community related to another package.
   Example: `collective.recaptcha`_ is owned by its author: David Glick,
   Groundwire. But the "collective" namespace is owned by Plone community.

Respect ownership

Understand the purpose of namespace before you use it.

**DO NOT** plug into a namespace you don't own, unless explicitly authorized.

`If in doubt, ask`_.

As an example, **DO NOT** use "django.contrib" namespace: it is managed by
Django's core contributors.

Exceptions CAN be defined by distribution authors. See `Organize community
contributions`_ below.

Private (including closed-source) distributions use a namespace

... because private distributions are owned by somebody. So apply the
:ref:`ownership rule<packagenames-ownership>`.

For internal/customer projects, use your company name as the namespace.

This rule applies to closed-source distributions.

As an example, if you are creating a "climbing" distribution for the "Python
Sport" company: use "pythonsport.climbing" name, even if it is closed source.
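One possible way to derive such a name is sketched below. ``namespaced_name`` is a hypothetical helper, and the normalization it applies (lowercase, ASCII letters and digits only) is an assumption made for illustration, not a rule stated in this document.

```python
import re


def namespaced_name(owner, project):
    """Build "<owner>.<project>" from free-form owner and project labels.

    Each part is lowercased and stripped of anything that is not an ASCII
    letter or digit (an assumption of this sketch).
    """
    def squash(text):
        return re.sub(r"[^a-z0-9]", "", text.lower())

    return f"{squash(owner)}.{squash(project)}"


print(namespaced_name("Python Sport", "climbing"))  # pythonsport.climbing
```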

Individual projects use a namespace

... because they are owned by individuals. So apply the :ref:`ownership rule

There is no shame in releasing a distribution as open source even if it has an
"internal" name.

If the project comes to a point where the author wants to change ownership
(i.e. the project no longer belongs to an individual), keep in mind :ref:`it is
easy to rename the project<packagenames-rename>`.

Independent community Python projects CAN avoid namespaces

If your project is generic enough (i.e. it is not a contrib to another product
or framework), you CAN avoid namespaces. The base condition is generally that
your project is owned by a group (i.e. the development team) which is dedicated
to this project.

Only use a "shared" namespace if you really intend the code to be community

As an example, `sphinx`_ project belongs to the Sphinx development team.

In doubt, use an individual/organization namespace

If your project is not mature or hasn't been proven useful to a community,
best choice is to use an individual or organization namespace.

It allows distribution authors to release projects early.

And it doesn't block future changes. When a project becomes mature, and if it
appears there is no reason to keep individual ownership, :ref:`it remains
possible to rename the project<packagenames-rename>`.

Avoid deep nesting

`The Zen of Python`_ says:

   Flat is better than nested.

Two levels is almost always enough

Don't define everything in deeply nested hierarchies: you will end up with
distributions and packages like "pythonsport.common.maps.forest". This type
of name is both verbose and cumbersome (e.g. if you have many imports from the
Furthermore, big hierarchies tend to break down over time as the boundaries
between different packages blur.

The consensus is that two levels of nesting are preferred.

Yes: "pyranha"

Yes: "pythonsport.climbing"

Yes: "pythonsport.forestmap"

No: "pythonsport.maps.forest"

.. _`packagenames-othermetadata`:

Limited namespace levels, unlimited metadata

Consider distribution names (with or without namespaces) as unique identifiers
on PyPI.
It is important that these identifiers remain human-readable.
It is even better when these identifiers are meaningful.
But their primary purpose is not to classify or describe distributions.

As examples, if you only look at the name:

* you can't guess "nose" is about testing,
* or "celery" about distributed task queueing,
* or that "lettuce" is about tests, and has nothing in common with "celery".

The examples above are not problematic.

**`Classifiers`_ and keywords metadata are made for categorization of

As an example, there is a "Framework :: TurboGears" classifier. Even if names
are quite heterogeneous (they don't follow a pattern like collective.* for
Plone community projects), we get the list.

In order to `Organize community contributions`_, conventions about names and
namespaces matter, but conventions about metadata should be even more

As an example, we can find Plone portlets in many places:

* plone.portlet.*
* collective.portlet.*
* collective.portlets.*
* collective.*.portlets
* some vendor-related distributions such as "quintagroup.portlet.cumulus"
* and even distributions where "portlet" pattern doesn't appear...

Even if Plone community has conventions, using the name to categorize
distributions is inappropriate. It's impossible to get the full list of
distributions that provide portlets for Plone by filtering on names.
But it would be possible if all these distributions used "Framework :: Plone"
classifier and "portlet" keyword.
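The difference between the two approaches can be sketched as follows. The records are hypothetical and exist only for illustration: none of the ``example.*`` names refers to a real distribution, and the dictionary layout is an assumption rather than the format of any index.

```python
# Hypothetical metadata records; the "example.*" names are made up.
DISTRIBUTIONS = [
    {"name": "example.portlet.weather",
     "classifiers": ["Framework :: Plone"], "keywords": ["portlet"]},
    {"name": "example.sidebar",
     "classifiers": ["Framework :: Plone"], "keywords": ["portlet", "sidebar"]},
    {"name": "example.blog",
     "classifiers": ["Framework :: Plone"], "keywords": ["blog"]},
]


def by_name(records, pattern):
    """Select distributions whose name contains ``pattern``."""
    return [r["name"] for r in records if pattern in r["name"]]


def by_metadata(records, classifier, keyword):
    """Select distributions carrying both the classifier and the keyword."""
    return [
        r["name"] for r in records
        if classifier in r["classifiers"] and keyword in r["keywords"]
    ]


print(by_name(DISTRIBUTIONS, "portlet"))
print(by_metadata(DISTRIBUTIONS, "Framework :: Plone", "portlet"))
```

Filtering on the name misses ``example.sidebar``, which provides a portlet without saying so in its name; filtering on metadata finds it.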

Do you really need 3 levels?

For example, we have ``plone.principalsource`` instead of
``plone.source.principal`` or something like that. The name is shorter, the
package structure is simpler, and there would be very little to gain from
having three levels of nesting here. It would be impractical to try to put all
"core Plone" sources (a source is kind of vocabulary) into the
``plone.source.*`` namespace, in part because some sources are part of other
packages, and in part because sources already exist in other places. Had we
made a new namespace, it would be inconsistently used from the start.

3 levels are also tempting when:

* you are plugging into a community namespace, such as "collective".
* and you want to add a more restrictive "ownership" level, to avoid clashes
   inside the community.

In such a case, you'd better use the most restrictive ownership level as first

As an example, where "collective" is a major community namespace that
"gergovie" belongs to, and "vercingetorix" is the name of the author of "gergovie":

No: "collective.vercingetorix.gergovie"

Yes: "vercingetorix.collectivegergovie"

3 levels are supported for historical reasons

Even if not recommended, 3 levels are supported. This is mainly for historical
reasons: 3 levels can be accepted where top level namespace owner explicitly
allows it with a specific convention. See `Organize community contributions`_
for details.

Don't use more than 3 levels

* 1 or 2 levels are recommended.
* 3 levels are discouraged, but supported for historical reasons.
* you shouldn't need more than 3 levels.

.. note::

    Even communities where namespaces are standard don't use more than 3 levels.
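The nesting rules above can be summarized in a small check. This is a sketch only: ``nesting_advice`` and its three labels are hypothetical names, and counting dot-separated parts is the only thing it does.

```python
def namespace_depth(name):
    """Count the dot-separated levels in a distribution or package name."""
    return len(name.split("."))


def nesting_advice(name):
    """Classify a name according to the nesting rules above."""
    depth = namespace_depth(name)
    if depth <= 2:
        return "recommended"
    if depth == 3:
        return "discouraged"
    return "avoid"


for candidate in ("pyranha", "pythonsport.climbing",
                  "pythonsport.maps.forest", "pythonsport.common.maps.forest"):
    print(candidate, "->", nesting_advice(candidate))
```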

.. _`packagenames-organizecommunities`:

Organize community contributions


* Choose a naming convention for community contributions.

* If it is not :ref:`the default<packagenames-contribnamespace>`, document it.

   * if you use the :ref:`default convention<packagenames-contribnamespace>`,
     this document should be enough. Don't repeat it. You MAY reference it.

   * else, tell users about custom conventions in project's "contribute" or
     "create modules" documentation.

* Also recommend the use of additional metadata, such as :ref:`classifiers and

About convention choices:

* New projects SHOULD choose the default scheme.

* Existing projects with community contributions CAN start with custom
   conventions. Then they SHOULD `Promote migrations`_.

   It means that existing community conventions don't need to be changed.
   But they need to be explicitly documented: first describe the current naming
   conventions, then the future ones.

Example: "pyranha" is your project name, distribution name and package name.
Tell contributors that:

* pyranha-related distributions should use the "pyranha" keyword

* pyranha distributions providing templates should also use "templates"

* community contributions should be released under "pyranhacontrib" namespace
   (i.e. use "pyranhacontrib.*" pattern).

.. _`packagenames-contribnamespace`:

Community contributions SHOULD use "${DIST}contrib.*" pattern

The idea is to use a standard pattern to store community contributions for any
product or framework.

It is the simplest way to `Organize community contributions`_: the obvious way
to go is "${DIST}contrib", no ambiguity.

As an example:

* you are the author of "pyranha" project. You own the "pyranha" namespace.
* a third-party developer wants to publish a "giantteeth" project related to
   your "pyranha" project. They can publish it as "pyranhacontrib.giantteeth".
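A minimal sketch of the pattern, assuming two hypothetical helpers: ``contrib_name`` builds a name following the default convention, and ``contrib_namespace_of`` recovers the ``${DIST}`` a name claims to contribute to.

```python
CONTRIB_SUFFIX = "contrib"


def contrib_name(dist, project):
    """Build a community contribution name following "${DIST}contrib.*"."""
    return f"{dist}{CONTRIB_SUFFIX}.{project}"


def contrib_namespace_of(name):
    """Return the ${DIST} part of a "${DIST}contrib.*" name, else None."""
    top_level = name.split(".", 1)[0]
    is_contrib = (
        "." in name
        and top_level.endswith(CONTRIB_SUFFIX)
        and top_level != CONTRIB_SUFFIX
    )
    return top_level[:-len(CONTRIB_SUFFIX)] if is_contrib else None


print(contrib_name("pyranha", "giantteeth"))
print(contrib_namespace_of("pyranhacontrib.giantteeth"))
```

Because the pattern is mechanical, anyone who knows the project name can guess where its community contributions live.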

.. note::

    Why ``${DIST}contrib.*`` pattern?

    * ``${DIST}c.*`` is not explicit enough. As examples, "zc" belongs to
      "Zope Corporation" whereas "z3c" belongs to "Zope 3 community".

    * ``${DIST}community`` is too long.

    * ``${DIST}community`` conflicts with existing namespaces such as
      "iccommunity" or "PyCommunity".

    * ``${DIST}.contrib`` is inside ${DIST} namespace, i.e. it is owned by
      ${DIST} authors. It breaks the `Top level namespace relates to code

    * ``${DIST}.contrib.*`` breaks the `Avoid deep nesting`_ rule.

    * names where ``${DIST}`` doesn't appear are not explicit enough, i.e.
      nobody can guess they are related to ``${DIST}``.

    * ``${DIST}contrib.*`` may conflict with existing ``sphinxcontrib-*``
      packages. But ``sphinxcontrib-*`` is actually about Sphinx contrib, so
      this is not a real conflict... In fact, the "contrib" suffix was inspired
      by "sphinxcontrib".


How to avoid duplicate names

Before you choose a distribution name, make sure it hasn't already been
registered in the following locations:

* `PyPI`_
* Popular code repositories such as:

   * `Github`_
   * `Bitbucket`_
   * `Gitorious`_

* `djangopackages.com`_

.. note:: A web service would be welcome for this!

Also make sure the package name hasn't already been registered:

* in the `Python Standard Library`_,
* in the locations where you checked for distribution name availability.
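Part of the package name check can be done locally. The sketch below only tells whether a top-level name is already importable in the current environment (standard library or installed distributions); it says nothing about PyPI or code hosting sites. ``top_level_name_is_taken`` is a hypothetical helper built on ``importlib.util.find_spec``.

```python
import importlib.util


def top_level_name_is_taken(package_name):
    """Return True if the top-level part of ``package_name`` is importable.

    Only the current environment is inspected, so a False result does not
    prove the name is free elsewhere.
    """
    top_level = package_name.split(".", 1)[0]
    return importlib.util.find_spec(top_level) is not None


print(top_level_name_is_taken("json"))
```

A name such as ``json`` is reported as taken because it belongs to the standard library.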

.. _`packagenames-rename`:

Renaming howto

Renaming a project is possible, but keep in mind that it will cause some
confusion. So, pay particular attention to README and documentation, so that
users understand what happened.

#. First of all, **do not remove the legacy distribution from PyPI**, because some
    users may still be using it.

#. Copy the legacy project, then change names (project, distribution and
    package/module). Pay attention to, at least:

    * packaging files,
    * folder name that contains source files,
    * documentation, including README,
    * import statements in code.

#. Assign ``Obsoletes-Dist`` metadata to new distribution in setup.cfg file.
    See `PEP 345 about Obsolete-Dist`_ and :ref:`setup.cfg specification

#. Release the renamed distribution as a new version, then publish it.

#. Edit legacy distribution:

    * add dependency to new distribution,
    * drop everything except packaging stuff,
    * add the ``Development Status :: 7 - Inactive`` classifier in setup script,
    * publish a new release.

So, users of the legacy package:

* can continue using the legacy distribution at a deprecated version,
* can upgrade to the latest version of the legacy distribution, which is empty, ...
* ... and automatically download new distribution as a dependency of the legacy

Users who discover the legacy distribution see it is inactive.
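The metadata side of the steps above can be sketched as a pair of dictionaries keyed by PEP 345 field names. ``rename_metadata`` is a hypothetical helper, the names reuse the "kheops" example from this document, and a real release would carry more fields (version, summary, and so on).

```python
INACTIVE = "Development Status :: 7 - Inactive"


def rename_metadata(old_name, new_name):
    """Return (new, legacy) metadata for a renamed distribution.

    Keys are PEP 345 field names; only rename-related fields are shown.
    """
    new = {
        "Name": new_name,
        "Obsoletes-Dist": [old_name],   # the new name replaces the old one
    }
    legacy = {
        "Name": old_name,
        "Requires-Dist": [new_name],    # the empty release pulls in the new one
        "Classifier": [INACTIVE],
    }
    return new, legacy


new, legacy = rename_metadata("kheops-pyramid", "kheops.pyramid")
print(new)
print(legacy)
```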


.. target-notes::

.. _`Martin Aspeli's article about names`:
.. _`PEP 1`: http://www.python.org/dev/peps/pep-0001/
.. _`The Zen of Python`: http://www.python.org/dev/peps/pep-0020/
.. _`PEP 8`: http://www.python.org/dev/peps/pep-0008/#package-and-module-names
.. _`PEP 345`: http://www.python.org/dev/peps/pep-0345/
.. _`PEP 420`: http://www.python.org/dev/peps/pep-0420/
.. _`PEP 3108`: http://www.python.org/dev/peps/pep-3108
.. _`The Hitchhiker's Guide to Packaging`:
.. _`in development official packaging documentation`:
.. _`plone`: http://plone.org/community/develop
.. _`django`: http://djangoproject.com/
.. _`pyramid`: http://pylonsproject.org
.. _`pypi`: http://pypi.python.org
.. _`django-debug-toolbar`:
.. _`gp.fileupload`: http://pypi.python.org/pypi/gp.fileupload/
.. _`zest.releaser`: http://pypi.python.org/pypi/zest.releaser/
.. _`sphinx`: http://sphinx.pocoo.org
.. _`Classifiers`: http://pypi.python.org/pypi?:action=list_classifiers
.. _`collective.recaptcha`: http://pypi.python.org/pypi/collective.recaptcha/
.. _`Python community`: http://www.python.org/community/
.. _`pipeline`: http://pypi.python.org/pypi/pipeline/
.. _`python-pipeline`: http://pypi.python.org/pypi/python-pipeline/
.. _`django-pipeline`: http://pypi.python.org/pypi/django-pipeline/
.. _`setuptools`: http://pypi.python.org/pypi/setuptools
.. _`distribute`: http://packages.python.org/distribute/
.. _`Pillow`: http://pypi.python.org/pypi/Pillow/
.. _`PIL`: http://pypi.python.org/pypi/PIL/
.. _`Python Standard Library`: http://docs.python.org/library/index.html
.. _`github`: https://github.com
.. _`bitbucket`: https://bitbucket.org
.. _`gitorious`: https://gitorious.org/
.. _`djangopackages.com`: http://djangopackages.com
.. _`PEP 345 about Obsolete-Dist`:
