[Python-Dev] PEP 423 : naming conventions and recipes related to packaging

Benoît Bryon benoit at marmelune.net
Fri Jul 5 09:38:36 CEST 2013


Hi!

Attached is an updated proposal for PEP 423.
You can also find it online at https://gist.github.com/benoitbryon/2815051

I am attending EuroPython 2013 in Florence. Isn't it a great 
opportunity to get feedback and discuss a PEP? I registered an 
open-space session and a lightning talk today!

Some notes about the update...

The main point that was discussed in the previous proposal was the 
"top-level namespace relates to code ownership" rule. Here is a quote 
from Antoine Pitrou:

Le 27/06/2012 12:50, Antoine Pitrou a écrit :
> On Wed, 27 Jun 2012 11:08:45 +0200
> Benoît Bryon<benoit at marmelune.net>  wrote:
>> Hi,
>>
>> Here is an informational PEP proposal:
>> http://hg.python.org/peps/file/52767ab7e140/pep-0423.txt
>>
>> Could you review it for style, consistency and content?
> There is one Zen principle this PEP is missing:
>
> Flat is better than nested.
>
> This PEP seems to promote the practice of having a top-level namespace
> denote ownership. I think it should do the reverse: promote
> meaningful top-level packages (e.g. "sphinx") as standard practice, and
> allow an exception for when a piece of software is part of a larger
> organizational body.

So, the main change in the proposal I'm sending today is the removal of 
this "ownership" rule.
It has been replaced by "Use a single namespace (except special cases)".

Some additional changes have been made, such as the removal of some 
sections about "opportunity" or "promote migrations". I also added a 
"Rationale" section where I pointed out some issues related to naming.

The PEP has been marked as "deferred" because it was inactive and it is 
partly related to PEP 426. I left this deferred state.

I am aware that some links in the PEP are broken... I will fix them 
later. My very first motivation is to get feedback about the "big" 
changes in the PEP. I wanted the update to be sent before 
EuroPython-2013's open-space session. I guess a detailed review would be 
nice anyway, for links, style, grammar...

Also, I wonder whether the PEP could be shortened or not. Sometimes I 
cannot find straightforward words to explain things, so perhaps someone 
with better skills in the English language could help. Or maybe some 
parts, such as the "How to rename a project" section, could be moved to 
other documents.

Regards,

Benoît
-------------- next part --------------
PEP: 423
Title: Naming conventions and recipes related to packaging
Version: $Revision$
Last-Modified: $Date$
Author: Benoît Bryon <benoit at marmelune.net>
Discussions-To: <distutils-sig at python.org>
Status: Deferred
Type: Informational
Content-Type: text/x-rst
Created: 24-May-2012
Post-History: 5-Jul-2013


Abstract
========

This document deals with:

* names of Python projects,
* names of Python packages or modules being distributed,
* namespace packages.

It provides guidelines and recipes for distribution authors:

* new projects should follow the `guidelines <#overview>`_ below.

* existing projects should be aware of these guidelines and can
  follow `specific recipes for existing projects
  <#how-to-apply-naming-guidelines-on-existing-projects>`_.


PEP Deferral
============

Further consideration of this PEP has been deferred at least until
after PEP 426 (package metadata 2.0) and related updates have been
resolved.


Rationale: issues related to names
==================================

For a long time, there has been no official reference on the "how to
choose names" topic in the Python community. As a consequence, the
Python package index (`PyPI`_) contains many naming patterns.

The fact is that this heterogeneity causes some issues. Some of them
are described below.

.. note:: Examples were taken in July 2013.

Naming things is a hard task, and naming Python projects or packages
is no exception. The purpose of this PEP is to help project authors
avoid common naming traps and focus on valuable things.

Clashes
-------

Project names are unique on `PyPI`_, but names of distributed things
(packages, modules) are not, so clashes occur.

As an example, "pysendfile" and "django-sendfile" projects both
distribute a "sendfile" package. Users cannot use both in an
environment.
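
Such clashes can be detected mechanically. The following is a minimal
sketch, assuming a hypothetical mapping from project names to the
top-level packages they distribute. The mapping is illustrative data
built from the example above, not something fetched from PyPI, and
``find_clashes`` is a name chosen for this illustration only.

.. code:: python

   from collections import defaultdict


   def find_clashes(distributed):
       """Return {package: sorted projects} for packages shipped twice or more."""
       providers = defaultdict(set)
       for project, packages in distributed.items():
           for package in packages:
               providers[package].add(project)
       return {
           package: sorted(projects)
           for package, projects in providers.items()
           if len(projects) > 1
       }


   # Illustrative data taken from the example above.
   DISTRIBUTED = {
       "pysendfile": ["sendfile"],
       "django-sendfile": ["sendfile"],
       "sphinx": ["sphinx"],
   }

   print(find_clashes(DISTRIBUTED))
   # {'sendfile': ['django-sendfile', 'pysendfile']}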

Deep nested hierarchies
-----------------------

Deeply nested namespaces mean deeply nested hierarchies, which
obscure valuable project contents.

As an example, with "plone.app.content" you get a deeply nested
directory hierarchy:

.. code:: text

   plone/
   └── app/
       └── content/
           └── ... valuable code is here...

Whereas, with flat packages like "sphinx", you have valuable
code near the top-level directory:

.. code:: text

   sphinx/
   └── ... valuable code is here...

Unrelated namespaces
--------------------

When project names are made of nested namespaces, and these
namespaces are not strongly related, then there is confusion.

As an example, it is not obvious that the "zc.rst2" project is a
general Python project, not tied to "zc" (Zope Corporation), but
related to docutils' reStructuredText. As a consequence, some users
discard the "zc.rst2" project, because they think it is specific to
"zc".

This issue occurs with branded namespaces, i.e. when top-level
namespace relates to an organization and various projects are put
into this namespace.

This issue also occurs when namespaces are used for categorization.

Inconsistent names
------------------

When project and distributed packages do not share a single name,
users have to think when they use names (install project or import
package).

As an example, which package does the "django-pipeline" project
distribute? Is it "djangopipeline", "django_pipeline" or "pipeline"?
The answer is not obvious and the pattern varies depending on the
project. Users have to remember or search for the name to use.

There is no obvious bijection between project name and
package/module name. It means that a user has to remember at least
two names (project and package) whereas one could be enough.

This is also a cause of clashes: since package names are not
predictable, registering a project name on `PyPI`_ tells you nothing
about package name clashes.
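
To illustrate the ambiguity, here is a minimal sketch that lists the
import names a user might try for a hyphenated project name. The
three heuristics and the ``candidate_package_names`` helper are
assumptions made for this illustration; they are not rules applied by
PyPI or by any installer.

.. code:: python

   def candidate_package_names(project_name):
       """List import names a user might try for a hyphenated project name."""
       lowered = project_name.lower()
       candidates = [
           lowered.replace("-", ""),   # e.g. "djangopipeline"
           lowered.replace("-", "_"),  # e.g. "django_pipeline"
           lowered.split("-")[-1],     # e.g. "pipeline"
       ]
       # Drop duplicates while keeping the order of first appearance.
       return list(dict.fromkeys(candidates))


   print(candidate_package_names("django-pipeline"))
   # ['djangopipeline', 'django_pipeline', 'pipeline']
   print(candidate_package_names("kheops"))
   # ['kheops']

A project that uses a single name yields a single candidate, which is
the point of the guideline.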


Terminology
===========

This PEP uses the terminology defined in `packaging terminology
in Python documentation`_.


Relationship with other PEPs
============================

* `PEP 8`_ deals with code style guide, including names of Python
  packages and modules. It covers syntax of package/modules names.

* `PEP 426`_ deals with packaging metadata, and defines name argument
  of the ``packaging.core.setup()`` function.

* `PEP 420`_ deals with namespace packages. It brings support of
  namespace packages to Python core. Before, namespace packages were
  implemented by external libraries.

* `PEP 3108`_ deals with transition between Python 2.x and Python 3.x
  applied to standard library: some modules to be deleted, some to be
  renamed. It points out that naming conventions matter and is an
  example of transition plan.


Overview
========

Here is a summarized list of recommendations for you to choose names:

* `understand and respect namespace ownership
  <#understand-and-respect-namespace-ownership>`_.

* if your project is related to another project or community, first
  search for conventions in main project's documentation, then:

  * `follow specific project or related community conventions
    <#follow-community-or-related-project-conventions-if-any>`_, if
    any.

  * else (there is no specific convention), `follow a standard naming pattern
    <#use-standard-pattern-for-community-contributions>`_.

* make sure names are unique, i.e. avoid duplicates:

  * `check for name availability`_,
  * `register names with PyPI`_.

  The exception is when you explicitly want to distribute alternatives
  to existing packages or modules.

* `Use a single name`_. It implies `a project distributes a single
  package or module <#multiple-packages-modules-should-be-rare>`_

* `distribute only one package or module at a time
  <#multiple-packages-modules-should-be-rare>`_, unless you are in
  a special case.

* make it easy to discover and remember your project:
 
  * `pick memorable names`_,
  * `pick meaningful names`_,
  * `use packaging metadata`_.

* `avoid deep nesting`_:
 
  * one single level is the recommended way to go.

  * two levels can be used to point out strict relationships: the
    second level is specific to the first one. Main use cases are
    community contributions related to one project, and
    vendor specific projects.

  * you should not need more than two levels. Having more than three
    levels is strongly discouraged.

* `follow PEP 8`_ for syntax of package and module names.

* if, for some reason, your project does not follow the
  recommendations above, `document specific naming policy`_.
  In particular, projects which are receiving community
  contributions should `organize community contributions`_.

* `if still in doubt, ask <#if-in-doubt-ask>`_.


If in doubt, ask
================

If you feel unsure after reading this document, ask `Python
community`_ on IRC or on a mailing list.


Understand and respect namespace ownership
==========================================

On `PyPI`_, all projects are put at index root. There is no owner or
user level. PyPI cannot host two projects with the exact same name,
even if owners are different. One name relates to one project, which
relates to one set of owners.

.. note:: A project's ownership can be held by several users.

The top-level namespace relates to ownership.
As an example, `Django`_ is owned and maintained by the Django
Software Foundation.

Understand the purpose of namespace before you use it.

Do not plug into a namespace you do not own, unless explicitly
authorized. `If in doubt, ask`_.
As an example, do not plug into the "django.contrib" namespace
because it is managed by Django's core contributors.

Project owners may define exceptions. See `Organize community
contributions`_ below.
As an example, the `flask`_ project explicitly invites contributors
to release projects in the "flask.ext" namespace.

Also, whenever possible, take non-Python projects into account.
As an example, you should not use "apache" as top-level namespace:
"Apache" is the name of another (non-Python) project. This is advice
rather than a strict rule, but it could help identify your project on
the internet or prevent some trademark issues.

Private projects may use a namespace
------------------------------------

For internal/customer projects, feel free to use the company or main
project name as the namespace. But keep in mind that, if a project is
general purpose (i.e. not specific to your company or to some main
project), then one level should be enough.

This rule applies to closed-source projects.

As an example, if you are creating a "climbing" project that is
specific to the "Python Sport" company: you may use
"pythonsport.climbing" name, even if it is closed source.

Use a single name
=================

Distribute only one package (or only one module) per project, and use
package (or module) name as project name.

* It avoids possible confusion between project name and distributed
  package or module name.

* It makes the name consistent.

* It is explicit: anyone who sees the project name can guess the
  package/module name, and vice versa.

* It also limits implicit clashes between package/module names.
  By using a single name, when you register a project name to
  `PyPI`_, you also perform a basic package/module name availability
  verification.

  As an example, `pipeline`_, `python-pipeline`_ and
  `django-pipeline`_ all distribute a package or module called
  "pipeline". So installing two of them leads to errors. This issue
  wouldn't have occurred if these distributions used a single name.

* As a bonus, it allows easier setup of the project: you provide the
  name of the package/module you want to distribute, and you get
  other names.

Example:

* Yes: Package name is "kheops" and project name is "kheops".

* Yes: Package name is "kheops.history", i.e.
  ``import kheops.history`` and project name is "kheops.history",
  i.e. ``pip install kheops.history``.

* No: Package name is "kheops" and project name is "KheopsPyramid".
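
The rule can be expressed as a one-line check. This is a minimal
sketch; ``uses_single_name`` is a hypothetical helper, and the list of
distributed packages is assumed to be known by the caller.

.. code:: python

   def uses_single_name(project_name, packages):
       """True if the project ships exactly one package named like the project."""
       return list(packages) == [project_name]


   print(uses_single_name("kheops", ["kheops"]))                  # True
   print(uses_single_name("kheops.history", ["kheops.history"]))  # True
   print(uses_single_name("KheopsPyramid", ["kheops"]))           # False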

.. note::

   For historical reasons, `PyPI`_ contains many distributions where
   project and distributed package/module names differ.

Multiple packages/modules should be rare
----------------------------------------

Technically, Python distributions can provide multiple packages
and/or modules. See `setup script reference`_ for details.

Some distributions actually do.
As an example, `setuptools`_ and `distribute`_ are both declaring
"pkg_resources", "easy_install" and "site" modules in addition to
respective "setuptools" and "distribute" packages.

Consider this use case as exceptional. In most cases, you do not need
this feature and distributing a single package or module is enough.

Distinct names should be rare
-----------------------------

A notable exception to the `Use a single name`_ rule is when you
explicitly need distinct names.

As an example, the `Pillow`_ project provides an alternative to the
original `PIL`_ distribution. Both projects distribute a "PIL"
package.

Consider this use case as exceptional. In most cases, you don't need
this feature and naming the project after the distributed
package/module is enough.

Follow PEP 8
============

`PEP 8`_ applies to names of Python packages and modules.

If you `Use a single name`_, `PEP 8`_ also applies to project names.
The exceptions are namespace packages, where dots are required in
project name.
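
As a rough illustration, the syntax check can be approximated with a
regular expression applied to each dot-separated part of a name. This
is a sketch under the assumption that "short, all-lowercase, optional
underscores" is the part of `PEP 8`_ that matters here; it does not
encode the full text of PEP 8, and ``follows_pep8_syntax`` is a
hypothetical helper.

.. code:: python

   import re

   # One dotted part: a lowercase letter first, then lowercase letters,
   # digits or underscores. This approximates PEP 8, nothing more.
   _PART = re.compile(r"[a-z][a-z0-9_]*\Z")


   def follows_pep8_syntax(name):
       """Check every dot-separated part of a package or project name."""
       return all(_PART.match(part) for part in name.split("."))


   print(follows_pep8_syntax("kheops"))           # True
   print(follows_pep8_syntax("kheops.history"))   # True
   print(follows_pep8_syntax("KheopsPyramid"))    # False
   print(follows_pep8_syntax("django-pipeline"))  # False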


Pick memorable names
====================

One important thing about a project name is that it be memorable.

As an example, `celery`_ is not a meaningful name. At first, it is
not obvious that it deals with message queuing. But it is memorable
because it can be used to feed a `RabbitMQ`_ server.


Pick meaningful names
=====================

Ask yourself "how would I describe in one sentence what this name is
for?", and then "could anyone have guessed that by looking at the
name?".

As an example, `DateUtils`_ is a meaningful name. It is obvious that
it deals with utilities for dates.

When you are using namespaces, try to make each part meaningful.

.. note::

   Sometimes, you cannot find a name that is both memorable and
   meaningful. In such a situation, consider that being memorable is
   more important than being meaningful. Generally, by choosing a
   memorable name, you tend to make the name unique, and simplicity
   is a key to success. A meaningful name, by contrast, tends to be
   similar to names of other projects that deal with the same
   concepts, and tends to be made out of keywords/buzzwords.


Use packaging metadata
======================

Consider that **project names are unique identifiers on PyPI**, i.e.
their primary purpose is to identify, not to classify or describe.

**Classifiers and keywords metadata are made for categorization.**
Summary and description metadata are meant to describe the project.

As an example, there is a "`Framework :: Twisted`_" classifier. Names
of projects that have this classifier are quite heterogeneous. They
do not follow a particular pattern to claim relation with Twisted.
But we get the list using the classifier and that is fine.

In order to `Organize community contributions`_, conventions about
names and namespaces matter, but conventions about metadata are
important too.

As an example, we can find Plone portlets in many places:

* plone.portlet.*
* collective.portlet.*
* collective.portlets.*
* collective.*.portlets
* some vendor-related projects such as "quintagroup.portlet.cumulus"
* and even projects where "portlet" pattern doesn't appear in the
  name.

Even if the Plone community has conventions, using the name to
categorize distributions is inappropriate. It's impossible to get the
full list of distributions that provide portlets for Plone by
filtering on names. But it would be possible if all these
distributions used the "Framework :: Plone" classifier and the
"portlet" keyword.

When you release a project on `PyPI`_, you obviously want your
project to be visible and findable. Keep in mind that the name is
not the only means to make a project discoverable. If you do care
about your project's visibility, take care of package metadata,
including keywords, classifiers, README... And you may also take care
of the project's documentation and other things not related to
packaging.
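
The Plone portlet example can be sketched as a filter on metadata
instead of names. The metadata below is hypothetical: only
"quintagroup.portlet.cumulus" is a project name taken from this
document, and its classifiers and keywords are assumed for the sake
of the example. ``find_by_metadata`` is likewise an illustrative
helper, not a PyPI API.

.. code:: python

   def find_by_metadata(projects, classifier, keyword):
       """Select project names by classifier and keyword, ignoring the name."""
       return sorted(
           name
           for name, metadata in projects.items()
           if classifier in metadata["classifiers"]
           and keyword in metadata["keywords"]
       )


   # Hypothetical metadata, for illustration only.
   PROJECTS = {
       "quintagroup.portlet.cumulus": {
           "classifiers": ["Framework :: Plone"],
           "keywords": ["portlet", "tagcloud"],
       },
       "collective.example": {  # hypothetical: no "portlet" in the name
           "classifiers": ["Framework :: Plone"],
           "keywords": ["portlet"],
       },
       "pyranha": {  # hypothetical: unrelated to Plone
           "classifiers": ["Framework :: Twisted"],
           "keywords": ["fish"],
       },
   }

   print(find_by_metadata(PROJECTS, "Framework :: Plone", "portlet"))
   # ['collective.example', 'quintagroup.portlet.cumulus']

The second project is found even though "portlet" does not appear in
its name, which filtering on names could not achieve.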


Avoid deep nesting
==================

`The Zen of Python`_ says:

  Flat is better than nested.

A single level is recommended
-----------------------------

In most cases, one level is enough. So, unless you are in a special
situation mentioned below, your project name should be made of a
single namespace.

Lower levels indicate strict relationship to upper levels
---------------------------------------------------------

In nested namespaces, lower levels point out strict relationship to
higher ones. It means the second level is specific to the first one.

Main use cases are community contributions related to one project
and vendor specific (mostly private) projects.

Two levels is almost always enough
----------------------------------

Don't define everything in deeply nested hierarchies: you will end up
with projects and packages like "pythonsport.common.maps.forest".
This type of name is both verbose and cumbersome (e.g. if you have
many imports from the package).

Furthermore, big hierarchies tend to break down over time as the
boundaries between different packages blur.

The consensus is that two levels of nesting are preferred.

For example, we have ``plone.principalsource`` instead of
``plone.source.principal`` or something like that. The name is
shorter, the package structure is simpler, and there would be very
little to gain from having three levels of nesting here. It would be
impractical to try to put all "core Plone" sources (a source is kind
of vocabulary) into the ``plone.source.*`` namespace, in part because
some sources are part of other packages, and in part because sources
already exist in other places. Had we made a new namespace, it would
be inconsistently used from the start.

Yes: "pythonsport.climbing"

Yes: "pythonsport.forestmap"

No: "pythonsport.maps.forest"

Do not use namespace levels for categorization
----------------------------------------------

`Use packaging metadata`_ instead.

Don't use more than 3 levels
----------------------------

Technically, you have the ability to create deeply nested
hierarchies. However, it is strongly discouraged.
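
The guidance of this section can be summarized as a small function
that counts dot-separated levels. This is a sketch: the
``nesting_advice`` helper and the wording of the returned strings are
assumptions that paraphrase the recommendations above.

.. code:: python

   def nesting_advice(name):
       """Map the number of dot-separated levels to the guidance above."""
       levels = len(name.split("."))
       if levels == 1:
           return "recommended"
       if levels == 2:
           return "acceptable for strict relationships"
       if levels == 3:
           return "not needed in most cases"
       return "strongly discouraged"


   print(nesting_advice("sphinx"))
   # recommended
   print(nesting_advice("pythonsport.climbing"))
   # acceptable for strict relationships
   print(nesting_advice("pythonsport.common.maps.forest"))
   # strongly discouraged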


Document specific naming policy
===============================

A project that does not follow this PEP's recommendations should
mention and explain it in documentation.

This rule is the simplest way to make an existing project comply with
this PEP, without a rename.


Follow community or related project conventions, if any
=======================================================

Projects or related communities can have specific naming conventions,
which may differ from those explained in this document. Specific
conventions override this PEP.

This rule exists for backward-compatibility purpose: new projects
should follow this PEP's conventions.

In such a case, `they should declare specific conventions in
documentation <#organize-community-contributions>`_.

So, if your project belongs to another project or to a community,
first look for specific conventions in main project's documentation.

If there are no specific conventions, follow the ones declared in
this document.

As an example, `Plone community`_ releases community contributions in
the "collective" namespace package. It differs from the `standard
namespace for contributions
<#use-standard-pattern-for-community-contributions>`_ proposed here.
But since it is documented, there is no ambiguity and you should
follow this specific convention.


Use standard pattern for community contributions
================================================

When no specific rule is defined, use the
``{MAINPROJECT}contrib.{PROJECT}`` pattern to store community
contributions for any product or framework, where:

* ``{MAINPROJECT}`` is the name of the related project. "pyranha" in
  the example below.

* ``{PROJECT}`` is the name of your project. "giantteeth" in the
  example below.

As an example:

* you are the author of the "pyranha" project.

* you did not define specific naming conventions for community
  contributions.

* a third-party developer wants to publish a "giantteeth" project
  related to your "pyranha" project in a community namespace. The
  developer should therefore publish it as
  "pyranhacontrib.giantteeth".

It is the simplest way to `Organize community contributions`_.
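
The pattern is simple enough to be built and recognized by code. The
two helpers below are hypothetical names introduced for this sketch.

.. code:: python

   def contrib_name(main_project, project):
       """Build a name following the {MAINPROJECT}contrib.{PROJECT} pattern."""
       return "{0}contrib.{1}".format(main_project, project)


   def main_project_of(name):
       """Return the main project of a contribution name, or None."""
       namespace, dot, rest = name.partition(".")
       is_contrib = namespace.endswith("contrib") and namespace != "contrib"
       if dot and rest and is_contrib:
           return namespace[: -len("contrib")]
       return None


   print(contrib_name("pyranha", "giantteeth"))
   # pyranhacontrib.giantteeth
   print(main_project_of("pyranhacontrib.giantteeth"))
   # pyranha
   print(main_project_of("collective.recaptcha"))
   # None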

.. note::

   Why ``{MAINPROJECT}contrib.*`` pattern?

   * ``{MAINPROJECT}c.*`` is not explicit enough. As examples, "zc"
     belongs to "Zope Corporation" whereas "z3c" belongs to "Zope 3
     community".

   * ``{MAINPROJECT}community`` is too long.

   * ``{MAINPROJECT}community`` conflicts with existing namespaces
     such as "iccommunity" or "PyCommunity".

   * ``{MAINPROJECT}.contrib.*`` is inside the ``{MAINPROJECT}``
     namespace, i.e. it is owned by ``{MAINPROJECT}`` authors.

   * ``{MAINPROJECT}.contrib.*`` breaks the `Avoid deep nesting`_
     rule.

   * names where ``{MAINPROJECT}`` does not appear are not explicit
     enough, i.e. nobody can guess they are related to
     ``{MAINPROJECT}``. As an example, it is not obvious that
     "collective.*" belongs to Plone community.

   * ``{MAINPROJECT}contrib.*`` looks like existing
     ``sphinxcontrib-*`` packages. But ``sphinxcontrib-*`` is actually
     about Sphinx contrib, so this is not a real conflict... In fact,
     the "contrib" suffix was inspired by "sphinxcontrib".


Organize community contributions
================================

This is the counterpart of the `follow community conventions
<#follow-community-or-related-project-conventions-if-any>`_ and
`standard pattern for contributions
<#use-standard-pattern-for-community-contributions>`_ rules.

Actions:

* Choose a naming convention for community contributions.

* If it is not `the default
  <#use-standard-pattern-for-community-contributions>`_, then
  document it.

  * if you use the `default convention
    <#use-standard-pattern-for-community-contributions>`_, then this
    document should be enough. Don't repeat it. You may reference
    it.

  * else, tell users about custom conventions in project's
    "contribute" or "create modules" documentation.

* Also recommend the use of additional metadata, such as
  `classifiers and keywords <#use-packaging-metadata>`_.

Example: "pyranha" is your project name and package name.
Tell contributors that:

* pyranha-related distributions should use the "pyranha" keyword

* pyranha-related distributions providing templates should also use
  "templates" keyword.

* community contributions should be released under "pyranhacontrib"
  namespace (i.e. use "pyranhacontrib.*" pattern).


Register names with PyPI
========================

`PyPI`_ is the central place for distributions in the Python
community. So, it is also the place to register project and package
names.

See `Registering with the Package Index`_ for details.


Check for name availability
===========================

Make sure the project name has not already been registered on `PyPI`_.

.. note::

   `PyPI`_ is the only official place to register names.

Also make sure the names of distributed packages or modules have not
already been registered:

* in the `Python Standard Library`_.
* inside projects at `PyPI`_.

.. note::

   The `use a single name`_ rule helps you avoid clashes with package
   names: if a project name is available, then the package name has
   good chances to be available too.
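
The standard library part of this check can be sketched as follows.
It assumes Python 3.10 or later, where ``sys.stdlib_module_names`` is
available; on older interpreters the sketch falls back to the much
shorter ``sys.builtin_module_names``. Checking `PyPI`_ itself requires
querying the index and is not shown. ``clashes_with_stdlib`` is a
hypothetical helper.

.. code:: python

   import sys


   def clashes_with_stdlib(name):
       """True if the top-level part of ``name`` is a standard library module."""
       top_level = name.split(".")[0]
       # sys.stdlib_module_names exists on Python 3.10+; older interpreters
       # fall back to the (much shorter) tuple of built-in modules.
       known = getattr(sys, "stdlib_module_names", sys.builtin_module_names)
       return top_level in known


   print(clashes_with_stdlib("sys"))             # True
   print(clashes_with_stdlib("kheops.history"))  # False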


How to rename a project?
========================

Renaming a project is possible, but it should be done with care.
Pay particular attention to README and documentation, so that users
understand what happened.

#. First of all, **do not remove legacy distributions from PyPI**,
   because some users may still be using them.

#. Copy the legacy project, then change names (project and
   package/module). Pay attention to, at least:

   * packaging files,
   * folder name that contains source files,
   * documentation, including README,
   * import statements in code.

#. Assign ``obsoleted_by`` metadata to new distribution in setup.cfg
   file. See `PEP 426 about obsoleted_by`_ and `setup.cfg
   specification`_.

#. Release a new version of the renamed project, then publish it.

#. Edit legacy project:

   * add dependency to new project,
   * drop everything except packaging stuff,
   * add the ``Development Status :: 7 - Inactive`` classifier in
     setup script,
   * publish a new release.

So, users of the legacy package:

* can continue using the legacy distributions at a deprecated
  version,
* can upgrade to last version of legacy distribution, which is
  empty...
* ... and automatically download new distribution as a dependency of
  the legacy one.

Users who discover the legacy project see it is inactive.
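
The final, empty release of the legacy project could look like the
sketch below. It is only an illustration under stated assumptions: it
uses setuptools keyword arguments (``install_requires``,
``classifiers``) rather than the ``setup.cfg`` and ``obsoleted_by``
mechanism mentioned in the steps above, and the names "KheopsPyramid"
and "kheops" are borrowed from the earlier example as a hypothetical
rename. ``legacy_shim_metadata`` is a helper invented for this sketch.

.. code:: python

   def legacy_shim_metadata(legacy_name, new_name, version):
       """Build setup() keyword arguments for the final, empty legacy release."""
       summary = "Renamed to {0}; this release is empty.".format(new_name)
       return {
           "name": legacy_name,
           "version": version,
           "description": summary,
           # Installing the legacy project pulls in the new one.
           "install_requires": [new_name],
           # Nothing is distributed except packaging files.
           "packages": [],
           "classifiers": ["Development Status :: 7 - Inactive"],
       }


   metadata = legacy_shim_metadata("KheopsPyramid", "kheops", "2.0")
   print(metadata["install_requires"])
   # ['kheops']

In a setup script, and assuming setuptools is installed, the result
would be passed on as ``setuptools.setup(**metadata)``.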


How to apply naming guidelines on existing projects?
====================================================

**There is no obligation for existing projects to be renamed**. The
choice is left to project authors and maintainers for obvious
reasons.

However, project authors are invited to:

* at least, document the current naming, i.e., if necessary:

  * explain in the documentation why the project has this name,
  * `document specific naming policy`_.

* optionally, `rename existing project or distributed
  packages/modules <#how-to-rename-a-project>`_.

Projects that are meant to receive contributions from community
should also `organize community contributions`_.


References
==========

Additional background:

* `Martin Aspeli's article about names`_. Some parts of this document
  are quotes from this article.

* `in development official packaging documentation`_.

* `The Hitchhiker's Guide to Packaging`_, which has an empty
  placeholder for "naming specification".

References and footnotes:

.. _`packaging terminology in Python documentation`:
   http://docs.python.org/dev/packaging/introduction.html#general-python-terminology
.. _`PEP 8`:
   http://www.python.org/dev/peps/pep-0008/#package-and-module-names
.. _`PEP 426`: http://www.python.org/dev/peps/pep-0426/
.. _`PEP 420`: http://www.python.org/dev/peps/pep-0420/
.. _`PEP 3108`: http://www.python.org/dev/peps/pep-3108/
.. _`Python community`: http://www.python.org/community/
.. _`gp.fileupload`: http://pypi.python.org/pypi/gp.fileupload/
.. _`zest.releaser`: http://pypi.python.org/pypi/zest.releaser/
.. _`django`: http://djangoproject.com/
.. _`flask`: http://flask.pocoo.org/
.. _`sphinx`: http://sphinx.pocoo.org
.. _`pypi`: http://pypi.python.org
.. _`collective.recaptcha`:
   http://pypi.python.org/pypi/collective.recaptcha/
.. _`pipeline`: http://pypi.python.org/pypi/pipeline/ 
.. _`python-pipeline`: http://pypi.python.org/pypi/python-pipeline/
.. _`django-pipeline`: http://pypi.python.org/pypi/django-pipeline/
.. _`setup script reference`:
   http://docs.python.org/dev/packaging/setupscript.html
.. _`setuptools`: http://pypi.python.org/pypi/setuptools
.. _`distribute`: http://packages.python.org/distribute/
.. _`Pillow`: http://pypi.python.org/pypi/Pillow/
.. _`PIL`: http://pypi.python.org/pypi/PIL/
.. _`celery`: http://pypi.python.org/pypi/celery/
.. _`RabbitMQ`: http://www.rabbitmq.com
.. _`DateUtils`: http://pypi.python.org/pypi/DateUtils/
.. _`Framework :: Twisted`:
   http://pypi.python.org/pypi?:action=browse&show=all&c=525
.. _`The Zen of Python`: http://www.python.org/dev/peps/pep-0020/
.. _`Plone community`: http://plone.org/community/develop
.. _`Registering with the Package Index`:
   http://docs.python.org/dev/packaging/packageindex.html
.. _`Python Standard Library`:
   http://docs.python.org/library/index.html
.. _`PEP 426 about obsoleted_by`:
   http://www.python.org/dev/peps/pep-0426/#obsoleted-by
.. _`setup.cfg specification`:
   http://docs.python.org/dev/packaging/setupcfg.html
.. _`Martin Aspeli's article about names`:
   http://www.martinaspeli.net/articles/the-naming-of-things-package-names-and-namespaces
.. _`in development official packaging documentation`:
   http://docs.python.org/dev/packaging/
.. _`The Hitchhiker's Guide to Packaging`:
   http://guide.python-distribute.org/specification.html#naming-specification


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

