[Python-checkins] CVS: python/nondist/peps pep-0008.txt,NONE,1.1

Barry Warsaw bwarsaw@users.sourceforge.net
Thu, 05 Jul 2001 11:56:14 -0700


Update of /cvsroot/python/python/nondist/peps
In directory usw-pr-cvs1:/tmp/cvs-serv31982

Added Files:
	pep-0008.txt 
Log Message:
Guido's famous Python Style Guide essay, converted to PEP format,
spellchecked and mildly edited.  It's still as incomplete as the
former.


--- NEW FILE: pep-0008.txt ---
PEP: 8
Title: Style Guide for Python Code
Version: $Revision: 1.1 $
Author: guido@python.org (Guido van Rossum),
    barry@digicool.com (Barry Warsaw)
Status: Active
Type: Informational
Created: 05-Jul-2001
Post-History:


Introduction

    This document gives coding conventions for the Python code
    comprising the standard library for the main Python distribution.
    Please see the companion informational PEP describing style
    guidelines for the C code in the C implementation of Python[1].

    Note, rules are there to be broken.  Two good reasons to break a
    particular rule:

    (1) When applying the rule would make the code less readable, even
        for someone who is used to reading code that follows the rules.

    (2) To be consistent with surrounding code that also breaks it
        (maybe for historic reasons) -- although this is also an
        opportunity to clean up someone else's mess (in true XP style).

    This document was adapted from Guido's original Python Style
    Guide essay[2].  This PEP inherits that essay's incompleteness.


Code lay-out

  Indentation

    Use the default of Emacs' Python-mode: 4 spaces for one
    indentation level.  For really old code that you don't want to
    mess up, you can continue to use 8-space tabs.  Emacs Python-mode
    auto-detects the prevailing indentation level used in a file and
    sets its indentation parameters accordingly.

  Tabs or Spaces?

    Never mix tabs and spaces.  The most popular way of indenting
    Python is with spaces only.  The second-most popular way is with
    tabs only.  Code indented with a mixture of tabs and spaces should
    be converted to using spaces exclusively.  (In Emacs, select the
    whole buffer and hit ESC-x untabify.)  When invoking the python
    command line interpreter with the -t option, it issues warnings
    about code that illegally mixes tabs and spaces.  When using -tt
    these warnings become errors.  These options are highly
    recommended!
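
    A simplified sketch of a mixed-indentation check follows.  It is
    not a reimplementation of the interpreter's -t option; it only
    flags lines whose leading whitespace contains both a tab and a
    space.  The function name mixed_indent_lines and the sample text
    are made up for illustration.

```python
def mixed_indent_lines(source):
    """Return 1-based numbers of lines that mix tabs and spaces."""
    flagged = []
    for number, line in enumerate(source.splitlines(), 1):
        # Leading whitespace is whatever precedes the first other character.
        indent = line[:len(line) - len(line.lstrip(" \t"))]
        if " " in indent and "\t" in indent:
            flagged.append(number)
    return flagged


# Line 3 is indented with four spaces followed by a tab.
sample = "if x:\n\ty = 1\n    \tz = 2\n        w = 3\n"
print(mixed_indent_lines(sample))
```

    Lines indented with tabs only or spaces only are left alone, so
    the check can be run before converting a file to spaces.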

  Maximum Line Length

    There are still many devices around that are limited to 80
    character lines.  The default wrapping on such devices looks ugly.
    Therefore, please limit all lines to a maximum of 79 characters
    (Emacs wraps lines that are exactly 80 characters long).
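
    The limit can be checked mechanically.  The following is a
    minimal sketch; the names MAX_LENGTH and overlong_lines are
    hypothetical and not part of any standard tool.

```python
MAX_LENGTH = 79  # the limit recommended in the text


def overlong_lines(source, limit=MAX_LENGTH):
    """Return (line number, length) pairs for lines longer than limit."""
    result = []
    for number, line in enumerate(source.splitlines(), 1):
        if len(line) > limit:
            result.append((number, len(line)))
    return result


# Three lines of 10, 79 and 80 characters; only the last is too long.
sample = "short line\n" + "x" * 79 + "\n" + "y" * 80 + "\n"
print(overlong_lines(sample))
```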

    The preferred way of wrapping long lines is by using Python's
    implied line continuation inside parentheses, brackets and braces.
    If necessary, you can add an extra pair of parentheses around an
    expression, but sometimes using a backslash looks better.  Make
    sure to indent the continued line appropriately.  Emacs
    Python-mode does this right.  Some examples:

    class Rectangle(Blob):

        def __init__(self, width, height,
                     color='black', emphasis=None, highlight=0):
            if width == 0 and height == 0 and \
               color == 'red' and emphasis == 'strong' or \
               highlight > 100:
                raise ValueError, "sorry, you lose"
            if width == 0 and height == 0 and (color == 'red' or
                                               emphasis is None):
                raise ValueError, "I don't think so"
            Blob.__init__(self, width, height,
                          color, emphasis, highlight)

  Blank Lines

    Separate top-level function and class definitions with two blank
    lines.  Method definitions inside a class are separated by a
    single blank line.  Extra blank lines may be used (sparingly) to
    separate groups of related functions.  Blank lines may be omitted
    between a bunch of related one-liners (e.g. a set of dummy
    implementations).

    When blank lines are used to separate method definitions, there is
    also a blank line between the `class' line and the first method
    definition.

    Use blank lines in functions, sparingly, to indicate logical
    sections.

    Python accepts the control-L (i.e. ^L) form feed character as
    whitespace; Emacs (and some printing facilities) treat these
    characters as page separators, so you may use them to separate
    pages of related sections of your file.
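
    The spacing rules in this section can be sketched as follows.
    The functions and the Rectangle class are invented purely to
    show the layout: two blank lines between top-level definitions,
    one blank line between methods (and after the `class' line), and
    a single blank line inside a function to mark a logical section.

```python
def area(width, height):
    return width * height


def perimeter(width, height):
    return 2 * (width + height)


class Rectangle:

    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        # Inside a method, the bare name refers to the module-level function.
        return area(self.width, self.height)

    def describe(self):
        size = self.area()

        if size == 0:
            return "empty"
        return "%d square units" % size
```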


Whitespace in Expressions and Statements

  Pet Peeves

    Guido hates whitespace in the following places:

    - Immediately inside parentheses, brackets or braces, as in:
      "spam( ham[ 1 ], { eggs: 2 } )".  Always write this as
      "spam(ham[1], {eggs: 2})".

    - Immediately before a comma, semicolon, or colon, as in:
      "if x == 4 : print x , y ; x , y = y , x".  Always write this as
      "if x == 4: print x, y; x, y = y, x".

    - Immediately before the open parenthesis that starts the argument
      list of a function call, as in "spam (1)".  Always write
      this as "spam(1)".

    - Immediately before the open parenthesis that starts an indexing or
      slicing, as in: "dict ['key'] = list [index]".  Always
      write this as "dict['key'] = list[index]".

    - More than one space around an assignment (or other) operator to
      align it with another, as in:

          x             = 1
          y             = 2
          long_variable = 3

      Always write this as

          x = 1
          y = 2
          long_variable = 3

    (Don't bother to argue with him on any of the above -- Guido's
    grown accustomed to this style over 15 years.)
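
    Put together in a runnable fragment, the preferred forms look
    roughly like this.  The names spam, ham and eggs follow the
    examples above; the values are arbitrary.

```python
def spam(ham, eggs):
    return ham[1] + eggs["count"]


ham = [10, 20, 30]
eggs = {"count": 2}

# No space just inside brackets, before "(" or "[", or before "," and ":".
total = spam(ham, eggs)
first = ham[0]
x, y = 1, 2
if total == 22:
    x, y = y, x

# One space around "=", with no extra padding for alignment.
short = 1
long_variable = 3
```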


  Other Recommendations

    - Always surround these binary operators with a single space on
      either side: assignment (=), comparisons (==, <, >, !=, <>, <=,
      >=, in, not in, is, is not), Booleans (and, or, not).

    - Use your better judgment for the insertion of spaces around
      arithmetic operators.  Always be consistent about whitespace on
      either side of a binary operator.  Some examples:

          i = i+1
          submitted = submitted + 1
          x = x*2 - 1
          hypot2 = x*x + y*y
          c = (a+b) * (a-b)
          c = (a + b) * (a - b)

    - Don't use spaces around the '=' sign when used to indicate a
      keyword argument or a default parameter value.  For instance:

          def complex(real, imag=0.0):
              return magic(r=real, i=imag)


Comments

    Comments that contradict the code are worse than no comments.
    Always make a priority of keeping the comments up-to-date when the
    code changes!

    If a comment is a phrase or sentence, its first word should be
    capitalized, unless it is an identifier that begins with a lower
    case letter (never alter the case of identifiers!).

    If a comment is short, the period at the end is best omitted.
    Block comments generally consist of one or more paragraphs built
    out of complete sentences, and each sentence should end in a
    period.

    You can use two spaces after a sentence-ending period.

    As always when writing English, Strunk and White apply.

    Python coders from non-English speaking countries: please write
    your comments in English, unless you are 120% sure that the code
    will never be read by people who don't speak your language.


  Block Comments

    Block comments generally apply to some (or all) code that follows
    them, and are indented to the same level as that code.  Each line
    of a block comment starts with a # and a single space (unless it
    is indented text inside the comment).  Paragraphs inside a block
    comment are separated by a line containing a single #.  Block
    comments are best surrounded by a blank line above and below them
    (or two lines above and a single line below for a block comment at
    the start of a new section of function definitions).
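
    A block comment laid out along these lines might look like the
    sketch below.  The function border_width is a made-up example;
    only the comment layout matters.

```python
def border_width(total, inner):
    remainder = total - inner

    # Split the remainder evenly between the two sides.  The border
    # is assumed to have the same width on the left and the right.
    #
    # Floor division keeps the result an integer; an odd remainder
    # loses one unit.

    return remainder // 2
```

    The comment is indented to the level of the code it describes,
    its two paragraphs are separated by a line holding a single #,
    and a blank line sets it off above and below.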

  Inline Comments

    An inline comment is a comment on the same line as a statement.
    Inline comments should be used sparingly.  Inline comments should
    be separated by at least two spaces from the statement.  They
    should start with a # and a single space.

    Inline comments are unnecessary and in fact distracting if they state
    the obvious.  Don't do this:

        x = x+1                 # Increment x

    But sometimes, this is useful:

        x = x+1                 # Compensate for border


Documentation Strings

    Conventions for writing good documentation strings
    (a.k.a. "docstrings") are immortalized in their own PEP[3].


Version Bookkeeping

    If you have to have RCS or CVS crud in your source file, do it as
    follows.

        __version__ = "$Revision: 1.1 $"
        # $Source: /cvsroot/python/python/nondist/peps/pep-0008.txt,v $

    These lines should be included after the module's docstring,
    before any other code, separated by a blank line above and
    below.


Naming Conventions

    The naming conventions of Python's library are a bit of a mess, so
    we'll never get this completely consistent -- nevertheless, here
    are some guidelines.

  Descriptive: Naming Styles

    There are a lot of different naming styles.  It helps to be able
    to recognize what naming style is being used, independently from
    what it is used for.

    The following naming styles are commonly distinguished:

    - x (single lowercase letter)

    - X (single uppercase letter)

    - lowercase

    - lower_case_with_underscores

    - UPPERCASE

    - UPPER_CASE_WITH_UNDERSCORES

    - CapitalizedWords (or CapWords)

    - mixedCase (differs from CapitalizedWords by initial lowercase
      character!)

    - Capitalized_Words_With_Underscores (ugly!)
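
    As a rough illustration, the styles above can be told apart
    with a few regular expressions.  The patterns and the function
    name naming_style are assumptions made for this sketch, not an
    official definition of each style; patterns are tried in order
    and the first match wins.

```python
import re

# (label, pattern) pairs, most specific first.
_STYLES = [
    ("single lowercase letter", r"[a-z]"),
    ("single uppercase letter", r"[A-Z]"),
    ("lowercase", r"[a-z][a-z0-9]*"),
    ("lower_case_with_underscores", r"[a-z][a-z0-9]*(_[a-z0-9]+)+"),
    ("UPPERCASE", r"[A-Z][A-Z0-9]*"),
    ("UPPER_CASE_WITH_UNDERSCORES", r"[A-Z][A-Z0-9]*(_[A-Z0-9]+)+"),
    # Starts uppercase, no underscores, at least one lowercase letter.
    ("CapitalizedWords", r"[A-Z][A-Za-z0-9]*[a-z][A-Za-z0-9]*"),
    # Starts lowercase, no underscores, at least one uppercase letter.
    ("mixedCase", r"[a-z][a-z0-9]*[A-Z][A-Za-z0-9]*"),
    ("Capitalized_Words_With_Underscores",
     r"[A-Z][a-z0-9]+(_[A-Z][a-z0-9]+)+"),
]


def naming_style(name):
    """Return the label of the first style whose pattern fits name."""
    for label, pattern in _STYLES:
        # \Z anchors the match at the end of the string.
        if re.match(pattern + r"\Z", name):
            return label
    return "unknown"


print(naming_style("lower_case_with_underscores"))
print(naming_style("StringIO"))
```

    Names with leading or trailing underscores fall through to
    "unknown" in this sketch.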

    There's also the style of using a short unique prefix to group
    related names together.  This is not used much in Python, but it
    is mentioned for completeness.  For example, the os.stat()
    function returns a tuple whose items traditionally have names like
    st_mode, st_size, st_mtime and so on.  The X11 library uses a
    leading X for all its public functions.  (In Python, this style is
    generally deemed unnecessary because attribute and method names
    are prefixed with an object, and function names are prefixed with
    a module name.)

    In addition, the following special forms using leading or trailing
    underscores are recognized (these can generally be combined with any
    case convention):

    - _single_leading_underscore: weak "internal use" indicator
      (e.g. "from M import *" does not import objects whose name
      starts with an underscore).

    - single_trailing_underscore_: used by convention to avoid
      conflicts with a Python keyword, e.g.
      "Tkinter.Toplevel(master, class_='ClassName')".

    - __double_leading_underscore: class-private names as of Python 1.4.

    - __double_leading_and_trailing_underscore__: "magic" objects or
      attributes that live in user-controlled namespaces,
      e.g. __init__, __import__ or __file__.  Sometimes these are
      defined by the user to trigger certain magic behavior
      (e.g. operator overloading); sometimes these are inserted by the
      infrastructure for its own use or for debugging purposes.  Since
      the infrastructure (loosely defined as the Python interpreter
      and the standard library) may decide to grow its list of magic
      attributes in future versions, user code should generally
      refrain from using this convention for its own use.  User code
      that aspires to become part of the infrastructure could combine
      this with a short prefix inside the underscores,
      e.g. __bobo_magic_attr__.
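
    The first two forms can be tried out with the sketch below.
    The module name "pep8_demo" and the function make_widget are
    invented for the example; the module is built in memory so that
    no file is needed.

```python
import sys
import types

# Build a throwaway module in memory (the name "pep8_demo" is made up).
demo = types.ModuleType("pep8_demo")
demo.public = 1
demo._internal = 2
sys.modules["pep8_demo"] = demo

# "from M import *" skips names that start with an underscore.
namespace = {}
exec("from pep8_demo import *", namespace)
print("public" in namespace, "_internal" in namespace)


# A trailing underscore avoids the clash with the keyword "class".
def make_widget(master, class_="Frame"):
    return (master, class_)
```

    The underscore-prefixed name is still reachable as
    demo._internal; the convention only keeps it out of the
    importing namespace.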

  Prescriptive: Naming Conventions

    Module Names

      Module names can be either CapWords or lowercase.  There is no
      unambiguous convention to decide which to use.  Modules that
      export a single class (or a number of closely related classes,
      plus some additional support) are often named in CapWords, with
      the module name being the same as the class name (e.g. the
      standard StringIO module).  Modules that export a bunch of
      functions are usually named in all lowercase.

      Since module names are mapped to file names, and some file
      systems are case insensitive and truncate long names, it is
      important that module names be chosen to be fairly short and not
      in conflict with other module names that only differ in the case
      -- this won't be a problem on Unix, but it may be a problem when
      the code is transported to Mac or Windows.

      There is an emerging convention that when an extension module
      written in C or C++ has an accompanying Python module that
      provides a higher level (e.g. more object oriented) interface,
      the Python module's name uses CapWords, while the C/C++ module is
      named in all lowercase and has a leading underscore
      (e.g. _socket).

      Python packages generally have a short all lowercase name.

    Class Names

      Almost without exception, class names use the CapWords
      convention.  Classes for internal use have a leading underscore
      in addition.

    Exception Names

      If a module defines a single exception raised for all sorts of
      conditions, it is generally called "error" or "Error".  It seems
      that built-in (extension) modules use "error" (e.g. os.error),
      while Python modules generally use "Error" (e.g. xdrlib.Error).
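
      A minimal sketch of the "Error" convention in a Python module
      follows.  The function parse_port and its messages are
      hypothetical; the point is that every failure surfaces as the
      module's single exception.

```python
class Error(Exception):
    """The single exception raised by this hypothetical module."""


def parse_port(text):
    """Convert text to a port number, raising Error on any problem."""
    try:
        port = int(text)
    except ValueError:
        raise Error("not a number: %r" % (text,))
    if not 0 < port < 65536:
        raise Error("port out of range: %d" % port)
    return port
```

      Callers can then write a single "except Error" clause instead
      of guessing which built-in exceptions might escape.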

    Function Names

      Plain functions exported by a module can either use the CapWords
      style or lowercase (or lower_case_with_underscores).  There is
      no strong preference, but it seems that the CapWords style is
      used for functions that provide major functionality
      (e.g. nstools.WorldOpen()), while lowercase is used more for
      "utility" functions (e.g. pathhack.kos_root()).

    Global Variable Names

      (Let's hope that these variables are meant for use inside one
      module only.)  The conventions are about the same as those for
      exported functions.  Modules that are designed for use via "from
      M import *" should prefix their globals (and internal functions
      and classes) with an underscore to prevent exporting them.

    Method Names

      The story is largely the same as for functions.  Use lowercase
      for methods accessed by other classes or functions that are part
      of the implementation of an object type.  Use one leading
      underscore for "internal" methods and instance variables when
      there is no chance of a conflict with subclass or superclass
      attributes or when a subclass might actually need access to
      them.  Use two leading underscores (class-private names,
      enforced by Python 1.4) in those cases where it is important
      that only the current class accesses an attribute.  (But realize
      that Python contains enough loopholes so that an insistent user
      could gain access nevertheless, e.g. via the __dict__ attribute.)
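
      The difference between one and two leading underscores, and
      the __dict__ loophole, can be seen in the sketch below.  The
      classes Base and Derived and their attributes are made up for
      the illustration.

```python
class Base:

    def __init__(self):
        self._shared = "visible to subclasses"
        self.__token = "base"

    def token(self):
        return self.__token


class Derived(Base):

    def __init__(self):
        Base.__init__(self)
        # Stored as _Derived__token, so Base's attribute is untouched.
        self.__token = "derived"


d = Derived()
print(d.token())
print(sorted(d.__dict__))
```

      Both private values live in the instance's __dict__ under
      their mangled names, which is exactly how an insistent user
      can still reach them.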


References

    [1] PEP 7, Style Guide for C Code, van Rossum

    [2] http://www.python.org/doc/essays/styleguide.html

    [3] PEP 257, Docstring Conventions, Goodger, van Rossum


Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: