[XML-SIG] [OT] Python Coding Standards
Tue, 4 Mar 2003 22:13:11 -0500
This came across Mitch Kapor's dev list. Looked pretty well done, thought I
would share it.
Content-Description: David Jeske <email@example.com>: [Dev] Python Coding Standards
From: David Jeske <firstname.lastname@example.org>
Subject: [Dev] Python Coding Standards
List-Id: Developer Discussions <dev.osafoundation.org>
Date: Tue, 4 Mar 2003 18:27:52 -0800
It took me a while to get someone still at Yahoo to find the original
document for me. Here it is, and I've added some comments about the
motivations behind the conventions. I apologize for its length, but
this is all really useful stuff.
> Python Coding Standards
> Authors: David W. Jeske, Brandon C. Long
> This document defines the eGroups Python coding conventions moving
> forward. It is based on previous standards, but clears up a few areas of
> * Don't use acronyms. Spell things out when possible. (ex.
> VoiceChatPage.py, not VCPage.py)
Without compile-time static errors, getting names right becomes much
more important. Therefore we used long-ish names and quasi-symbol
completion (available in the Python emacs mode) to help avoid using
the wrong class/variable/method names.
> * Global variables should begin with a "g", example: gUserServerPort
> * Constants should begin with a "k", example: kYes
> * Class names start with a capital letter and capitalize each word.
> example: FooBarCom
> * Don't create "double maintenance", see special section below!
This seems like just good programming practice, but it's much more
important in Python where forgetting to do step 2 of 5 in adding
something won't be a compile error.
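A minimal sketch of one way to avoid double maintenance, assuming a hypothetical set of page handlers (none of these names come from the eGroups code). The handler table is the only place to edit when a page is added, and the list of valid page names is derived from it instead of being maintained by hand:

```python
# Hypothetical page handlers; the names are illustrative only.
def showMessagePage():
    return "message page"


def showVoiceChatPage():
    return "voice chat page"


# The single place to edit when a page is added.
kPageHandlers = {
    "message": showMessagePage,
    "voicechat": showVoiceChatPage,
}


def validPageNames():
    # Derived from the table, so it cannot drift out of sync with it.
    return sorted(kPageHandlers)


def renderPage_(pagename):
    # An unknown name raises KeyError right here.
    return kPageHandlers[pagename]()
```

Adding a page is then one step (a new table entry), so there is no "step 2 of 5" to forget.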
> * Don't make extraneous protocols/classes. Do your homework. Find out if
> something similar already exists first. One consistent (sub-optimal)
> way is better than five perfect ways. Fewer classes is better.
> * Don't generalize too early. Code is easier to read if it only calls
> well-known functions or local-functions. If you have found only two
> unrelated parts of the system which want to use the same code snippet,
> it may be better to leave that code snippet in both places.
> * A code snippet should only be reused if it is in one of the
> "library" portions of the system. (i.e. don't import a web-page
> to reuse its code)
> * A code snippet should be in a library portion of the system only
> if either (a) it is used in many different places in the
> software, or (b) the behavior of the different places should
> DEFINITELY be coupled (i.e. when you change the shared code it
> should change in both places).
While another good general rule, because it's really really hard to
track dependencies in Python, this becomes much more important for
> * Don't put code in a class which does not need the class data. (i.e. it
> should be a function somewhere!)
> * Keep "public" APIs simple. Document them as such. Be sure to document
> the _intent_ of the API. If you don't, someone will read the code and
> make assumptions based on the current implementation.
> * If you change what a function's or method's intent is, CHANGE ITS NAME!
This turned out to be one of the most important rules. It turned into:
* If you must add required arguments to a function either (a)
change its name, or better still (b) make a new named
function and leave the old one.
Even with unit testing, it's really hard to track dependencies like
this. We would need code coverage tools and exhaustive tests, neither
of which we had.
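A small sketch of option (b), using hypothetical in-memory data in place of a real user server. The function names follow the naming convention of this document but are otherwise made up. The old function keeps its name and meaning, and the version with an extra required argument gets a new name:

```python
# Hypothetical data; a real implementation would query a server.
kSubscriptions = {"alice": ["python-dev", "xml-sig", "python-list"]}


def getSubsForUser_(username):
    # Old function, left in place: its name and intent are unchanged,
    # so every existing call site keeps working.
    return list(kSubscriptions[username])


def getSubsForUser_matchingPrefix_(username, prefix):
    # A new required argument means a new name, so no old call site can
    # end up calling this with the wrong arguments.
    return [name for name in getSubsForUser_(username)
            if name.startswith(prefix)]
```

Because nothing that called the old name changes behavior, there is no hidden dependency to hunt down.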
> * In the "new world order" we want one class per file inside a proper
> package directory.
We originally had lots of files in a single directory, with each file
itself being a module. However, one file is just not enough for
expanding functionality, so we moved to (roughly) one file per class,
and making different directories for different packages. We tried to
use the Python "package directory" support but it turned out to be
somewhat problematic to reorganize. Because there is no static
compile, hunting down all the places which referenced something from
its old location was not always easy.
In my more recent Python projects we use a more "c-like" model. We
have several directories for different functions, but we don't use the
"package naming"; instead we just put all the directories on the
module search path (Python's version of a classpath). This allows us to expand a single file module into many
files and eventually move it into its own directory without affecting
any existing code.
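A rough sketch of that layout, assuming the path in question is Python's module search path (sys.path, or the PYTHONPATH environment variable). The directory and module names are invented, and the module is written to a temporary directory only so the example is self-contained:

```python
import os
import sys
import tempfile

# Hypothetical layout: a directory holding plain top-level modules.
sourceRoot = tempfile.mkdtemp()
libraryDir = os.path.join(sourceRoot, "datalib")
os.mkdir(libraryDir)
with open(os.path.join(libraryDir, "datacluster.py"), "w") as moduleFile:
    moduleFile.write("def clusterName():\n    return 'main'\n")

# Put the directory on the search path instead of making it a package.
sys.path.append(libraryDir)

import datacluster  # found by name alone, wherever the file lives

print(datacluster.clusterName())
```

Callers only ever write "import datacluster", so the file can later move to another directory on the path without any call site changing.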
> * Lists should always be items of the same type.
> * Position information in lists should not matter.
This is really important. In fact, today I would add:
* Only use tuples locally. Do not pass them around.
Tracking dependencies between the places that use something and the
places that pack something is hard. This is the same fragility that
exists in lots of LISP software.
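A minimal sketch of the "tuples stay local" rule, with a hypothetical PodLocation class standing in for whatever would otherwise be passed around as a (hostname, port) tuple:

```python
class PodLocation:
    """Hypothetical record type; the fields are illustrative only."""

    def __init__(self, hostname, port):
        self.hostname = hostname
        self.port = port


def parseHostPort_(text):
    # The tuple returned by rpartition() never leaves this function.
    hostname, _separator, port = text.rpartition(":")
    return PodLocation(hostname, int(port))


location = parseHostPort_("pod7.example.com:8080")
print(location.hostname, location.port)
```

Callers depend on the attribute names, not on positions, so adding a field later does not break the code that unpacks the result.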
> * Avoid using dictionaries as structures. If you have a pre-defined set
> of attributes, use a class and access them like attributes. In other
> words, qualifiers for dictionaries should almost always be variables,
> not constants. In general, you should probably:
> Do this: an_object.name
> Not this: a_dict["name"]
> Do this: a_dict[a_key]
> Not this: value = getattr(an_object,a_key)
This also works better with command completion. Using
strings as symbols is evil.
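To make the distinction concrete, here is a short sketch with invented names: a class for a fixed set of attributes, and a dictionary only where the keys are values known at run time.

```python
class UserRecord:
    """Hypothetical structure with a fixed, pre-defined set of attributes."""

    def __init__(self, name, email):
        self.name = name
        self.email = email


# Fixed attributes: use a class and attribute access.
user = UserRecord("alice", "alice@example.com")

# Keys that are data: use a dictionary indexed by a variable.
messageCountByGroup = {}
for groupname in ["xml-sig", "python-dev", "xml-sig"]:
    messageCountByGroup[groupname] = messageCountByGroup.get(groupname, 0) + 1
```

A misspelled attribute such as user.nmae fails immediately with AttributeError, while a misspelled string key in a dictionary used as a structure can silently create a new entry on assignment.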
> * Private (most) instance variables should start with an underscore.
While this is worded badly, it's trying to say that most instance
variables should be private and all private variables should start
with an underscore. (i.e. when in doubt make it private)
> * All eGroups API methods should use the eGroups naming convention
> described below
> * Avoid accessing private instance data directly from outside an object
> * Methods (and functions) should always return items of the same type.
> (i.e. do not return a tuple in one case, and a number in another)
This is also really important. This means that we also did NOT ever
return "None" as a NULL pointer value for a failure case. It's just
too hard to track down these problems in Python, so we always throw
custom exceptions for failure cases.
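A sketch of what this looks like in practice, with a hypothetical exception class and lookup table. The function always returns a string or raises, and never returns None:

```python
class GroupNotFoundError(Exception):
    """Hypothetical custom exception for a failed group lookup."""


# Hypothetical data; a real implementation would query a server.
kPodByGroup = {"stevestest": "pod7"}


def getPodForGroup_(groupname):
    # Always returns a string or raises; there is no None failure value
    # for a caller to forget to check.
    if groupname not in kPodByGroup:
        raise GroupNotFoundError("no pod for group %r" % groupname)
    return kPodByGroup[groupname]
```

A caller that forgets to handle the failure gets a traceback at the lookup itself, not an AttributeError on None somewhere far away.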
> * Keyword arguments should always be optional. If the argument is
> required, make it a non-keyword argument, and put the argument name
> into the method name by using our method naming convention.
> * Always include the module and class name in an error declaration.
> * Avoid try/except blocks for a dictionary, use .has_key() or get(,)
> * ANY time you have an unqualified "except:" clause, you must report the
> error with the handleexception module. Like this:
> # do something
> import handleexception
handleexception was a standard module which grabbed a stack backtrace
and reported that the error occurred. This allowed us to track down
these "global exceptions" that really shouldn't happen. If something
happens a lot, it should get a custom exception handler.
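The handleexception API itself is not shown here, so the following is only a stand-in built on the standard traceback module. The function name reportException and its argument are assumptions, not the original interface:

```python
import sys
import traceback


def reportException(context):
    # Stand-in for the real reporting call, whose name is not given here.
    # Must be called from inside an "except" block.
    report = "%s\n%s" % (context, traceback.format_exc())
    sys.stderr.write(report)
    return report


def riskyWork():
    return 1 / 0


try:
    riskyWork()
except:  # unqualified on purpose: this is the case the rule covers
    lastReport = reportException("riskyWork failed")
```

The point is that the unqualified clause never swallows an error silently: the backtrace is always recorded somewhere a person can find it.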
We also built a really nice tool which cataloged all these exceptions
which occurred in the running code for any process in the website
service so we could browse through the real exception state and
I would also add:
* Code should be designed to fail earlier rather than later
* Classes should be as "stateless" as possible. To achieve
this a class should be usable immediately after construction,
and work functions should take optional arguments instead of
requiring you to call state-change methods.
We had some troublesome code conceptually similar to this:
foo = SomeClass()
As the complexity increases, it becomes hard to figure out what you
must set to make the class "usable", and it was hard to tell the
difference between state needed to make the class "useful" and
state needed for "doing the work". The code above should be:
foo = SomeClass(bar) # bar required to make the class "useful"
foo.doWork(otherState=baz) # baz is an optional work parameter
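A runnable sketch of the same shape, with a hypothetical ReportBuilder class in place of SomeClass. It also shows the "fail earlier" rule: a bad required argument is rejected in the constructor, not when the work method finally runs.

```python
class ReportBuilder:
    """Hypothetical class standing in for SomeClass."""

    def __init__(self, title):
        # Fail early: reject a bad required argument at construction.
        if not title:
            raise ValueError("ReportBuilder: title is required")
        self._title = title

    def buildReport(self, footer=""):
        # Optional work state arrives as a keyword argument, so no
        # state-change method has to be called first.
        return "\n".join(line for line in [self._title, footer] if line)


builder = ReportBuilder("Weekly")          # usable right after construction
plainReport = builder.buildReport()
fullReport = builder.buildReport(footer="end of report")
```

Each call to buildReport is independent of the previous one, which is what keeps the object close to stateless.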
The method naming convention below was used for most "public" parts of
our library api. Along with the simple emacs Python symbol completion,
this really helped make the APIs easy to use. It is sort of the
poor-man's dynamically typed version of Intellisense.
> Method Naming
> Public method names should:
> 1) start with a lower-case letter
> 2) upper-case the first letter of each word
> 3) include a description of the required arguments it takes, each
> followed by an underscore.
> Here are some examples:
> def charset_(self, o) ## takes one argument, the charset, in MessagePage.py
> def getPodForGroup_(self,groupname) ## takes one argument, the "Group", in DataCluster.py
> def getSubsForUser_(self,username) ## takes one argument, the "User", in userserv.py
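A sketch of how the convention might extend to two required arguments plus an optional one. Only the name getPodForGroup_ comes from the examples; the class body, the second method, and the data are assumptions made for illustration:

```python
class DataCluster:
    """Hypothetical sketch of the naming convention, not the real class."""

    def __init__(self):
        self._podByGroup = {"stevestest": "pod7"}

    def getPodForGroup_(self, groupname):
        # One required argument, so one trailing underscore.
        return self._podByGroup[groupname]

    def getPodForGroup_andUser_(self, groupname, username, verbose=False):
        # Two required arguments, both named in the method name. "verbose"
        # is optional, so it is a keyword argument and stays out of the name.
        pod = self.getPodForGroup_(groupname)
        if verbose:
            return "%s (for %s)" % (pod, username)
        return pod


cluster = DataCluster()
pod = cluster.getPodForGroup_andUser_("stevestest", "alice", verbose=True)
```

Counting the underscores in a method name tells the reader how many required arguments to pass, which is what makes symbol completion useful here.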
> This is inspired by the Objective-C "descriptive" method syntax:
> [object method: arg1 withName1: arg2 withName2: arg3];
> Where you get things like:
> [object getPodForGroup: "stevestest" andUser: "email@example.com"]
> Which is a hell of a lot more understandable than the C/C++ ish:
(i.e. when you can't look up the symbol easily)
> In the following Lisp-esque function, there is a weaker
> relationship between the name and the arguments:
> (i.e. the above could mean any of:
David Jeske (N9LCA) + http://www.chat.net/~jeske/ + firstname.lastname@example.org
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Open Source Applications Foundation "Dev" mailing list