organizing your scripts, with plenty of re-use

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Sat Oct 3 04:28:59 EDT 2009


On Fri, 02 Oct 2009 18:14:44 -0700, bukzor wrote:

> I would assume that putting scripts into a folder with the aim of re-
> using pieces of them would be called a package, 

A package is a special arrangement of folder + modules. To be a package, 
there must be a file called __init__.py in the folder, e.g.:

parrot/
+-- __init__.py
+-- feeding/
    +-- __init__.py
    +-- eating.py
    +-- drinking.py
+-- fighting.py
+-- flying.py
+-- sleeping.py
+-- talking.py


This defines a package called "parrot" which includes a sub-package 
feeding and modules fighting, flying, sleeping and talking. You can use 
it by any variant of the following:

import parrot  # loads parrot/__init__.py
import parrot.talking  # loads parrot/talking.py
from parrot import sleeping
import parrot.feeding
from parrot.feeding.eating import eat_cracker

and similar.

Common (but not compulsory) behaviour is for parrot/__init__.py to import 
all the modules within the package, so that the caller can do this:

import parrot
parrot.feeding.give_cracker()

without requiring to manually import sub-packages. The os module behaves 
similarly: having imported os, you can immediately use functions in 
os.path without an additional import.

Just dumping a bunch of modules in a folder doesn't make it a package, it 
just makes it a bunch of modules in a folder. Unless that folder is in 
the PYTHONPATH, you won't be able to import the modules because Python 
doesn't look inside folders. The one exception is that it will look 
inside a folder for a __init__.py file, and if it finds one, it will 
treat that folder and its contents as a package.


> but since this is an
> "anti-pattern" according to Guido, apparently I'm wrong-headed here.
> (Reference:
> http://mail.python.org/pipermail/python-3000/2007-April/006793.html )

Guido's exact words were:

"The only use case seems to be running scripts that happen
to be living inside a module's directory, which I've always seen as an
antipattern."

I'm not sure precisely what he means by that, because modules don't have 
directories, they are in directories. Perhaps he meant package.

In that case, the anti-pattern according to Guido is not to put modules 
in a folder, but to have modules inside a package be executable scripts. 
To use the above example, if the user can make the following call from 
the shell:

$ python ./parrot/talking.py "polly want a cracker"

and have the module talking do something sensible, that's an anti-
pattern. Modules inside a package aren't intended to be executable 
scripts called by the user. There should be one (or more) front-end 
scripts which are called by the user. Since they aren't intended to be 
imported, they can be anywhere, not just on the PYTHONPATH. But they 
import the modules in the package, and that package *is* in the 
PYTHONPATH.

Using the above example, you would install the parrot folder and its 
contents somewhere on the PYTHONPATH, and then have a front-end script 
(say) "talk-to-parrot" somewhere else. Notice that the script doesn't 
even need to be a legal name for a module, since you're never importing 
it.



> Say you have ~50 scripts or so with lots of re-use (importing from each
> other a lot) and you want to organize them into folders. How do you do
> this simply?

Of course you can have a flat hierarchy: one big folder, like the 
standard library, with a mixed back of very loosely connected modules:

eating.py
drinking.py
feeding.py
fighting.py
flying.py
parrot.py
sleeping.py
talking.py


You can do that, of course, but it's a bit messy -- what if somebody 
installs parrot.py and eating.py, but not drinking.py, and as a 
consequence parrot.py fails to work correctly? Or what if the user 
already has a completely unrelated module talking.py? Chaos.

The std library can get away with dumping (nearly) everything in the one 
directory, because it's managed chaos. Users aren't supposed to pick and 
choose which bits of the standard library get installed, or install other 
modules in the same location.

Three alternatives are:

* put your modules in a sub-folder, and tell the user to change the 
Python path to include your sub-folder, and hope they know what you're 
talking about;

* put your modules in a package, tell the user to just place the entire 
package directory where they normally install Python code, and importing 
will just work; or

* have each and every script manually manipulate the PYTHONPATH so that 
when the user calls that script, it adds its parent folder to the 
PYTHONPATH before importing what it needs. Messy and ugly.

 

-- 
Steven



More information about the Python-list mailing list