How to guard against bugs like this one?

Steven D'Aprano steven at REMOVE.THIS.cybersource.com.au
Mon Feb 1 22:28:55 EST 2010


On Tue, 02 Feb 2010 02:34:07 +0000, kj wrote:

> I just spent about 1-1/2 hours tracking down a bug.
> 
> An innocuous little script, let's call it buggy.py, only 10 lines long,
> and whose output should have been, at most, two lines, was quickly
> dumping tens of megabytes of non-printable characters to my screen (aka
> gobbledygook), and in the process was messing up my terminal *royally*. 
> Here's buggy.py:
[...]
> It turns out that buggy.py imports psycopg2, as you can see, and
> apparently psycopg2 (or something imported by psycopg2) tries to import
> some standard Python module called numbers; instead it ends up importing
> the innocent myscript/numbers.py, resulting in *absolute mayhem*.


There is no module numbers in the standard library, at least not in 2.5. 

>>> import numbers
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named numbers

Either it is specific to psycopg2, or you are running 2.6 or later, where 
a numbers module was added to the standard library.

If it does belong to psycopg2, I would think this is a problem with 
psycopg2 -- it sounds like it should be written as a package, but instead 
is written as a bunch of loose modules. I could be wrong of course, but if 
it is just a collection of modules, I'd definitely call that a poor design 
decision, if not a bug.
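
When an import behaves strangely, a quick first check is to ask Python 
which file it actually loaded. The following is a minimal sketch, assuming 
Python 3; module_origin is a hypothetical helper name, not something from 
psycopg2 or the standard library.

```python
import importlib


def module_origin(name):
    """Import `name` and return the file it was loaded from.

    Built-in modules such as sys have no __file__, so None is returned.
    """
    module = importlib.import_module(name)
    return getattr(module, "__file__", None)


# If this prints a path inside your own scripts directory rather than the
# Python installation, the standard-library module is being shadowed.
print(module_origin("numbers"))
```

Running this from the directory that holds the suspect scripts shows 
whether a local numbers.py wins over the one shipped with Python.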


> (This is no mere Python "wart"; this is a suppurating chancre, and the
> fact that it remains unfixed is a neverending source of puzzlement for
> me.)

No, it's a wart. There's no doubt it bites people occasionally, but I've 
been programming in Python for about ten years and I've never been bitten 
by this yet. I'm sure it will happen some day, but not yet.

In this case, the severity of the bug (megabytes of binary crud to the 
screen) is not related to the cause of the bug (shadowing a module).

As for fixing it, unfortunately it's not quite so simple to fix without 
breaking backwards-compatibility. The opportunity to do so for Python 3.0 
was missed. Oh well, life goes on.


> How can the average Python programmer guard against this sort of
> time-devouring bug in the future (while remaining a Python programmer)?
> The only solution I can think of is to avoid like the plague the
> basenames of all the 200 or so /usr/lib/pythonX.XX/xyz.py{,c} files, and
> *pray* that whatever name one chooses for one's script does not suddenly
> pop up in the appropriate /usr/lib/pythonX.XX directory of a future
> release.

Unfortunately, Python makes no guarantee that there won't be some clash 
between modules. You can minimize the risks by using packages, e.g. given 
a package spam containing modules a, b, c, and d, if you refer to spam.a 
etc. then the names a, b, c, and d can no longer clash with other 
top-level modules; only spam can. So you've cut your risk profile from 
four potential clashes to only one.
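
To illustrate the package approach, here is a rough sketch, assuming 
Python 3. It builds a throwaway package named spam in a temporary 
directory; the file contents and the MARKER name are made up for the 
demo. The package contains a module called numbers, yet a plain 
"import numbers" still finds the standard-library one.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package: <root>/spam/__init__.py and <root>/spam/numbers.py
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "spam")
os.mkdir(pkg_dir)

# An empty __init__.py marks the directory as a regular package.
open(os.path.join(pkg_dir, "__init__.py"), "w").close()

# A module whose basename collides with a standard-library module.
with open(os.path.join(pkg_dir, "numbers.py"), "w") as f:
    f.write("MARKER = 'spam.numbers'\n")

# Only the parent directory goes on sys.path, never the package directory.
sys.path.insert(0, root)
importlib.invalidate_caches()

import numbers        # the standard-library module
import spam.numbers   # the package's module, reachable only as spam.numbers

print(hasattr(numbers, "Number"))
print(spam.numbers.MARKER)
```

The protection only holds while the package directory itself stays off 
sys.path. Running a script that lives inside spam/ directly puts that 
directory first on the path, and the clash comes back.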

Also, generally most module clashes are far more obvious. If you do this:

import module
x = module.y

and module is shadowed by something else, you're *much* more likely to 
get an AttributeError than megabytes of crud to the screen.
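
A small sketch of that more typical failure, assuming Python 3. It drops 
an empty colorsys.py into a temporary directory and puts that directory 
first on sys.path; colorsys is chosen only because it is a small 
standard-library module that is unlikely to be loaded already.

```python
import importlib
import os
import sys
import tempfile

# A throwaway directory holding a script that shadows a stdlib module.
shadow_dir = tempfile.mkdtemp()
with open(os.path.join(shadow_dir, "colorsys.py"), "w") as f:
    f.write("# unrelated one-off script; defines nothing useful\n")

sys.path.insert(0, shadow_dir)
sys.modules.pop("colorsys", None)   # forget any copy imported earlier
importlib.invalidate_caches()

import colorsys   # picks up the shadowing file, not the real module

error = None
try:
    convert = colorsys.rgb_to_hsv   # exists in the real module only
except AttributeError as exc:
    error = exc
    print("AttributeError:", exc)

# Undo the damage so later imports see the real module again.
sys.path.remove(shadow_dir)
sys.modules.pop("colorsys", None)
```

The failure is immediate and names the missing attribute, which usually 
points straight at the wrong file.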

I'm sorry that you got bitten so hard by this, but in practice it's 
uncommon, and relatively mild when it happens.


> What else can one do?  Let's see, one should put every script in its own
> directory, thereby containing the damage.

That's probably a bit extreme, but your situation:

"Both scripts live in a directory filled with *hundreds* little
one-off scripts like the two of them."

is far too chaotic for my liking. You don't need to go to the extreme of 
a separate directory for each file, but you can certainly tidy things up 
a bit. For example, anything that's obsolete should be moved out of the 
way where it can't be accidentally executed or imported.




-- 
Steven


