[Webware-discuss] Duplicate modules problem

Terrel Shumway tshumway at transafari.com
Fri Feb 23 11:29:03 EST 2001


Chuck Esterbrook wrote:

> At 10:05 PM 2/22/2001 -0500, Terrel Shumway wrote:
> > > - Does anyone know why it would be bad for Python to track modules by
> > > absolute path?
> >
> >It would make it more difficult to move modules around. (Java demonstrates
> >this.)  Tool support would help: it is easy to search for import statements.
>
> I don't follow that at all. This is strictly a run time phenomena. Who
> wants to move modules once the program starts?

No, this is not a runtime problem, it is a refactoring-time problem.  In Java,
just try renaming a package and see how long it takes you to get a clean compile
without using a tool like WoodenChair.
The people who designed Python's hierarchical package system wanted people to be
able to gather modules and packages into packages without breaking them.  The
package relative import is a fairly clean solution to this fairly nasty problem.
Tool support would be helpful for external code that uses the moved modules.
Consider this very very simple scenario:

ham.py
---
class Ham:...

eggs.py
---
class Eggs...
class Dozen...
spam.py
---
from ham import Ham
from eggs import EggMixin
class Spam(Ham,EggMixin):...
class Can...

spam_eater.py
---------
from spam import Spam
import eggs

factory = Spam()
breakfast = factory.getCan()
breakfast.eat()
lunch = eggs.Dozen()
lunch.eat()
supper = Eggs.leftovers((breakfast,lunch))
supper.eat("Yum,yum!")


spam_eater is an application that uses ham, eggs, and spam as library modules.
Suppose now that we wanted to serve Spam and Eggs over HTTP.  We could just dump
all of these modules into the same directory with Request,Response,Servlet,
et.al., (or, equivalently, add the directory to sys.path) but that would not be
clean.  Lets put them together into a package: FoodKit.

FoodKit
    +- __init__.py
    +- spam.py
    +- ham.py
    +- eggs.py

bin
    +- spam_eater.py

Now in our web server we can say:
import FoodKit
serveit = FoodKit.spam.Can()

spam,eggs,and ham will work without modification. spam_eater would too, if we had
put it in the package, but we decided that we shouldn't mix applications and
libraries.
Since FoodKit is not in sys.path (If it were, we would get the nasty duplicate
module problems that started this thread.)  spam_eater gets an ImportError because
it cannot find spam or eggs.  The solution is to add the package name, just as the
web server does.

from FoodKit.spam import Spam
from FoodKit import eggs

Finding and fixing all of these broken import statements is what tools can do.
Note that the problem would be much larger without the package relative imports.
(cf. Java)  Also note that this example is extremely simple.  A more realistic
scenario would probably involve multiple packages and dozens of modules. Without
tool support, this type of refactoring tends to get put off until it is really
nasty.

> Also, I'm not really sure if I want to replace '' in sys.path. I thought
> that was there as a relative path to the current module (not the current
> directory) and therefore helped a given file Foo.py say "import Bar from
> Bar" where Bar.py resided in the same package. Particularly if you didn't
> even import Foo directly.  Am I wrong about that?

No, the package relative import works without "" in sys.path.

> As Geoff pointed out, we most likely encountered this problem because we're
> running a program out of a package, which is uncommon. I tried Geoff's fix
> on the example code I wrote and it worked like a charm. I'll try it on
> Webware next.

A program (script) running within a package, if it tries to use the package
relative imports, is broken.  __main__ is in the default package.  Trying to use
relative imports from __main__ is breaking the package encapsulation.

> If the fix pans out for Webware, that will be great, but unless there is a
> concrete advantage to tracking packages by relative path,

There is a concrete advantage: a library module does not need to know where it is
in the package hierarchy. Renaming the package does not require any code changes,
as it does for example in Java, which uses absolute names.  Only external clients
need code changes.

> I still recommend
> a change in Python to track modules by absolute path. That would eliminate
> accidently getting 2 distinct copies of the same module for any Python program.

Python does track modules by *filename*. If all of the names in sys.path are
absolute, then you can never get a filename that is not absolute.

-- Terrel





More information about the Python-list mailing list