How to import only one module in a package when the package's __init__.py already imports the modules?

Robert Kern robert.kern at gmail.com
Sat Oct 31 19:06:09 EDT 2009


On 2009-10-31 16:53 PM, Peng Yu wrote:
> On Sat, Oct 31, 2009 at 4:14 PM, Robert Kern<robert.kern at gmail.com>  wrote:
>> On 2009-10-31 15:31 PM, Peng Yu wrote:
>>
>>> The original problem comes from the maintenance of the package. When A
>>> and B are large classes, it is better to put them in separate files
>>> under the directory 'test' than put them in the file 'test.py'. The
>>> interface 'test.A' is used by end users. However, there will be a
>>> problem if 'import test' is used for developers, because both A and B
>>> are imported, which causes dependence between A and B. For example,
>>> during the modification of B (not finished), 'import A' would not
>>> work. This means that modifications of A and B are not independent,
>>> which causes a lot of problems when maintaining the package.
>>
>> To be frank, that development process is going to cause you a lot of
>> problems well beyond these import entanglements. Developers should have
>> their own workspace! They shouldn't push things into production until the
>> system is working. Checking something into source control shouldn't
>> automatically deploy things into production.
>
> I don't quite agree with your opinion. But please don't take it too personally.
>
> Even in the developer's work space, it is possible to change multiple
> classes simultaneously. So the import entanglement problem still
> exists.

But it's a problem that should have different consequences than you are 
claiming. Having users prevented from using A because developers are modifying 
their copy of B in production is a problem that needs to be solved by changing 
your development process. If you don't change your development process, you will 
run into the same problems without import entanglements.

Now as to import entanglements in the developer's workspace, it is true that 
they can cause issues from time to time, but they are much, much smaller in 
practice. I can just go in and comment out the offending import temporarily 
while I finish working on the other part until I'm ready to address both of them 
together. Then when I'm finished and things are working again, I can check my 
code into source control. It's just not a big deal.

>>> Naming the filename different from the class is a solution, but it is
>>> a little bit annoying.
>>>
>>> I'm wondering how people handle this situation when they have to
>>> separate a module into multiple modules.
>>
>> Even if we organize things along the lines of "one class per module", we use
>> different capitalization conventions for modules and classes. In part, this
>> helps solve your problem, but it mostly saves the developer thought-cycles
>> from having to figure out which you are referring to when reading the code.
>
> I know that multiple classes or functions are typically defined in one
> file (i.e. module in Python). However, I feel this makes the code hard
> to read. Therefore, I insist on one class or function per file
> (i.e. module in Python).

One function per file is a little extreme. I am sympathetic to "one class per 
module", but functions *should* be too short to warrant a module to themselves.

> When one class per module is strictly enforced, there will be no need
> to have different capitalization conventions for modules and classes.
> Developers should be able to tell whether it is a class or a module
> from the context.

Given enough brain-time, yes, but you can make your code easier to read by using 
different conventions for different things. Developer brain-time is expensive! 
As much as possible, it should be spent on solving problems, not comprehension.

> In my question, modules A and B exist just for the sake of
> implementation. Even if I have modules A and B, I don't want the user
> to feel the existence of modules A and B. I want them to feel exactly
> as if classes A and B are defined in module 'test' instead of feeling
> that two modules A and B are in package 'test'. I know that module names
> should be in lower case, in general. However, it is OK to have the module
> name capitalized in this case since the end users don't see them.
>
> In C++, what I am asking can be easily implemented, because the
> namespace and the directory hierarchy are not bound to each other.
> However, in Python the binding between the namespace and the directory
> hierarchy makes this difficult to implement. I don't know if it is
> impossible, but I'd hope there is a way to do so.

I'm not sure that C++ is a lot better. I still have to know the file hierarchy 
in order to #include the right files. Yes, the namespaces get merged when you go 
to reference things in the code, but those #includes are intimately tied to the 
file hierarchy.

In C++, you can often #include one file that #includes everything else because 
linking won't bring in the symbols you don't actually use. Oddly enough, we 
don't have that luxury because we are in a dynamic language. Python imports have 
runtime consequences because there is no separate compile or link step: an import 
executes the module's code. You can't think of import statements as #include 
statements; you need to use different patterns.

Of course, to really take advantage of that feature in C++ requires some careful 
coding and use of patterns like pimpl. That often negates any readability benefits.

You could probably hack something (and people have), but it makes your code 
harder to understand because it is non-standard.
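
To illustrate the point above that Python imports have runtime consequences, 
here is a minimal sketch. The package name "eagerpkg", its modules, and the 
helper functions are all made up for the example; it only assumes a package 
whose __init__.py imports every submodule, one of which is half-finished.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_eager_package(root: Path, name: str = "eagerpkg") -> None:
    """Write a hypothetical package whose __init__.py imports every submodule."""
    pkg = root / name
    pkg.mkdir()
    (pkg / "__init__.py").write_text("from .a import A\nfrom .b import B\n")
    (pkg / "a.py").write_text("class A:\n    pass\n")
    # b.py stands in for a half-finished module: it raises when executed.
    (pkg / "b.py").write_text("raise RuntimeError('B is not finished yet')\n")


def try_import(module_name: str) -> str:
    """Return 'ok' if the import succeeds, else the exception class name."""
    try:
        importlib.import_module(module_name)
        return "ok"
    except Exception as exc:
        return type(exc).__name__


with tempfile.TemporaryDirectory() as tmp:
    build_eager_package(Path(tmp))
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Importing eagerpkg.a first imports the package itself, which runs
        # __init__.py, which in turn executes the broken b.py.
        eager_result = try_import("eagerpkg.a")
    finally:
        sys.path.remove(tmp)

print(eager_result)
```

Even though only the "a" module was requested, the import fails with the 
error raised in "b", because the package's __init__.py runs first.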

>> Personally, I like to keep my __init__.py files empty such that I can import
>> exactly what I need from the package. This allows me to import exactly the
>> module that I need. In large packages with extension modules that can be
>> expensive to load, this is useful. We usually augment this with an api.py
>> that exposes the convenient "public API" of the package, the A and B classes
>> in your case.
>
> I looked at the Python library, and quite a few __init__.py files
> are not empty. In fact, they are quite long. I agree with you that
> '__init__.py' should not be long. But I'm wondering why in the Python
> library the __init__.py files are quite long.

For the most part, it's just not an issue. If you are seeing serious problems, 
this may just be exposing deeper issues with your code and your process that 
will come to bite you in other contexts sooner or later.
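
For reference, the "empty __init__.py plus api.py" layout quoted above might 
look roughly like the sketch below. The package name "lazypkg" and the file 
contents are assumptions for illustration only, not code from any real 
package; "b.py" again stands in for a module that is temporarily broken.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_lazy_package(root: Path, name: str = "lazypkg") -> None:
    """Write a hypothetical package with an empty __init__.py and an api.py."""
    pkg = root / name
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")  # deliberately empty
    (pkg / "a.py").write_text("class A:\n    pass\n")
    # b.py stands in for a half-finished module: it raises when executed.
    (pkg / "b.py").write_text("raise RuntimeError('B is not finished yet')\n")
    # api.py collects the convenient public names in one place.
    (pkg / "api.py").write_text("from .a import A\nfrom .b import B\n")


with tempfile.TemporaryDirectory() as tmp:
    build_lazy_package(Path(tmp))
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # With an empty __init__.py, importing lazypkg.a touches only a.py.
        a_module = importlib.import_module("lazypkg.a")
        a_imported = hasattr(a_module, "A")
        b_loaded = "lazypkg.b" in sys.modules

        # The convenience module still pulls in everything, so it fails
        # for as long as b.py is broken.
        try:
            importlib.import_module("lazypkg.api")
            api_result = "ok"
        except RuntimeError:
            api_result = "RuntimeError"
    finally:
        sys.path.remove(tmp)

print(a_imported, b_loaded, api_result)
```

In this arrangement a developer can import exactly the module being worked 
on, while end users who want the convenient names import them from the api 
module instead of from the package itself.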

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco



