python newbie

Sun Nov 4 17:21:21 EST 2007

On Sun, 04 Nov 2007 12:05:35 -0800, Paul Rubin wrote:

> Bruno Desthuilliers <bdesth.quelquechose at free.quelquepart.fr> writes:
>> >   from random import random
>> >   x = random()
>> >   y = random.choice((1,2,3))   # oops
>> 
>> from random import random, choice
>> 
>> x = random()
>> y = choice((1, 2, 3))
> 
> Really, a lot of these modules exist primarily to export a single class
> or function, but have other classes or functions of secondary interest. 
> I'd like to be able to get to the primary function without needing to
> use a qualifier, and be able to get to the secondary ones by using a
> qualifier instead of having to import explicitly and clutter up the
> importing module's name space like that.  It just seems natural.

+1 on Paul's suggestion. It's a style thing: it is so much more elegant 
and obvious than the alternatives.

A module is a single conceptual unit. It might very well have a rich 
internal API, but many modules also have a single main function. It might 
be a function or class with the same name as the module, or it might 
literally be called "main". It might be designed to be called from the 
shell, as a shell script, but it need not be.

A good clue that you're dealing with such a module is if you find 
yourself usually calling the same object from the module, or if the 
module has a main class or function with the same name as the module. The 
module is, in effect, a single functional unit.

Possible candidates, in no particular order: StringIO, random, glob, 
fnmatch, bisect, doctest, filecmp, ...

Note that they're not *quite* black boxes: at times it is useful to crack 
that functional unit open to access the internals: the module's "not 
main" functions. But *much of the time*, you should be able to use a 
module as a black box, without ever caring about anything inside it:

import module
module(data) # do something with the module

The model I have in mind is that of shell-scripting languages, like Bash. 
For the purpose of syntax, Bash doesn't distinguish between built-in 
commands and other Bash scripts, but Python does. When you call a script 
(module) in Bash, you don't need to qualify it to use it:

ls > myscript --options

instead of

ls > myscript.myscript --options

If you think of modules (at least sometimes) as being the equivalent of 
scripts, only richer and more powerful, then the ugliness of the current 
behaviour is simply obvious.

This model of modules as scripts isn't appropriate for all modules, but 
it is backwards compatible with the current way of doing things (except 
for code that assumes that calling a module will raise an exception). For 
those modules that don't have a single main function, simply don't define 
__call__.
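For reference, here is a small sketch of the behaviour that exception caveat refers to. It uses glob only as a convenient example; the name "outcome" is mine. As things stand, calling a module object raises TypeError, so only code that relies on that failure could notice a change:

```python
import glob

# A module object currently has no __call__, so calling it fails.
try:
    glob("*.txt")          # 'glob' here is the module, not glob.glob
    outcome = "called"
except TypeError as exc:
    outcome = "TypeError: %s" % exc

print(outcome)
```

The exact wording of the error message varies between Python versions, so I wouldn't rely on it.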

At present, the "natural" way to treat a module as a functional whole *and* 
still be able to crack it open to access the parts is some variation of:

from module import main
import module

That's inelegant for at least six reasons:

(1) There's no standard interface for module authors. Should the main 
function of the module be called "main" or "__main__" or "__call__" or 
something derived from the name of the module? The same as the module 
name?

(2) Namespace pollution. There are two names added to the namespace just 
to deal with a single conceptual unit.

(3) The conceptual link between the module and its main function is now 
broken. There is no longer any obvious connection between main() and the 
module. The two are meant to be joined at the hip, and we're treating 
them as independent things.

(4) What happens if you have two modules you want to treat this way, both 
with a main function with the same name? You shouldn't have to say "from 
module import main as main2" or similar.

(5) You can't use the module as a black box. You *must* care about the 
internals, if only to find out the name of the main function you wish to 
import.

(6) The obvious idiom uses two imports (although there is an almost as 
obvious idiom only using one). Python caches imports, and the second is 
presumably much faster than the first, but it would be nice to avoid the 
redundant import.
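A rough illustration of a couple of these points, using glob because its main function shares the module's name. The names glob_function, glob_module and same_module are just for this sketch. When the names coincide, the two-import idiom fights over one name, and the second import only fetches the cached module from sys.modules:

```python
import sys

from glob import glob      # 'glob' is now the function
glob_function = glob

import glob                # rebinds 'glob' to the module, hiding the function
glob_module = glob

# The second import did not load anything new: it reused the cached module.
same_module = sys.modules["glob"] is glob_module

print(callable(glob_function))             # the function can be called
print(callable(glob_module))               # the module cannot
print(same_module)
print(glob_module.glob is glob_function)   # one object, reached two ways
```

So to keep both the function and the module around you need to invent a second name for one of them, which is exactly the clutter I'm complaining about.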

As far as I know, all it would take to allow modules to be callable would 
be a single change to the module type, the equivalent of:

def __call__(self, *args, **kwargs):
    try:
        # Special methods are retrieved from the class, not
        # from the instance, so we need to see if the 
        # instance has the right method.
        func = self.__dict__['__call__']  # don't shadow the builtin callable()
    except KeyError:
        return None # or raise an exception?
    return func(self, *args, **kwargs)
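
The idea can be tried out today without touching the interpreter, by subclassing types.ModuleType. This is only a sketch under my own assumptions: CallableModule, shout and plain are made-up names, the modules are built by hand rather than imported from disk, and I've picked "raise TypeError" for the missing-__call__ case, which is one of the two options above, not a settled answer.

```python
import types


class CallableModule(types.ModuleType):
    """Hypothetical module type that forwards calls to a module-level __call__."""

    def __call__(self, *args, **kwargs):
        try:
            # Look in the module's own namespace, since special methods
            # are otherwise only looked up on the type.
            func = self.__dict__["__call__"]
        except KeyError:
            raise TypeError("module %r is not callable" % self.__name__) from None
        # As in the proposal, the module itself is passed as the first argument.
        return func(self, *args, **kwargs)


# A throwaway module with a "main" function and a secondary one.
shout = CallableModule("shout")
shout.__call__ = lambda module, text: text.upper() + "!"
shout.whisper = lambda text: text.lower()

print(shout("hello"))           # call the module as a black box
print(shout.whisper("HELLO"))   # or crack it open for the secondary function

# A module that defines no __call__ keeps failing with TypeError.
plain = CallableModule("plain")
try:
    plain()
    plain_result = "called"
except TypeError:
    plain_result = "not callable"
print(plain_result)
```

Raising TypeError for modules without __call__ keeps the existing behaviour for every module that doesn't opt in.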

-- 
Steven