[Tutor] inter-module global variable

Sun Mar 28 12:50:46 CEST 2010

On Sun, 28 Mar 2010 08:31:57 pm spir ☣ wrote:
> Hello,
>
> I have a main module importing other modules and defining a top-level
> variable, call it 'w' [1]. I naively thought that the code from an
> imported module, when called from main, would know about w, 

Why would it?

If you write a module M, you can't control what names exist in the 
calling module, and you shouldn't have to. Imagine if you wrote a 
module containing a function f, and it was imported by another module 
also containing f, and then suddenly all your module's functions 
stopped working! This would be a disaster:

# mymodule.py
def f(n):
    return n+1

def spam(n):
    return "spam"*f(n)

# caller.py
def f(n):
    return range(23, 45+n, 6)

from mymodule import spam
print spam(2)  # expect "spamspamspam"; if caller's f leaked in: TypeError
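
What Python really does here can be checked without creating any files. The sketch below is only an illustration: it builds a stand-in for mymodule in memory with types.ModuleType and exec (an assumption made to keep the example self-contained; a real mymodule.py on disk should behave the same way), and it wraps range in list() and uses print() with parentheses so that it also runs on Python 3.

```python
import types

# Build an in-memory stand-in for mymodule.py (illustration only).
mymodule = types.ModuleType("mymodule")
exec('''
def f(n):
    return n + 1

def spam(n):
    return "spam" * f(n)
''', mymodule.__dict__)

# The caller defines its own, incompatible f ...
def f(n):
    return list(range(23, 45 + n, 6))

# ... and then does the equivalent of "from mymodule import spam".
spam = mymodule.spam

result = spam(2)
print(result)
```

spam looks up f in mymodule's globals, not in the caller's, so the caller's f is never consulted and nothing breaks.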

> but I 
> have name errors. The initial trial looks as follows (this is just a
> sketch, the original is too big and complicated):
>
> # imported "code" module
> __all__ = ["NameLookup", "Literal", "Assignment", ...]
>
> # main module
> from parser import parser

By the way, have you looked at PyParsing? This is considered by many to 
be the gold standard in Python parsing libraries.

> from code import *

This is strongly discouraged. What happens if the code module exports 
something called parser? Or len? It would silently replace the name you 
already have.

> from scope import Scope, World
> w = World()
>
> This pattern failed as said above. 

What do you mean "failed"? Nothing you show is obviously broken.

> So, I tried to "export" w: 
>
> # imported "code" module
> __all__ = ["NameLookup", "Literal", "Assignment", ...]
>
> # main module
> from parser import parser
> from scope import Scope, World
> w = World()
> import code		#    new
> code.w = w		### "export"
> from code import *
>
> And this works. I had the impression that the alteration of the
> "code" module object would not propagate to objects imported from
> "code". But it works. 

It sounds like you are trying to write PHP code in Python.

> But I find this terribly unclear, fragile, and 
> dangerous, for any reason. (I find this "dark", in fact ;-) Would
> someone try to explain what actually happens in such case? 

Yep, sounds like PHP code :)

Every function in a module, including the methods of its classes, 
stores a reference to that module's globals, so that when you do this:

# module A.py
x = "Hello world"

def f():
    print x

# module B.py
from A import f
f()
=> prints "Hello world" as expected.

You don't have to do anything to make this work: every class and 
function knows what namespace it belongs to.
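
This is also why your "code.w = w" trick works. A rough sketch, again using an in-memory stand-in for A.py (types.ModuleType plus exec, which is my assumption for keeping the example in one file) and returning x instead of printing it so the result can be inspected:

```python
import types

# In-memory stand-in for A.py (illustration only).
A = types.ModuleType("A")
exec('''
x = "Hello world"

def f():
    return x
''', A.__dict__)

# Equivalent to "from A import f" inside module B.
f = A.f
x = "Goodbye cruel world!"   # B's own global; f never looks here

# The function carries a reference to the namespace it was defined in.
same_namespace = f.__globals__ is A.__dict__
first = f()

# Rebinding the name inside A from outside (like "code.w = w") is
# visible to f, because f reads A's namespace at call time.
A.x = "set from outside"
second = f()
```

Here same_namespace is True, first is "Hello world" and second is "set from outside": the function never consults the importing module, only the module it was defined in.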

I can only imagine you're trying to do this:

# module A.py
x = "Hello world"

def f():
    print x

# module B.py
x = "Goodbye cruel world!"
from A import f
f()
=> you want "Goodbye cruel world!", but it actually prints "Hello world"

This is bad design. You might think you need it, but in the long run you 
will regret it. You are mixing up arguments and globals. If you want 
the result of f() to depend on the caller's value of x, then make it 
take an argument:

def f(x):
    print x

and call it:

f(x)

http://c2.com/cgi/wiki?GlobalVariablesAreBad
http://discuss.joelonsoftware.com/default.asp?design.4.249182.18

> Also, why 
> is a global variable not actually global, but in fact only "locally"
> global (at the module level)? It's the first time I meet such an
> issue. What's wrong in my design to raise such a problem, if any?

In Python, that is a deliberate choice. All globals are deliberately 
global to the module. The closest thing to "globally global" is the 
builtins namespace, which is where builtins like len, map, str, etc. 
are found.
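
A small sketch of that lookup order, assuming a throwaway in-memory module named mod (hypothetical, built with types.ModuleType and exec): a name is looked up in the module's own globals first and in the builtins namespace only after that, so shadowing len in one module does not affect any other module.

```python
try:
    import builtins                  # Python 3
except ImportError:
    import __builtin__ as builtins   # Python 2

import types

# Hypothetical module that defines its own global called len.
mod = types.ModuleType("mod")
exec('''
def len(obj):
    return "shadowed"

def measure(obj):
    return len(obj)   # finds mod's global len before the builtin
''', mod.__dict__)

inside = mod.measure("abc")        # uses mod's own len
outside = len("abc")               # this namespace still sees the builtin
is_builtin = builtins.len is len   # the builtin lives in the builtins module
```

Only mod sees the shadowed len; everywhere else the name still resolves to the builtin.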

Any design which relies on modifying global variables is flawed. Global 
variables are a poor design:

http://weblogs.asp.net/wallen/archive/2003/05/08/6750.aspx

Slightly better than global variables is a design where you use a module 
or class as a namespace, put all your globals in that namespace, then 
pass it to your other classes as an argument:

class SettingsNamespace:
    pass

settings = SettingsNamespace()
settings.x = 42
settings.y = 23
settings.z = "magic"

instance = MyOtherClass(a, b, c, settings)

This is still problematic. For example, if I change settings.x, will the 
result of MyOtherClass be different? Maybe, maybe not... you have to 
dig deep into the code to know which settings are used and which are 
not, and you never know if an innocent-looking call to a function or 
class will modify your settings and break things.
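
To make the danger concrete, here is a minimal sketch. MyOtherClass.result and the helper tidy_up are hypothetical names invented for this illustration, and the constructor is cut down to take only the settings object:

```python
class SettingsNamespace(object):
    pass

settings = SettingsNamespace()
settings.x = 42

class MyOtherClass(object):
    # Hypothetical body; only here to show the dependency on settings.
    def __init__(self, settings):
        self.settings = settings

    def result(self):
        return self.settings.x * 2

def tidy_up(settings):
    # Hypothetical helper with a hidden side effect on the shared object.
    settings.x = 0

instance = MyOtherClass(settings)
before = instance.result()   # computed from settings.x == 42
tidy_up(settings)            # looks harmless at the call site
after = instance.result()    # computed from settings.x == 0
```

Nothing at the call site of tidy_up hints that instance.result() is about to change from 84 to 0.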

> My view is as follows: From the transparency point of view (like for
> function transparency), the classes in "code" should _receive_ as
> general parameter a pointer to 'w', before they do anything.

Yes, this is better than "really global" globals, but not a lot better.

> In other 
> words, the whole "code" module is like a python code chunk
> parameterized with w. If it would be a program, it would get w as
> command-line parameter, or from the user, or from a config file.
> Then, all instantiations should be done using this pointer to w.
> Meaning, as a consequence, all code objects should hold a reference
> to 'w'. This could be made as follows:

If every code object has a reference to the same object w, that defeats 
the purpose of passing it as an argument. It might be local in name, 
but in practice it is "really global", which is dangerous.

> # main module
> import code
> code.Code.w = w

Why not just this?

code.w = w

And where does w come from in the first place? Shouldn't it be defined 
in code.py, not the calling module?

> from code import *

This is generally frowned upon. You shouldn't defeat Python's 
encapsulation of namespaces in that way unless you absolutely have to.

> # "code" module
> class Code(object):
>     w = None	### to be exported from importing module

That sets up a circular dependency that should be avoided: Code objects 
are broken unless the caller initialises the class first, but you can't 
initialise the class unless you import it. Trust me, you WILL forget to 
initialise it before using it, and then spend hours trying to debug the 
errors.

>     def __init__(self, w=Code.w):
>         # the param allows having a different w eg for testing
>         self.w = w

This needlessly gives each instance a reference to the same w that the 
class already has. Attribute lookup on an instance falls back to the 
class, which makes this unnecessary. You should do this instead:

class Code(object):
    w = None  # Better to define default settings here.
    def __init__(self, w=None):
        if w is not None:
            self.w = w

If no w is provided, then lookups for instance.w will find the shared 
class attribute w.
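
A quick sketch of that lookup behaviour. The strings stand in for World instances, which is an assumption made purely for illustration:

```python
class Code(object):
    w = None  # shared default, looked up via the class

    def __init__(self, w=None):
        if w is not None:
            self.w = w  # per-instance override, e.g. for testing

Code.w = "shared world"

default = Code()
custom = Code("test world")

default_w = default.w                    # found on the class
custom_w = custom.w                      # found on the instance
default_has_own = "w" in vars(default)   # no instance attribute stored
custom_has_own = "w" in vars(custom)     # instance attribute stored

# Rebinding the class attribute is seen by instances without an override.
Code.w = "replaced world"
default_after = default.w
custom_after = custom.w
```

Only the instance given an explicit w stores its own attribute; the other one keeps following whatever Code.w currently is.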

[...]
> But the '###' line looks like  an ugly trick to me. (Not the fact
> that it's a class attribute; on the contrary, I often use them eg for
> config, and find them a nice tool for clarity.) The issue is that
> Code.w has to be exported. 

It is ugly, and fragile. It means any caller is *expected* to modify the 
w used everywhere else, in strange and hard-to-predict ways.

-- 
Steven D'Aprano