weak references and threads

Wed Apr 24 05:19:21 EDT 2002

pekka niiranen || Tue 23 Apr 2002 11:41:55p:

> Could somebody explain me in SIMPLE english what are:
> 
> 1)  weak references

A name that does not force the object it points to to exist.

Normally, our objects stay around for as long as we have a name for them. 
If we write:

a = [1]

That particular list (I used a list because defining a list guarantees a 
unique object) will stay around for as long as the name 'a' exists. If 
this is in a function, the name will be removed at the end of the 
function; if it is at the top level of the file, it will stay around 
until told to do otherwise. That is, we say:

del a

And the name 'a' is removed. If the object has no other names at this 
point, it will be removed as well. If, on the other hand, we do this:

a = [2]
b = a

We now have two names for the same object (Python never copies unless 
asked. Any apparent copying done by an assignment is due to the observable 
properties of immutable types (strings, numbers, tuples), and expectations 
of behavior similar to other programming languages. This is the 
pointer-ish stumbling block for Python, it seems). If we then:

del a

The name 'a' is removed, but because the name 'b' still points at that 
object, the object is unaffected.
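
A small sketch of the two-names case above, in case it helps to run it. 
The extra names (same_object, seen_through_a, still_there) are only there 
for illustration:

```python
a = [2]
b = a                      # no copy: 'b' is a second name for the same list
same_object = a is b       # both names point at one object

b.append(99)               # a change made through one name...
seen_through_a = list(a)   # ...shows up through the other name too

del a                      # removes only the name 'a'
still_there = b            # the list is untouched and reachable via 'b'
print(same_object, seen_through_a, still_there)
```

After the del, asking for 'a' raises a NameError, but 'b' still works.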

Now, let's take a hypothetical function weak(var), which returns a weak 
reference to whatever you pass to it as var. If we write:

a = [3]
w = weak(a)

We again have two names for one object. But now if we:

del a

The system will reclaim the [3] list object as if w did not exist. This is 
because weak references completely ignore and bypass the 
magic-behind-the-scenes system called 'reference counting' that lets 
Python keep track of when objects actually need to be removed. This means 
you generally have to check and see if the object still exists each time 
you use it, or set up some object to be notified when it gets destroyed, 
and do your own bookkeeping.
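
For something runnable, the standard weakref module plays the part of the 
hypothetical weak() above. This is a rough sketch, not the only way to do 
it. Two assumptions to flag: plain lists cannot be weakly referenced, so 
the made-up Box subclass stands in for the list, and on_gone and events 
are names invented here for the notification bookkeeping:

```python
import gc
import weakref

class Box(list):
    """A list subclass; plain lists cannot be weakly referenced."""

events = []

def on_gone(ref):
    # Python calls this once the target object is about to go away.
    events.append("reclaimed")

a = Box([3])
w = weakref.ref(a, on_gone)   # 'w' does not keep the Box alive

alive_before = w() is a       # calling the weak reference returns the object
del a                         # the last regular name is gone
gc.collect()                  # not needed on CPython; a nudge for other interpreters

target = w()                  # check every time before using it
if target is None:
    print("object is gone")
```

Calling w() hands back the object while it exists and None afterwards, 
which is the "check each time" part. The callback is the "be notified" 
part.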

There is only one reason I know that you would want to use weak 
references, and that is in the creation of advanced, customized data 
structures. The point is to avoid something called 'circular references', 
which is probably best explained by an example. Let's take a really basic 
class (this is right from my 2.2 IDLE prompt):

>>> class pointer:
	def __del__(self):
		print 'Object being removed now!'

All this does is print a message when the object is deleted. A simple 
control case shows how this works:

>>> a = pointer()
>>> del a
Object being removed now!

But if we create two of these objects:

>>> a = pointer()
>>> b = pointer()

And add a regular name to each that points to the other:

>>> a.target = b
>>> b.target = a

We can then remove both names, and:

>>> del a
>>> del b
>>> 

Nothing. The system does not delete either one. It's like they've fallen 
into another dimension: they are still there, they still take up space in 
memory, but because we have no name for them we cannot ever reach them 
from the outside world. (It would be interesting to see if we could create 
a thread running on a set of objects and then remove all of those objects 
from the initiating part of the program, but that's beyond what I'm trying 
to explain here). In reality, cyclic problems would be much more subtle, 
and tend to slowly leak memory until (if the program runs long enough) 
the whole thing fails. (At which point it would probably all be freed as 
the interpreter exits: it cannot delete it earlier because that is not 
guaranteed to be safe for the program, but it still watches over things 
enough to prevent permanent system memory pollution).
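
Here is one way the weak reference fixes the two-object case above, as a 
sketch only. It is written for a current Python rather than 2.2, the 
class is renamed Pointer, and the name argument and the log list are 
additions made here so the result can be inspected:

```python
import weakref

log = []

class Pointer:
    def __init__(self, name):
        self.name = name
        self.target = None

    def __del__(self):
        # Record the removal instead of printing it.
        log.append(self.name + " removed")

a = Pointer("a")
b = Pointer("b")
a.target = b                 # regular reference: 'a' keeps 'b' alive
b.target = weakref.ref(a)    # weak reference back: does not keep 'a' alive

linked = b.target() is a     # call the weak reference to reach 'a'

del a    # nothing holds 'a' strongly any more, so it is removed
del b    # 'a' is gone, so the name 'b' was the last reference
print(log)
```

Only one direction of the link is a regular reference, so there is no 
circle for the objects to get stuck in. Python's gc module also has a 
collector that looks for cycles; whether it can clean up objects that 
define __del__ depends on the Python version (it can from 3.4 on).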

The best analogy for the problem that I can think of is this. In every 
released version of the old Infocom text adventure Zork (one), there was a 
problem like this. It is not a perfect match, but it is what comes to 
mind. It goes something like this:

>look
You are in some room. There is an inflated life raft and a heavy golden 
coffin here.
>put coffin in raft
[Done].
>look
You are in some room. There is an inflated life raft here. The raft seems 
to contain a heavy golden coffin.
>put raft in coffin
[Done].
>look
You are in some room.

Each object can only exist in one other object's contents list, so each 
put operation is essentially a move instead of a copy and delete (so much 
for the difference =). The principle is the same, however.

> 2)  threads

Er... it's late. I better get going =).

(Shortest answer: they are a way of making the computer run your program 
in two places at the same time. They are often used to do long (longer 
than 1/2 a second) processes in GUI-based apps, so the core of the 
program can continue to respond to user interaction while the work gets 
done.)
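
A bare-bones sketch of that idea using the standard threading module. 
There is no GUI here: long_job is a made-up stand-in for the slow work, 
and the results list is just an illustration of the main program carrying 
on in the meantime:

```python
import threading
import time

results = []

def long_job():
    # Stand-in for slow work (a hypothetical half-second task).
    time.sleep(0.5)
    results.append("job finished")

worker = threading.Thread(target=long_job)
worker.start()               # long_job now runs alongside the main program

# The main program keeps going while the worker is busy.
results.append("main still responsive")

worker.join()                # wait for the worker before moving on
print(results)
```

In a real GUI app the main thread would be running the event loop instead 
of appending to a list, and passing the result back safely is where the 
tricky parts begin.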

> and why or when are they needed.
> 
> Threads I have not used but the use of weak references
> I have seen recommended with wxpython's grid component.

If you have used pointers in C or such, weak refs share many of the 
hassles (minus worrying about segfaults). But both of these things are at 
least moderately advanced, so it's ok not to have used them yet. Threading 
especially can be tricky if you don't have everything else down pat; it 
has enough trickiness to it when you do.

-- 
Philip Sw "Starweaver" [rasx] :: www.rubydragon.com