
Instances of classes inheriting from str, tuple, etc cannot be weakly referenced. Does anyone know the reason for this?
    from weakref import proxy
    class Object(object): pass
    x = Object()
    y = proxy(x)  # no problem here
    class Str(str): pass
    x = Str()
    y = proxy(x)  # but why won't this work?
    Traceback (most recent call last):
      File "<pyshell#9>", line 1, in -toplevel-
        y = proxy(x)
    TypeError: cannot create weak reference to 'Str' object
Weak referencing would be much more useful to me if it included strings, tuples, and such. Right now, my awkward workaround is substituting UserString or some other wrapper class, replacing inheritance with delegation.
Raymond Hettinger
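For readers trying this today, here is a minimal sketch, assuming a recent CPython 3 release. There, subclasses of str accept weak references, but the weakref documentation notes that tuple and int subclasses still do not, so a tuple subclass shows the same symptom as the report above. The names Plain, Tup and can_weakref are illustrative and do not come from the thread.

```python
import weakref


class Plain:
    """Ordinary class: instances get a __weakref__ slot automatically."""


class Tup(tuple):
    """Subclass of a variable-sized built-in type."""


def can_weakref(obj):
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


print(can_weakref(Plain()))      # ordinary instance
print(can_weakref(Tup((1, 2))))  # tuple subclass instance
```

This is CPython-specific behavior; other implementations may differ.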

On May 29, 2004, at 1:06 PM, Raymond Hettinger wrote:
> Instances of classes inheriting from str, tuple, etc cannot be weakly referenced. Does anyone know the reason for this?
They can not accept non-empty __slots__ either, which is probably closer to the source of the problem. I have no idea what the reason is. I imagine it's something to do with optimization, and/or because they are immutable.
-bob
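The __slots__ restriction mentioned here can be checked with a short sketch, assuming CPython 3, where tuple is still a variable-sized type. The helper try_define is hypothetical; it only wraps the built-in type() call so the error can be inspected instead of propagating.

```python
def try_define(base, slots):
    """Try to create a subclass of base with the given __slots__.

    Returns the new class, or the TypeError raised at class creation.
    """
    try:
        return type("Sub", (base,), {"__slots__": slots})
    except TypeError as exc:
        return exc


empty = try_define(tuple, ())            # empty __slots__ on a tuple subclass
nonempty = try_define(tuple, ("tag",))   # nonempty __slots__ on a tuple subclass
plain = try_define(object, ("tag",))     # nonempty __slots__ on a fixed-size base

print(empty, nonempty, plain, sep="\n")
```

Only the nonempty case on the tuple base is expected to fail, and it fails when the class is created rather than when an instance is made.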

At 01:24 AM 5/30/04 -0400, Bob Ippolito wrote:
> On May 29, 2004, at 1:06 PM, Raymond Hettinger wrote:
>> Instances of classes inheriting from str, tuple, etc cannot be weakly referenced. Does anyone know the reason for this?
> They can not accept non-empty __slots__ either, which is probably closer to the source of the problem. I have no idea what the reason is. I imagine it's something to do with optimization, and/or because they are immutable.
More likely, because they are variably-sized, and their variably-sized portions are at fixed offsets. (OTOH, I don't see why their subclasses can still have a __dict__ slot, then...)

Bob Ippolito wrote:
> On May 29, 2004, at 1:06 PM, Raymond Hettinger wrote:
>> Instances of classes inheriting from str, tuple, etc cannot be weakly referenced. Does anyone know the reason for this?
> They can not accept non-empty __slots__ either, which is probably closer to the source of the problem. I have no idea what the reason is. I imagine it's something to do with optimization, and/or because they are immutable.
It is because they are var-sized objects. There is no place to put the weakref pointer in, since the variable part starts right at the beginning of the object.
This is not really necessary, because strings are not so var-sized at all. After creation, they are fixed sized, and we could implement special cases for all such types, similar to what I did with type objects and slots.
The property of "var-sized" objects is everything else but being var-sized. They are fixed sized after initialization, just that you cannot have fixed offsets at "compile time". The real var-sized objects are fixed-size :-)) , because they use an extra memory area to grow or shrink at runtime.
I think slots could be added to all var-sized objects, and weakref is nothing else but kind of slot. The cost would be a little more computation and some special macro which points to the area "behind" the object. See my special case for meta types in Stackless typeobject.c.
ciao - chris
--
Christian Tismer :^) <mailto:tismer@stackless.com>
Mission Impossible 5oftware : Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/
14109 Berlin : PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34 home +49 30 802 86 56 mobile +49 173 24 18 776
PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04
whom do you want to sponsor today? http://www.stackless.com/
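The layout argument can be observed from Python, as a rough sketch assuming CPython 3. Type objects expose __itemsize__ and __weakrefoffset__, which mirror the C-level tp_itemsize and tp_weaklistoffset fields. A nonzero item size marks a var-sized type, and a weakref offset of zero means there is no place for the weakref pointer. Plain and Tup are illustrative names.

```python
class Plain:
    pass


class Tup(tuple):
    pass


for cls in (Plain, tuple, Tup):
    print(
        f"{cls.__name__:6}"
        f" itemsize={cls.__itemsize__}"
        f" weakrefoffset={cls.__weakrefoffset__}"
    )
```

The exact numbers depend on the platform and interpreter version, so only the zero versus nonzero distinction is meaningful here.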

[Raymond Hettinger]
> Instances of classes inheriting from str, tuple, etc cannot be weakly referenced. Does anyone know the reason for this?
[Bob Ippolito]
> They can not accept non-empty __slots__ either, which is probably closer to the source of the problem. I have no idea what the reason is. I imagine it's something to do with optimization, and/or because they are immutable.
[Christian Tismer]
> it is because they are var-sized objects. There is no place to put the weakref pointer in, since the variable part starts right at the beginning of the object.
Thanks. That exactly identifies the issue. It also explains the mystery of why unicode subclasses are weakly referencable but str subclasses are not.
For strings at least, perhaps it is time to bite the bullet and include weak reference support directly. Weak reference support ups the per string memory overhead from five words (ob_type, ob_refcnt, ob_size, ob_shash, ob_sstate) to six. The whole concept of weak dictionaries is much more useful when strings can be used as keys and/or values.
Several other objects probably also warrant weak reference support: array.array, files, sockets, and sre.pattern_objects. In these cases, the one word overhead is small relative to the rest of the object.
Tuples would also be nice but the overhead is likely not worth it. Chris's solution (recognizing that var sized objects are typically fixed upon creation) is more general and doesn't burden the most common case.
Raymond

> For strings at least, perhaps it is time to bite the bullet and include weak reference support directly. Weak reference support ups the per string memory overhead from five words (ob_type, ob_refcnt, ob_size, ob_shash, ob_sstate) to six. The whole concept of weak dictionaries is much more useful when strings can be used as keys and/or values.
Hmm... it is a high price to pay to add another word (*and* some extra code at dealloc time!) to every string object when very few apps need them and strings are about the most common data type. And since they're immutable, what's the point of having weak refs to strings in the first place? (Note that the original poster asked about *subclasses* of strings.)
> Several other objects probably also warrant weak reference support: array.array, files, sockets, and sre.pattern_objects. In these cases, the one word overhead is small relative to the rest of the object.
These I have peace with. Note that for sockets, the objects that Python programs actually see are instances of a wrapper class defined in Python (using __slots__, so to add weakrefs you have to add __weakref__ to the list of slots).
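The point about __slots__ can be sketched as follows. A class that defines __slots__ gets no weakref support unless the list of slots includes the name "__weakref__", which is the name CPython recognizes for this purpose. The class names below are illustrative and are not the real socket wrapper.

```python
import weakref


class NoWeak:
    __slots__ = ("value",)                # no __weakref__ slot


class WithWeak:
    __slots__ = ("value", "__weakref__")  # opt in to weak references


def can_weakref(obj):
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


print(can_weakref(NoWeak()))
print(can_weakref(WithWeak()))
```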
> Tuples would also be nice but the overhead is likely not worth it. Chris's solution (recognizing that var sized objects are typically fixed upon creation) is more general and doesn't burden the most common case.
I'm not sure how this observation helps. Anyway, I have the same concern for tuples as for strings.
--Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
>> For strings at least, perhaps it is time to bite the bullet and include weak reference support directly. Weak reference support ups the per string memory overhead from five words (ob_type, ob_refcnt, ob_size, ob_shash, ob_sstate) to six. The whole concept of weak dictionaries is much more useful when strings can be used as keys and/or values.
> Hmm... it is a high price to pay to add another word (*and* some extra code at dealloc time!) to every string object when very few apps need them and strings are about the most common data type. And since they're immutable, what's the point of having weak refs to strings in the first place? (Note that the original poster asked about *subclasses* of strings.)
Same here. I would not vote to make strings or tuples or any other tiny type weak-reffed in the first place. Instead I would add the possible support to derived types, via the __slots__ mechanism for instance. There is a little coding necessary to make the generic code handle the case of var-sized objects, but this is doable and not very complicated.
This may be really needed or not. If we can create the rule "every derived type *can* have weak-refs", this is simpler to memorize than "well, most can, some cannot".
cheers -- chris

>> Hmm... it is a high price to pay to add another word (*and* some extra code at dealloc time!) to every string object when very few apps need them and strings are about the most common data type. And since they're immutable, what's the point of having weak refs to strings in the first place? (Note that the original poster asked about *subclasses* of strings.)
> Same here. I would not vote to make strings or tuples or any other tiny type weak-reffed in the first place. Instead I would add the possible support to derived types, via the __slots__ mechanism for instance. There is a little coding necessary to make the generic code handle the case of var-sized objects, but this is doable and not very complicated.
Right. I think this is all that is needed at this point. That way, the granular types stay granular and the added functionality is available via subclasses if needed.
Raymond

> Same here. I would not vote to make strings or tuples or any other tiny type weak-reffed in the first place. Instead I would add the possible support to derived types, via the __slots__ mechanism for instance. There is a little coding necessary to make the generic code handle the case of var-sized objects, but this is doable and not very complicated.
> This may be really needed or not. if we can create the rule "every derived type *can* have weak-refs", this is simpler to memorize than "well, most can, some cannot".
With some (considerable?) effort, slots on var-sized objects can certainly be supported -- the same approach as used for __dict__ should work. I expect it might slow down the normal case a bit, unless you define a new descriptor type just for this.
--Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
>> Same here. I would not vote to make strings or tuples or any other tiny type weak-reffed in the first place. Instead I would add the possible support to derived types, via the __slots__ mechanism for instance. There is a little coding necessary to make the generic code handle the case of var-sized objects, but this is doable and not very complicated.
>> This may be really needed or not. if we can create the rule "every derived type *can* have weak-refs", this is simpler to memorize than "well, most can, some cannot".
> With some (considerable?) effort, slots on var-sized objects can certainly be supported -- the same approach as used for __dict__ should work. I expect it might slow down the normal case a bit, unless you define a new descriptor type just for this.
Yes, I didn't recognize the tricky implementation until you said this. _PyObject_GetDictPtr looks at the end of the object if type->tp_dictoffset is < 0. This would become more complicated in the presence of weakref and other slots. Well, fairly, they would simply all store their offset from the end of the object.
A not so costly approach could be to change slots a little: the slot offset is now computed from the beginning of the object's structure. There could be either a type flag that says "you must add an offset", or a new type slot which is either a function that computes an additional offset, or NULL. It would probably even suffice to check if varsize is nonzero and then do a little computation to bump the offset.
cheers - chris
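The __dict__ mechanism discussed here is visible from Python through the type attribute __dictoffset__, which mirrors tp_dictoffset. This is a small sketch assuming CPython 3, with Tup as an illustrative name. On the CPython builds I am aware of, the value for a tuple subclass is negative, meaning the offset is counted from the end of the object, but the sketch only relies on it being nonzero.

```python
class Tup(tuple):
    pass


t = Tup((1, 2, 3))
t.note = "extra"             # works because the subclass has a __dict__

print(tuple.__dictoffset__)  # plain tuples have no __dict__ pointer
print(Tup.__dictoffset__)    # the subclass records where its __dict__ lives
print(t.__dict__)
```

This is why a subclass of a var-sized type can carry instance attributes even though it cannot carry fixed-offset slots.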

Raymond Hettinger wrote:
> For strings at least, perhaps it is time to bite the bullet and include weak reference support directly. Weak reference support ups the per string memory overhead from five words (ob_type, ob_refcnt, ob_size, ob_shash, ob_sstate) to six. The whole concept of weak dictionaries is much more useful when strings can be used as keys and/or values.
Can you elaborate on that? What specific, real-life application do you have in mind that requires strings to be weakly-referencable? Because strings are immutable, you often get unexpected references to them, so people have learned not to care about the life cycle of string objects.
Regards, Martin

[Martin v. Löwis]
> What specific, real-life application do you have in mind that requires strings to be weakly-referencable?
FWIW, my program searches a virtual graph for a goal state. Successor nodes (board positions) are computed by a complex function and stored as strings. The successions of moves were recorded with a dictionary linking the positions (long strings as both keys and values). When the graph eventually rules out a group of positions, I wanted the succession chains to all disappear on their own (i.e. if a board position becomes unreachable when a better move is found, then the succession history for that position becomes irrelevant).
Essentially, I used weak dictionaries as a cache for values that are expensive to compute and take up (collectively) a lot of memory. Having that cache weakly referenced automated the process of clearing out values that lost their relevance.
IMO, this use case is too rare to warrant adding weak ref support to strings. It just served to precipitate an attempted workaround using "class Str(str): pass". Christian then clarified why that didn't work as expected.
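A rough sketch of this kind of weak cache, using the delegation workaround mentioned at the start of the thread (a wrapper object that holds the string) so that it does not depend on whether string subclasses support weak references. The class Position, the key "start" and the sample string are hypothetical stand-ins for the real board positions.

```python
import gc
import weakref


class Position:
    """Hypothetical wrapper holding a board position string (delegation)."""

    __slots__ = ("text", "__weakref__")

    def __init__(self, text):
        self.text = text


cache = weakref.WeakValueDictionary()

pos = Position("rnbqkbnr/pppppppp")  # stands in for an expensive result
cache["start"] = pos
alive_before = "start" in cache      # a strong reference (pos) still exists

del pos                              # drop the only strong reference
gc.collect()                         # not required on CPython, harmless elsewhere
alive_after = "start" in cache

print(alive_before, alive_after)
```

The entry disappears from the cache without any explicit bookkeeping, which is the behavior described above for positions that are no longer reachable.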
> In addition to the reason Christian gave, one (conceptually more important) reason is that strings can't participate in cycles. Weak references were introduced as a mechanism to avoid creating cyclic structures, so that "backward" links could be made weak references.
Right, that was a principal use case. Hence, the focus should be on tuples rather than strings. Likewise, most containers should be candidates for weakref support unless it is just too expensive (i.e. lists and tuples). Also, any subclass can potentially introduce back references, so all subclasses should be fair game.
Still, if cycles were the only issue, we would just let GC take care of it. There is one other purpose for weakrefs. Even in the absence of cycles, weakrefs let applications cache and track objects without creating a strong reference that keeps objects alive forever. This is especially useful for objects that were expensive to compute, expensive to create, or external to the program in some way. That was the rationale for regexp patterns, files, sockets, and arrays.
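The "backward link" use case can be sketched with a small tree in which children point back to their parent through a weak reference, so no reference cycle is formed. Node and its attributes are illustrative names, and the immediate cleanup after del relies on CPython's reference counting.

```python
import weakref


class Node:
    """Tree node whose link to its parent is weak, so no cycle is formed."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self._parent = weakref.ref(parent) if parent is not None else None
        if parent is not None:
            parent.children.append(self)

    @property
    def parent(self):
        # Dereference the weak link; None if the parent is gone or absent.
        return self._parent() if self._parent is not None else None


root = Node("root")
leaf = Node("leaf", parent=root)
parent_name = leaf.parent.name  # parent reachable while root is alive

del root                        # last strong reference to the parent
orphaned = leaf.parent is None  # the weak link no longer resolves

print(parent_name, orphaned)
```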
> I can't see a reason why Unicode objects should be weakly referencable (just as I can't see a reason for plain strings).
Unicode objects are like list and dict objects. None of them are directly weak referencable. They have to be subclassed. Then, they need weak referencability as much as any other instance would.
Raymond
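The distinction between a built-in container and its subclass can be checked directly. This is a minimal sketch; WeakableList and can_weakref are illustrative names.

```python
import weakref


class WeakableList(list):
    """Subclassing list is enough to gain weak-reference support."""


def can_weakref(obj):
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


print(can_weakref([1, 2]))                # plain list
print(can_weakref(WeakableList([1, 2])))  # list subclass
print(can_weakref({}))                    # plain dict
```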

Raymond Hettinger wrote:
> Instances of classes inheriting from str, tuple, etc cannot be weakly referenced. Does anyone know the reason for this?
In addition to the reason Christian gave, one (conceptually more important) reason is that strings can't participate in cycles. Weak references were introduced as a mechanism to avoid creating cyclic structures, so that "backward" links could be made weak references.
It appears that people have then been eager to add weakref support to other datatypes. IMO, they have been too eager. For example, I can't see a reason why Unicode objects should be weakly referencable (just as I can't see a reason for plain strings).
Regards, Martin
participants (7)
- "Martin v. Löwis"
- Bob Ippolito
- Christian Tismer
- Guido van Rossum
- Phillip J. Eby
- Raymond Hettinger
- Raymond Hettinger