[Pyobjc-dev] Re: [Python-Dev] Bridging strings from Python to other languages

Guido van Rossum guido@python.org
Wed, 05 Feb 2003 16:27:51 -0500


> On Wednesday, Feb 5, 2003, at 10:52 US/Eastern, Guido van Rossum wrote:
> > In my experience almost no Python code depends on this property, and
> > it seems to be the most problematic one.  So why is this a
> > requirement?

[BBum]
> Because we are using Python to glue together other object oriented 
> frameworks-- Apple's and third party's-- for which we do not have 
> control over said behavior.
> 
> Sometimes an object is just an object and those frameworks insist on 
> the same object-- same identifier/address-- coming out that went in.
> 
> As Just pointed out, my original example wasn't as clear as it could 
> have been.   If a String object comes out of the alien-to-python world 
> and is later sent from python back into the alien-to-python runtime, 
> the same String object-- the same id()-- must be sent back.

Ah.  I still don't know what you call "alien" and what you call
"native".  But I think that I understood your original example as
going the other direction (Python -> ObjC -> Python) while the issue
really is ObjC -> Python -> ObjC.

> > If you can live with only using Unicode strings (even when all they
> > contain is ASCII or Latin-1 values), I think subclassing Unicode might
> > be the way to go.
> 
> Right.  I believe that is the path we will go down.
> 
> > I don't have time to dig deeper into this.  But if you think a small
> > change to Python can make life easier for you, I expect we'll be happy
> > to implement it, as long as it doesn't make life harder for Python
> > developers.
> 
> I can think of a couple of changes to Python that would be potentially 
> quite helpful in this situation.
> 
> Specifically:
> 
>      - ability to have weak references to string objects [and unicode 
> objects].   Since we can make arbitrary object associations and 
> re-associations when crossing the bridge between environments, I 
> believe weakref would allow us to maintain a reference map as long as 
> we could grab the 'string is now going away' callback to update the 
> weakref map when the string is deallocated

Would you be okay with only weak refs to unicode objects?

Either way, you have to start by submitting a patch (referring to this
thread).
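
For readers following along, here is a rough sketch of the reference map
described above, written against present-day Python, where a subclass of
str (the modern unicode type) can be weakly referenced even though a plain
str cannot.  The names BridgedString, ReferenceMap and proxy_for, and the
use of an integer as the alien object's address, are all hypothetical and
only for illustration; this is not PyObjC's code.

```python
import weakref


class BridgedString(str):
    """Hypothetical str subclass; unlike plain str, its instances
    can be the target of a weak reference."""


class ReferenceMap:
    """Maps an alien object's identifier to a weakly held Python proxy."""

    def __init__(self):
        self._by_alien_id = {}

    def proxy_for(self, alien_id, text):
        # Reuse the live proxy if this alien object already crossed the bridge.
        ref = self._by_alien_id.get(alien_id)
        proxy = ref() if ref is not None else None
        if proxy is None:
            proxy = BridgedString(text)

            def _gone(dead_ref, alien_id=alien_id):
                # The 'string is now going away' callback: drop the stale
                # entry, unless a newer proxy has already replaced it.
                if self._by_alien_id.get(alien_id) is dead_ref:
                    del self._by_alien_id[alien_id]

            self._by_alien_id[alien_id] = weakref.ref(proxy, _gone)
        return proxy

    def __len__(self):
        return len(self._by_alien_id)
```

The map never keeps a proxy alive by itself; once the last strong
reference to the proxy goes away, the callback removes the entry.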

>      - ability to subclass string objects or the ability to add a hunk 
> of data-- the reference to the 'alien' string object-- to any string 
> object.

You can do that in C, if you know how.  The trick is to set
tp_itemsize to 4 bytes extra, and then index from the end (rounding
down).  See how _PyObject_GetDictPtr() works when the dict offset is
negative (in practice it will always be -4).
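
The C-level trick is not shown here.  As a Python-level approximation of
the same idea (attach the reference to the alien string to the Python
string itself), a subclass could look roughly like the sketch below.
AlienString is a hypothetical stand-in for the foreign object, and
BridgedString, _alien and to_alien are illustrative names, not an existing
API.

```python
class AlienString:
    """Hypothetical stand-in for a string object owned by the other runtime."""

    def __init__(self, value):
        self.value = value


class BridgedString(str):
    """str subclass that carries a reference to the alien object it mirrors."""

    def __new__(cls, alien):
        self = super().__new__(cls, alien.value)
        self._alien = alien  # the extra 'hunk of data'
        return self


def to_alien(s):
    """Return the original alien object if s came across the bridge;
    otherwise wrap the plain Python string in a new alien object."""
    alien = getattr(s, "_alien", None)
    if alien is None:
        alien = AlienString(str(s))
    return alien
```

With this, a string that went alien -> Python -> alien comes back as the
same alien object.  Note that str methods such as upper() return plain str
objects, so the attached reference does not follow derived strings.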

> Either one would work equally well... whether or not they are easy 
> to do, I have not a clue.

Who knows.

--Guido van Rossum (home page: http://www.python.org/~guido/)