[Distutils] Standardizing distribution of "plugins" for extensible apps

Wed Dec 8 18:19:37 CET 2004

On Dec 8, 2004, at 10:06 AM, Phillip J. Eby wrote:

> At 09:37 AM 12/8/04 -0500, Bob Ippolito wrote:
>> On Dec 8, 2004, at 8:35, Phillip J. Eby wrote:
>>> At 11:50 PM 12/7/04 -0500, Bob Ippolito wrote:
>>>> Also, if you allow extraction of C extension modules, you'll 
>>>> probably also have to allow extraction of dependent dlls and 
>>>> whatnot.. which is a real mess.  For dependent dynamic libraries on 
>>>> Darwin, this is a *real* mess, because the runtime linker is only 
>>>> affected by environment variables at process startup time.  I can 
>>>> only think of two solutions to this for Darwin:
>>>> (a) build an executable bundle and execve it on process startup, 
>>>> drop the dependent libraries inside that executable bundle
>>>> (b) have some drop location for dependent libraries and C 
>>>> extensions and rewrite the load commands in them before loading 
>>>> (which may fail if there isn't enough wiggle room in the header)
>>>
>>> With regard to 'b', I'm not quite sure I understand about the 
>>> rewriting load commands.  Are you saying that on Darwin, you have no 
>>> LD_LIBRARY_PATH?  Because, wouldn't it suffice for the application 
>>> to have that defined when it starts, and install the libraries on 
>>> that path?  What am I missing, here?
>>
>> Load commands (runtime dependencies between Mach-O files) have full 
>> paths embedded in them, not just names, so that is why header 
>> rewriting is useful.  If these load commands start with 
>> "@executable_path/", then the first place the library is looked for 
>> will be relative to the executable, which makes (a) possible and is 
>> what py2app already does when you start including dependencies.
>
> Okay, now I'm really confused.  How the heck does Python manage to 
> load dynamic stuff at all, if everything has to have absolute paths in 
> them?  Can you use load commands relative to the location of the 
> library itself?  And who designed this crazy thing?  ;)

No, you can't use load commands relative to a library, only relative to 
the process' main executable, or absolute paths.  I certainly didn't design it :)  I 
think that library-relative load commands would be terribly useful, and 
have filed feature requests.. but I'm not so sure it's going to be 
implemented by Mac OS X 10.4.

There are essentially three ways to reference an external symbol with 
dyld (assuming two-level namespaces):
(a) directly, by specifying that the symbol "foo" is going to be in the 
image from a particular load command, crash if the symbol or image is 
not found
(b) weakly, by specifying that the symbol "foo" is going to be in the 
image from a particular load command, set the symbol to NULL if not 
found
(c) indirectly, by specifying that the symbol "foo" is hopefully 
already defined in the process by something else, crash if not found

The best way to link Python extensions is to use (c), but that feature 
of dyld was not implemented until Mac OS X 10.3, and was not used by 
Python until 2.4.  I'll probably submit a patch to make it work like 
this for 2.3.5 if I find the time.
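
As a rough illustration of the three reference styles listed above, here 
is a toy model in Python.  It is only a sketch of the lookup rules as 
described in this message, not of dyld's real implementation; the names 
bind, SymbolNotFound and the "images" dict are all made up for the example.

```python
class SymbolNotFound(Exception):
    """Stands in for the crash that a failed lookup would cause."""


def bind(symbol, mode, images, image=None):
    """Toy model of the three reference styles (hypothetical helper).

    images maps the name of each loaded image to the set of symbols
    it defines.  image names the load command's target for the
    "direct" and "weak" modes and is ignored for "indirect".
    """
    if mode in ("direct", "weak"):
        # (a) and (b): the symbol is expected in one particular image.
        if symbol in images.get(image, set()):
            return (image, symbol)
        if mode == "weak":
            return None  # (b): a missing symbol simply becomes NULL
        raise SymbolNotFound("%s not found in %s" % (symbol, image))
    if mode == "indirect":
        # (c): anything already loaded in the process may supply it.
        for name, symbols in images.items():
            if symbol in symbols:
                return (name, symbol)
        raise SymbolNotFound("%s not defined in the process" % symbol)
    raise ValueError("unknown mode: %r" % (mode,))
```

With images = {"libfoo.dylib": {"foo"}, "python": {"PyErr_Clear"}}, an 
extension that refers to PyErr_Clear in "indirect" mode binds to whatever 
image already provides it, without naming that image or its path.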

I'm not sure if this will help you better understand, but the dyld_find 
function in this Python module clones the load command resolution 
algorithm of dyld:
http://svn.red-bean.com/bob/py2app/trunk/src/macholib/dyld.py
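
For readers who just want the gist, here is a much-reduced sketch of the 
"@executable_path/" rule described earlier in this thread.  It is not the 
dyld_find function from the module above: resolve_load_command is a 
hypothetical name, and the sketch deliberately ignores the environment 
variable overrides and any fallback search that dyld performs.

```python
import posixpath

EXECUTABLE_PATH = "@executable_path/"


def resolve_load_command(install_name, executable):
    """Return the path a load command refers to (simplified sketch).

    Only two cases are handled: names starting with
    "@executable_path/", which are taken relative to the directory
    of the main executable, and everything else, which is treated
    as an absolute path and returned unchanged.
    """
    if install_name.startswith(EXECUTABLE_PATH):
        relative = install_name[len(EXECUTABLE_PATH):]
        base = posixpath.dirname(executable)
        return posixpath.normpath(posixpath.join(base, relative))
    return install_name
```

For example, with a made-up bundle whose main executable is 
/Applications/My.app/Contents/MacOS/My, the load command 
"@executable_path/../Frameworks/libfoo.dylib" resolves to 
/Applications/My.app/Contents/Frameworks/libfoo.dylib.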

The main reason these absolute paths are there is that Darwin has 
namespaces for symbols.  You can load two different versions of the 
same thing just fine, so long as you are only looking up symbols 
directly from them.  There's also a feature called prebinding that 
depends on this.  Essentially, if it can confirm that the libraries have 
not changed, then it can do a lot of the symbol mapping stuff at "link 
time" to make executables start up faster.  I say link time in quotes 
because it can be updated (if you upgrade a dependent library, for 
example).

>>> IOW, if you have a directory set up on LD_LIBRARY_PATH or its 
>>> equivalent, can't you just dump the libraries and C extensions 
>>> there?
>>
>> Darwin has a pair of environment variables that are sort-of 
>> equivalent to LD_LIBRARY_PATH, however, their values are cached by 
>> the runtime linker (dyld) as soon as a process starts.
>
> I meant, couldn't a given application instance just say, "okay, this 
> is where I'm going to put my libraries", and have the environment 
> variable set before it starts?  That way, it could add new stuff to 
> that directory at runtime without needing to restart.
>
> I suppose if the path is relative to some executable, then you could 
> still do that at runtime.

How does an application say something about what it wants before it 
starts?  Do we expect every application developer to write a 
darwin-specific boot script?  Do we force a "boot" script (think 
something like Twisted's "twistd") for every platform so that things 
like this can be accommodated?

>> How about we include a manifest file that includes filename, size, 
>> and a hash of the file's contents, and has the author's public key in 
>> there somewhere at the top or bottom.  A second file, or a SMIME or 
>> PGP style wrapper around the manifest file, will contain the hash of 
>> the manifest file that is signed by the author's private key.
>
> I like this.  Specifically, I like the part that it's a separate and 
> optional file, so it doesn't hold up the base format definition.  We 
> just need to be able to define how metadata files like this get 
> included in the format, so that other metadata files (like a Chandler 
> Parcel schema, or a Zope ZCML file) would be includable also.  Then, 
> the bdist_plugin command would just package up those files, possibly 
> after optionally generating the signature manifest.
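
A minimal sketch of the unsigned part of such a manifest (filename, size, 
and a hash of each file's contents) might look like the following.  The 
line format, the choice of SHA-256, and the name build_manifest are all 
assumptions made for the example; nothing in this thread fixes them, and 
the signing step is left out entirely.

```python
import hashlib
import os


def build_manifest(root):
    """Return one 'path<TAB>size<TAB>sha256' line per file under root.

    Paths are relative to root and use '/' separators so the result
    is the same on every platform.  (Hypothetical format.)
    """
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            with open(full, "rb") as f:
                data = f.read()
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            digest = hashlib.sha256(data).hexdigest()
            lines.append("%s\t%d\t%s" % (rel, len(data), digest))
    return lines
```

The signature would then cover a hash of these lines, in a second file or 
a wrapper, as proposed in the quoted text.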

What about something like this:

myplugin.pyplugin
     metadata/
         MANIFEST-1.0
     share/
         mypackage.zcml
     purelib/
         mypackage/
             __init__.py
     platlib/
         os-and-python-version-specific-string/
             mypackage/
                 extmodule.so
     lib/
         os-and-python-version-specific-string/
             extmoduledependency.dylib

This should more or less allow for someone to create a "fat" plugin 
that has platform-specific dependencies, but includes them for multiple 
platforms.
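
To make the "os-and-python-version-specific-string" idea concrete, here 
is one way a loader might pick the right directories out of the layout 
above.  This is only a sketch: the tag format and the names platform_tag 
and search_paths are invented for the example, and the real string would 
need to be agreed on as part of the format.

```python
import posixpath
import sys
import sysconfig


def platform_tag():
    """Hypothetical tag: the sysconfig platform plus the Python version."""
    return "%s-py%d.%d" % (
        sysconfig.get_platform(),
        sys.version_info[0],
        sys.version_info[1],
    )


def search_paths(plugin_root, tag=None):
    """Return the directories inside the plugin that apply to one platform.

    purelib is shared by every platform; platlib and lib are looked
    up under the platform-specific tag.
    """
    if tag is None:
        tag = platform_tag()
    return {
        "purelib": posixpath.join(plugin_root, "purelib"),
        "platlib": posixpath.join(plugin_root, "platlib", tag),
        "lib": posixpath.join(plugin_root, "lib", tag),
    }
```

A "fat" plugin would simply carry one platlib/ and lib/ subdirectory per 
supported tag, and each interpreter would only ever look at its own.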

We should also say that the filenames in the zip file should be encoded as 
utf-8, so we can support unicode filenames.  The zip format itself has 
no standard for this, and there isn't even a de facto standard.
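
As a sketch of what UTF-8 names look like in practice: current versions 
of Python's standard zipfile module store non-ASCII names as UTF-8 and 
mark them with general-purpose flag bit 0x800, a convention added to the 
zip format after this was written.  The helper names and the example file 
names below are hypothetical.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: name is stored as UTF-8


def write_plugin(files):
    """Write {archive name: bytes} into an in-memory zip; return its bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def names_and_utf8_flags(zip_bytes):
    """Return (name, is_marked_utf8) for every entry in the archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return [
            (info.filename, bool(info.flag_bits & UTF8_FLAG))
            for info in zf.infolist()
        ]
```

Plain ASCII names such as purelib/mypackage/__init__.py are written 
without the flag, while a name like share/résumé.zcml is written as 
UTF-8 with the flag set and reads back unchanged.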

-bob