Upgrading Python Breaks Extensions; Fix proposal

Ignacio Vazquez-Abrams ignacio at openservices.net
Wed Aug 29 12:31:24 EDT 2001


On Wed, 29 Aug 2001, Neil Schemenauer wrote:

> Skip Montanaro wrote:
> >
> >     John> Under Unix, each program is its own address space, and .so modules
> >     John> are private resources once they're loaded.
> >
> > I might be reading this wrong, and I'm definitely taking you out of context,
> > but the whole idea of .so files is that their text sections *are* shared
> > among processes.  Their data sections will be process-private.
>
> That's an optimization done at the OS level however.  The process does
> not see it.  I think (perhaps incorrectly) that John is saying that the
> sharing is done in a less transparent fashion on Windows.  Can someone
> confirm?
>
> I don't know how this relates to the original question however.
>
>   Neil

The way Windows does it seems to be exactly the same (I have more experience
with Windows binding than Unix binding). Either you have a .lib that tells
you exactly what ordinal a given function has (a la the .so searching done by
ld, only a LOT less transparent in C under Windows, because you need the
.lib), or you can use GetProcAddress() the same way you use dlsym().
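
For comparison, here is a minimal sketch of the Unix half of that lookup,
assuming a POSIX system that provides dlopen() and dlsym(). The helper name
call_strlen_by_name() is made up for illustration. On Windows the equivalent
calls would be LoadLibrary() and GetProcAddress().

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <string.h>

typedef size_t (*strlen_fn)(const char *);

/* Hypothetical helper: find "strlen" by name at run time and call it. */
static size_t call_strlen_by_name(const char *text)
{
    /* Passing NULL asks for a handle to the running program together
       with the shared libraries it already has loaded (libc included). */
    void *handle = dlopen(NULL, RTLD_NOW);
    assert(handle != NULL);

    void *sym = dlsym(handle, "strlen");
    assert(sym != NULL);

    /* POSIX requires that the pointer returned by dlsym() be usable as
       a function pointer; copying the bits avoids a cast warning. */
    strlen_fn fn;
    memcpy(&fn, &sym, sizeof fn);

    size_t n = fn(text);
    dlclose(handle);
    return n;
}
```

Calling call_strlen_by_name("hello") goes through the looked-up pointer
rather than a symbol bound at link time. Older glibc versions need -ldl on
the link line.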

The original question was about getting rid of Windows' pythonXY.dll versioning
incompatibilities in C extensions. The way Unix does it is to force the
application with the interpreter to link with
/usr/lib/pythonX.Y/libpythonX.Y.a and then make global all symbols within it.
I had suggested doing something similar for Windows. I got a response
from John Roth <johnroth at ameritech.net> which I wasn't fully happy with:

"The design goals for Unix and Windows were totally different. Under Unix,
each program is its own address space, and .so modules are private resources
once they're loaded. Under Windows, they are shared resources - the same
physical memory copy can be used simultaneously by multiple users, and appears
in multiple address spaces. Consequently, it's impossible for them to link
back. Also, much of the thinking that went into the mechanisms came from
CORBA, where they wanted to run an application on one system, and link to a
module on another system."

The way DLLs work is that the GetProcAddress() function sets up private
thunks to a shared code segment (CS) for the DLL, and when an application
wants to call a function in a DLL, it ends up transparently calling the thunk
instead.

Now a "thunk" is a small bit of code that serves two purposes:

1) load the .text segment if it is not in memory (for example, if it has been
     paged out), and
2) set DS (the data segment) to a private value based on the calling code.
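
As a toy model of those two purposes, and not of what the Windows loader
really does, the thunk can be pictured as a wrapper in plain C. Every name
here (data_segment, current_ds, text_loaded, thunk_bump_counter) is invented
for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for one application's private data segment. */
struct data_segment { int call_count; };

static struct data_segment *current_ds = NULL; /* models the DS register */
static int text_loaded = 0;                    /* models "code is in memory" */

/* The "real" DLL function: it only ever looks at current_ds. */
static int dll_bump_counter(void)
{
    assert(current_ds != NULL);
    return ++current_ds->call_count;
}

/* The thunk: this is what the application actually calls. */
static int thunk_bump_counter(struct data_segment *caller_ds)
{
    if (!text_loaded) {
        text_loaded = 1;        /* purpose 1: pretend to page the code in */
    }
    current_ds = caller_ds;     /* purpose 2: select the caller's data */
    return dll_bump_counter();
}
```

Two callers that pass different data_segment objects run the same
dll_bump_counter() code but each see their own call_count.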

Now here's where it gets interesting. The DS for a DLL is copy-on-write, so if
a change is made in it, an application gets its own private copy per DLL.
Incidentally, global variables in C are stored in DS.
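
The same private-copy-on-write effect can be observed on the Unix side with
fork(). This is only an analogue, assuming a POSIX system, and says nothing
about how the Windows loader is implemented. The function and variable names
are made up.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int global_in_data_segment = 1;  /* a C global, stored in the data segment */

/* Returns the parent's view of the global after a child process has
   written to its own copy, or -1 if fork()/waitpid() fails. */
static int parent_value_after_child_write(void)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Child: the write lands in a private copy of the data. */
        global_in_data_segment = 99;
        _exit(global_in_data_segment == 99 ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    assert(WIFEXITED(status) && WEXITSTATUS(status) == 0);

    /* Parent: its copy was never touched by the child's write. */
    return global_in_data_segment;
}
```

The child sees 99 while the parent still sees 1, which is the per-process
behaviour described above for globals in a DLL.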

SO...

I propose that we make the following changes to how Windows C extensions are
built:

1) make some sort of 'thunking.h' or whatnot that contains global function
     pointers for 'PyMem_Malloc', 'PyMem_Free', etc., and
2) make a pythonDLLX.Y.lib that contains a DllMain() that thunks the functions
     from the application (which has previously linked python.X.Y.lib) and
     sets the global variables in thunking.h.
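
A rough sketch of what the two pieces might look like, written as portable C
so it can be tried anywhere. Everything here is an assumption made for
illustration: the struct host_api, bind_thunks() and the thunk_ prefix are
invented names, malloc() and free() stand in for the interpreter's real
PyMem_Malloc and PyMem_Free, and how the table reaches the extension from
DllMain() is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* --- what a hypothetical thunking.h might declare --- */
typedef void *(*mem_malloc_fn)(size_t);
typedef void  (*mem_free_fn)(void *);

struct host_api {               /* filled in by the application */
    mem_malloc_fn mem_malloc;
    mem_free_fn   mem_free;
};

/* Global function pointers the extension calls through. */
static mem_malloc_fn thunk_PyMem_Malloc = NULL;
static mem_free_fn   thunk_PyMem_Free   = NULL;

/* --- what the proposed DllMain() would do at load time --- */
static int bind_thunks(const struct host_api *host)
{
    if (host == NULL || host->mem_malloc == NULL || host->mem_free == NULL) {
        return 0;               /* refuse a partially filled table */
    }
    thunk_PyMem_Malloc = host->mem_malloc;
    thunk_PyMem_Free   = host->mem_free;
    return 1;
}

/* --- extension code: it never names the interpreter DLL directly --- */
static int extension_roundtrip(void)
{
    assert(thunk_PyMem_Malloc != NULL && thunk_PyMem_Free != NULL);
    int *p = thunk_PyMem_Malloc(sizeof *p);
    if (p == NULL) {
        return -1;
    }
    *p = 42;
    int value = *p;
    thunk_PyMem_Free(p);
    return value;
}
```

The point of the indirection is that the extension's object code contains no
reference to a specific pythonXY.dll; whichever application loads it supplies
the functions.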

That way the behaviour of the Windows C extensions matches that of the Unix C
extensions and pythonXY.dll can be done away with.

-- 
Ignacio Vazquez-Abrams  <ignacio at openservices.net>








