Windows and PyObject_NEW

For MarkH, Guido and the Windows-experienced: I've been reading Jeffrey Richter's "Advanced Windows" last night in order to understand better why PyObject_NEW is implemented differently for Windows. Again, I feel uncomfortable with this, especially now that I'm dealing with the memory aspect of Python's object constructors/destructors. Some time ago, Guido elaborated on why PyObject_NEW uses malloc() on the user's side, before calling _PyObject_New (on Windows, cf. objimpl.h): [Guido]
While I agree with this, from reading chapters 5-9 of (a French copy of) the book (translated back to English here):

5. Win32 Memory Architecture
6. Exploring Virtual Memory
7. Using Virtual Memory in Your Applications
8. Memory Mapped Files
9. Heaps

I can't find anything radically Windows-specific about memory management. On Windows, as on the other OSes, the (virtual and physical) memory allocated for a process is common to the whole process and appears to be accessible from all DLLs involved in an executable. Things like page sharing, copy-on-write, private process memory, etc. are conceptually the same on Windows and Unix.

Now, the backwards binary compatibility argument aside (assuming that extensions get recompiled when a new Python version comes out), my concern is that with the introduction of PyObject_NEW *and* PyObject_DEL, there's no point in having separate implementations for Windows and Unix any more (or I'm really missing something and I fail to see what it is). User objects would be allocated *and* freed by the core DLL (at least the object headers). Even if several DLLs use different allocators, this shouldn't be a problem as long as what's obtained via PyObject_NEW is freed via PyObject_DEL. This Python memory would be allocated from the Python core DLL's regions/pages/heaps. And I believe that the memory allocated by the core DLL is accessible from the other DLLs of the process. (I haven't seen evidence to the contrary, but tell me if this is not true.)

I thought that maybe Windows malloc() uses different heaps for the different DLLs, but that's fine too, as long as the _NEW/_DEL symmetry is respected and all heaps are accessible from all DLLs (which seems to be the case...). But at the beginning of Chapter 9, Heaps, I read the following:

"""
...About Win32 heaps (compared to Win16 heaps)...

* There is only one kind of heap (it doesn't have any particular name, like "local" or "global" on Win16, because it's unique).

* Heaps are always local to a process. The contents of a process's heap are not accessible from the threads of another process. A large number of Win16 applications use the global heap as a way of sharing data between processes; this change in the Win32 heaps is often a source of problems when porting Win16 applications to Win32.

* One process can create several heaps in its address space and can manipulate them all.

* A DLL does not have its own heap. It uses the heaps that are part of the address space of the process. However, a DLL can create a heap in the address space of a process and reserve it for its own use. Since several 16-bit DLLs share data between processes by using the local heap of a DLL, this change is a source of problems when porting Win16 apps to Win32...
"""

This last paragraph confuses me. On one hand, it's stated that all heaps can be manipulated by the process; on the other hand, a DLL can reserve a heap for personal use within that process (implying the heap is read/write protected from the other DLLs?!?). The rest of the chapter does not explain how this "private reservation" is or can be done, so some of you would probably want to chime in and explain this to me.

Going back to PyObject_NEW: if it turns out that all heaps are accessible from all DLLs involved in the process, I would probably lobby for unifying the implementation of _PyObject_NEW/_New and _PyObject_DEL/_Del for Windows and Unix. Actually, on Windows object allocation does not depend on a central, core Python memory allocator.
Therefore, with the patches I'm working on, changing the core allocator would have a real effect only on platforms other than Windows.

Next, if it's possible to unify the implementation, it would also be possible to expose and make official in the C API a new function set: PyObject_New() and PyObject_Del() (without leading underscores). For now, due to the implementation difference on Windows, we're forced to use the macro versions PyObject_NEW/DEL.

So please tell me clearly what would go wrong on Windows if (a) and (b) and (c):

a) we have PyObject_New() and PyObject_Del();
b) their implementation is platform independent (no MS_COREDLL differences; we keep the non-Windows variant);
c) they're both used systematically for all object types.

--
Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252
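To put the difference being discussed into code form, here is a rough illustration of the two shapes of PyObject_NEW. This is a paraphrase, not the literal objimpl.h source; _PyObject_New and _PyObject_SIZE are real names from the header, while _PyObject_InitFromMemory is an invented name standing in for the core-side initialization step:

    /* Illustrative sketch only -- not the literal objimpl.h source.
     *
     * Non-Windows shape: the core both allocates and initializes the
     * object, so the memory comes from the core's allocator and must
     * also be released there (PyObject_DEL).
     */
    #define PyObject_NEW_unix(type, typeobj) \
        ((type *)_PyObject_New((PyTypeObject *)(typeobj)))

    /* Windows (MS_COREDLL) shape: malloc() is expanded in the *calling*
     * module, so the block comes from the extension's own CRT heap, and
     * the core only fills in the object header.  The matching free() is
     * then expected to run in that same extension, on the same heap.
     * (_PyObject_InitFromMemory is a made-up name for the core-side
     * initialization step.)
     */
    #define PyObject_NEW_win(type, typeobj) \
        ((type *)_PyObject_InitFromMemory((PyTypeObject *)(typeobj), \
            (PyObject *)malloc(_PyObject_SIZE(typeobj))))

And, roughly, what the unified, platform-independent pair proposed above would amount to (again a sketch of the idea, not existing source):

    /* Sketch of the proposed pair.  Both calls live in the Python core,
       so the _NEW/_DEL symmetry guarantees the block is returned to the
       same allocator it came from. */
    PyObject *
    PyObject_New(PyTypeObject *tp)
    {
        PyObject *op = (PyObject *)malloc(_PyObject_SIZE(tp));
        if (op == NULL)
            return PyErr_NoMemory();
        return PyObject_INIT(op, tp);   /* sets ob_type, refcount = 1 */
    }

    void
    PyObject_Del(PyObject *op)
    {
        free(op);
    }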

Vladimir Marangozov wrote:
This is true. Or, I should say, it all boils down to HeapAlloc(heap, flags, bytes), and malloc is going to use the _crtheap.

At any time you can create a new heap handle with HeapCreate(options, initsize, maxsize). Nothing special about the "DLL" context here. On Win9x, only someone who knows about the handle can manipulate the heap. (On NT, you can enumerate the handles in the process.)

I doubt very much that you would break anybody's code by removing the Windows-specific behavior. But it seems to me that unless Python always uses the default malloc, those of us who write C++ extensions will have to override operator new? I'm not sure. I've used placement new to allocate objects in a memory-mapped file, but I've never tried to muck with the global memory policy of a C++ program.

- Gordon
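To ground the Win32 calls Gordon mentions, here is a minimal, self-contained sketch of creating a private heap and allocating from it; the sizes are arbitrary and the program only shows the shape of the API:

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        /* Create a private, growable heap (initial size 4 KB, no maximum). */
        HANDLE heap = HeapCreate(0, 4096, 0);
        if (heap == NULL)
            return 1;

        /* Allocate a zero-filled block from that heap -- only code that
           holds this handle will ever touch it. */
        char *buf = (char *)HeapAlloc(heap, HEAP_ZERO_MEMORY, 256);
        if (buf != NULL) {
            printf("allocated 256 bytes at %p\n", (void *)buf);
            HeapFree(heap, 0, buf);
        }

        /* Destroying the heap releases everything allocated from it. */
        HeapDestroy(heap);
        return 0;
    }

The CRT's malloc() follows the same path, just against the handle it created for itself at startup (the _crtheap Gordon refers to), which is why two modules linked against different CRTs end up allocating from different heaps.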

On Sat, 25 Mar 2000, Gordon McMillan wrote:
Actually, the big problem arises when you have debug vs. non-debug DLLs: malloc() uses a different heap depending on the debug setting. As a result, it is a bad idea to malloc() a block in a debug DLL and free() it in a non-debug DLL. If the allocation pattern is fixed, then things may be okay. IF.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
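A small, hypothetical sketch of the discipline that avoids the problem Greg describes: the module that allocates a block also exposes the call that frees it, so the pair always runs against the same CRT heap no matter which CRT (debug or release) each module was linked with. The mylib names are made up for illustration:

    /* Hypothetical sketch: mylib.dll both allocates and frees its own
       blocks, so callers never pass mylib pointers to their own free(). */
    #include <stdlib.h>

    __declspec(dllexport) char *
    mylib_alloc_buffer(size_t n)
    {
        return (char *)malloc(n);   /* runs against mylib's CRT heap */
    }

    __declspec(dllexport) void
    mylib_free_buffer(char *p)
    {
        free(p);                    /* same CRT heap, debug or release */
    }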

Sorry for the delay, but Gordon's reply was accurate so should have kept you going ;-)
So that is where the heaps discussion came from :-) The problem is simply "too many heaps are available".
That is the exact reason it was added in the first place. I believe this code predates the "_d" convention on Windows. AFAIK, the code could be removed today and everything should work (but see below why it probably won't be).

MSVC allows you to choose from a number of CRT versions. Only in one of these versions is the CRTL completely shared between the .EXE and all the various .DLLs in the application. What was happening is that this macro ended up causing the malloc() for a new object to occur in Python15.dll, but the Python type system meant that tp_dealloc() (to clean up the object) was called in the DLL implementing the new type. Unless Python15.dll and our extension DLL shared the same CRTL (and hence the same malloc heap, fileno table, etc.), things would die: the DLL's version of free() would complain, as it had never seen the pointer before. This change meant the malloc() and the free() were both implemented in the same DLL/EXE.

This was particularly true with debug builds. MSVC's debug CRTL implementations have some very nice debugging features (guard blocks, block validity checks with debugger breakpoints when things go wrong, leak tracking, etc.). However, this means they use yet another heap. Mixing debug builds with release builds in Python is a recipe for disaster.

Theoretically, the problem has largely gone away now that a) we have separate "_d" versions and b) the "official" position is to use the same CRTL as Python15.dll. However, it is still a minor FAQ on comp.lang.python why PyRun_ExecFile (or whatever) fails with mysterious errors. The reason is exactly the same: they are using a different CRTL, so the CRTL can't map the file pointers correctly, and we get unexplained I/O errors. But now that this macro hides the malloc problem, there may be plenty of "home grown" extensions out there that do use a different CRTL and don't see any problems -- mainly because they aren't throwing file handles around!

Finally getting to the point of all this: we now also have the PyMem_* functions. The problem also doesn't exist if extension modules use these functions instead of malloc()/free(). We only ask them to change the PyObject allocations and deallocations, not the rest of their code, so it is no real burden. IMO, we should adopt these functions for most internal object allocations and in the extension samples/docs.

Also, we should consider adding relevant PyFile_fopen(), PyFile_fclose() type functions that are simply a thin layer over the fopen/fclose functions. If extension writers used these instead of fopen/fclose we would gain a few fairly intangible things -- lose the minor FAQ, platforms that don't have fopen at all (e.g. CE) would love you, etc.

Mark.
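A minimal sketch of the pattern Mark is asking extension writers to follow; the type name and fields are made up for illustration, while PyObject_NEW/PyObject_DEL and PyMem_NEW/PyMem_DEL are the era's real macros:

    /* Sketch only: an object that owns an auxiliary buffer.  All memory
       is obtained and released through the Python-provided macros, so
       every allocation/free pair resolves in the same place and CRT-heap
       mismatches cannot occur. */
    #include "Python.h"

    typedef struct {
        PyObject_HEAD
        double *data;   /* auxiliary storage owned by the object */
        int len;
    } ExampleObject;

    static PyTypeObject Example_Type;   /* slots omitted for brevity */

    static PyObject *
    Example_new(int len)
    {
        ExampleObject *self = PyObject_NEW(ExampleObject, &Example_Type);
        if (self == NULL)
            return NULL;
        self->data = PyMem_NEW(double, len);   /* not malloc() */
        if (self->data == NULL) {
            PyObject_DEL(self);
            return PyErr_NoMemory();
        }
        self->len = len;
        return (PyObject *)self;
    }

    static void
    Example_dealloc(ExampleObject *self)
    {
        PyMem_DEL(self->data);   /* not free() */
        PyObject_DEL(self);      /* not free() */
    }

And a sketch of the thin file wrappers Mark proposes; these did not exist in the core at the time, and the names come straight from his suggestion:

    /* Hypothetical thin wrappers, per Mark's suggestion.  Their only job
       is to ensure the FILE* is created and closed by the CRT that the
       Python core itself is linked against. */
    #include <stdio.h>

    FILE *
    PyFile_fopen(const char *name, const char *mode)
    {
        return fopen(name, mode);
    }

    int
    PyFile_fclose(FILE *fp)
    {
        return fclose(fp);
    }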

participants (4):
- Gordon McMillan
- Greg Stein
- Mark Hammond
- Vladimir.Marangozov@inrialpes.fr