Re: [Python-Dev] Minor compilation problem on HP-UX (1.6b1) (fwd)
+ #ifdef __hpux
+     mallopt (M_MXFAST, 512);
+ #endif /* __hpux */
+
After reading this I went off and actually _read_ the mallopt manpage for the first time in my life, and it seems there are quite a few parameters there we might want to experiment with. Besides M_MXFAST there are also M_GRAIN, M_BLKSIZ, M_MXCHK and M_FREEHD, which could have a significant impact on Python performance. I know that all the tweaks and tricks I did in the MacPython malloc implementation resulted in a speedup of 20% or more (including the cache-alignment code in dictobject.c).
--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm
Jack Jansen wrote:
+ #ifdef __hpux
+     mallopt (M_MXFAST, 512);
+ #endif /* __hpux */
+
After reading this I went off and actually _read_ the mallopt manpage for the first time in my life, and it seems there are quite a few parameters there we might want to experiment with. Besides M_MXFAST there are also M_GRAIN, M_BLKSIZ, M_MXCHK and M_FREEHD, which could have a significant impact on Python performance. I know that all the tweaks and tricks I did in the MacPython malloc implementation resulted in a speedup of 20% or more (including the cache-alignment code in dictobject.c).
To start with, try the optional object malloc I uploaded yesterday at SF. [Patch #101104]
Tweaking mallopt and getting a 20% speedup for some scripts is no surprise at all. For me <wink>. It is not portable though.
--
Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252
Don't worry, Vladimir, I hadn't forgotten your malloc stuff:-) It's just that if mallopt is available in the standard C library this may be a way to squeeze out a couple of extra percent of performance that the admin who installs Python needn't be aware of.
And I don't think your allocator can be dropped in to the standard distribution, because it has the potential problem of fragmenting the heap due to multiple malloc packages in one address space (at least, that was the problem when I last looked at it, which is admittedly more than a year ago).
And about mallopt not being portable: right, but I would assume that something like
#ifdef M_MXFAST
    mallopt(M_MXFAST, xxxx);
#endif
shouldn't do any harm if we set xxxx to be a size that will cause 80% or so of the python objects to fall into the M_MXFAST category (sizeof(PyObject)+sizeof(void *), maybe?). This doesn't sound platform-dependent...
Similarly, M_FREEHD sounds like it could speed up Python allocation, but this would need to be measured. Python allocation patterns shouldn't be influenced too much by platform, so again if this is good on one platform it is probably good on all.
--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm
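The guarded call proposed above can be written out as a small helper. This is only a sketch: the name try_set_mxfast is hypothetical, the list of platforms that get <malloc.h> is an assumption rather than a complete survey, and the value passed in is left to the caller. The helper hands back the raw mallopt() result without interpreting it, because the return convention is not the same in every C library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* mallopt() and the M_* constants usually live in <malloc.h>.
   The platform list below is an assumption, not an exhaustive one. */
#if defined(__hpux) || defined(__sun) || defined(__linux__)
#include <malloc.h>
#endif

/* Hypothetical helper.
   Returns 1 if mallopt(M_MXFAST, ...) was actually called,
   0 if this C library does not define M_MXFAST.
   The untouched mallopt() return value is stored in *raw_result;
   its meaning differs between C libraries, so it is not interpreted here. */
static int try_set_mxfast(int max_fast_size, int *raw_result)
{
    assert(max_fast_size >= 0);
#ifdef M_MXFAST
    int rc = mallopt(M_MXFAST, max_fast_size);
    if (raw_result != NULL)
        *raw_result = rc;
    return 1;
#else
    (void)max_fast_size;          /* nothing to tune on this platform */
    if (raw_result != NULL)
        *raw_result = 0;
    return 0;
#endif
}
```

A caller would invoke try_set_mxfast() once at startup, before the first allocation it wants affected. Whether a given size is accepted depends on the C library, so the raw result is worth logging during experiments.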
Jack Jansen wrote:
Don't worry, Vladimir, I hadn't forgotten your malloc stuff:-)
Me? worried about mallocs? :-)
if mallopt is available in the standard C library this may be a way to squeeze out a couple of extra percent of performance that the admin who installs Python needn't be aware of.
As long as you're maintaining a Mac-specific port of Python, you can do this on the Mac port without problems.
And I don't think your allocator can be dropped in to the standard distribution, because it has the potential problem of fragmenting the heap due to multiple malloc packages in one address space (at least, that was the problem when I last looked at it, which is admittedly more than a year ago).
Things have changed since then. Mainly on the Python side. Have a look again.
And about mallopt not being portable: right, but I would assume that something like
#ifdef M_MXFAST
    mallopt(M_MXFAST, xxxx);
#endif
shouldn't do any harm if we set xxxx to be a size that will cause 80% or so of the python objects to fall into the M_MXFAST category
Which is exactly what pymalloc does, except that this applies for > 95% of all allocations.
(sizeof(PyObject)+sizeof(void *), maybe?). This doesn't sound platform-dependent...
Indeed, I also use this trick to automatically tune the object allocator for 64-bit platforms. I haven't tested it on such machines, though, as I don't have access to them. But it should work.
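To make the sizeof-based idea concrete, here is a minimal sketch of deriving a small-object threshold from the pointer width at compile time. The struct demo_object_header is a stand-in invented for illustration (a reference count plus a type pointer); it is not CPython's real PyObject, and this is not the tuning code the object allocator itself uses.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for an object header: a reference count plus a type pointer.
   An assumption for illustration only, not CPython's actual PyObject. */
struct demo_object_header {
    long  refcount;
    void *type;
};

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Hypothetical threshold: the header plus one pointer-sized slot,
   rounded up to pointer alignment. It grows with the pointer width,
   so a 64-bit build gets a larger value than a 32-bit one. */
static size_t small_object_threshold(void)
{
    size_t raw = sizeof(struct demo_object_header) + sizeof(void *);
    return round_up(raw, sizeof(void *));
}
```

Because everything is expressed through sizeof, no per-platform constant has to be maintained by hand; whether the resulting value is a good cut-off is a separate question that only measurement can answer.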
Similarly, M_FREEHD sounds like it could speed up Python allocation, but this would need to be measured. Python allocation patterns shouldn't be influenced too much by platform, so again if this is good on one platform it is probably good on all.
I am against any guesses in this domain. Measurements and profiling evidence: that's it. Being able to make lazy decisions about Python's mallocs is our main advantage. Anything else is wild hype <0.3 wink>.
--
Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252
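In the spirit of measuring rather than guessing, a rough timing harness for small allocations might look like the sketch below. The function name time_small_allocs is hypothetical, and the allocate-everything-then-free-everything pattern is an assumption chosen for simplicity; it does not reproduce the allocation behavior of a real Python workload, so the numbers are only a first indication.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Allocate `count` blocks of `block_size` bytes, write one byte to each,
   free them all, and return the elapsed CPU time in seconds.
   Returns -1.0 if any allocation fails. */
static double time_small_allocs(size_t block_size, size_t count)
{
    assert(block_size > 0 && count > 0);

    void **blocks = malloc(count * sizeof *blocks);
    if (blocks == NULL)
        return -1.0;

    clock_t start = clock();
    size_t made = 0;
    for (; made < count; made++) {
        blocks[made] = malloc(block_size);
        if (blocks[made] == NULL)
            break;
        ((unsigned char *)blocks[made])[0] = 1;   /* touch the block */
    }
    for (size_t i = 0; i < made; i++)
        free(blocks[i]);
    clock_t end = clock();

    free(blocks);
    if (made < count)
        return -1.0;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

One way to use it is to run the same binary with and without a mallopt setting, for a few block sizes such as 16, 32 and 64 bytes, and compare the results over several runs before drawing any conclusion.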