[Pythonmac-SIG] Re: Startup time

Jack Jansen Jack.Jansen@cwi.nl
Tue, 19 May 1998 14:13:52 +0200


Rob sent me some profiler output showing that an almost empty Python script 
spent about 30% of its runtime in find_module() (of which half was spent in 
PyMac_FindResourceModule). Since another 30% was spent in the yield code (and 
hence given to other applications), this means that roughly half the startup 
delay can be attributed to import module lookup. I'm sending this reply to the 
whole group; maybe someone else has some nifty ideas for improvement?

> Since I am bothering you I thought I would pass on some profiling info.
> This is crude but may help. I have attached 2 text files (tab delimited).
> They are sorted by time spent in a routine and time spent in a routine
> including all children.
> 
> I ran a test script that printed one line to the screen and quit.
> Find_module and Py_FindResourceModule seem to take the most time. Not sure
> what it means though.

There's little to be done about that (at least: little that I can think of, so 
ideas are welcome).

find_module has to loop over sys.path, trying all extensions to see if the 
file exists. This is costly on the Macintosh. I think there must be a faster 
way (similar to what "Find File" in the Finder uses, which is pretty quick), 
but I haven't had time to investigate. If you have ideas on this I'd 
definitely like to hear them. Import time is probably one of the most 
time-consuming tasks in Python: I have a frozen version of our multimedia 
editor (about 150 modules all together) that starts in about 10 seconds, while 
the non-frozen version takes about a minute on the same machine.
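
As a rough illustration of why this lookup is expensive, here is a minimal 
Python sketch of the search loop described above. It is not the interpreter's 
actual C code: the function name find_module_sketch and the suffix list are 
assumptions made for the example, and the real suffix list depends on the 
build. The point is only that each import costs up to (path entries x 
suffixes) file-existence checks.

```python
import os

# Hypothetical suffix list for illustration; the real list depends on the build.
SUFFIXES = (".py", ".pyc")


def find_module_sketch(name, path, suffixes=SUFFIXES):
    """Return (filename, probes): the first match and the number of
    file-existence checks it took, or (None, probes) if nothing matched."""
    probes = 0
    for directory in path:
        for suffix in suffixes:
            candidate = os.path.join(directory, name + suffix)
            probes += 1  # one filesystem lookup per (directory, suffix) pair
            if os.path.isfile(candidate):
                return candidate, probes
    return None, probes
```

A module that lives in the last sys.path entry, or that is not found at all, 
pays for a probe of every earlier directory with every suffix, which is why 
the cost grows with both the length of sys.path and the number of imports.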

PyMac_FindResourceModule is called once on every sys.path component for 
each import. It checks whether the sys.path component is a file instead of a 
folder and, if so, checks whether it has a PYC resource of the correct name. 
All its runtime is accumulated in the first few imports, though: if string 
interning is enabled (and it is, in MacPython) it will remember that a certain 
sys.path component is a folder and do a quick exit the next time. So from then 
on the overhead is only a call and a couple of pointer compares for each 
sys.path component.
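
The "remember that it is a folder" behaviour can be sketched in Python as 
follows. This is only an analogue under stated assumptions: the real routine 
is C code that relies on interned strings and pointer compares, whereas this 
sketch uses a set of path strings. The class name FolderCache, the slow_checks 
counter and the has_resource callable (a stand-in for the PYC resource lookup) 
are all hypothetical names invented for the example.

```python
import os


class FolderCache:
    """Remembers which sys.path components are folders (illustrative only)."""

    def __init__(self):
        self.known_folders = set()  # components already seen to be folders
        self.slow_checks = 0        # how often we had to ask the filesystem

    def find_resource_module(self, name, component,
                             has_resource=lambda path, name: False):
        # Quick exit: this component was already identified as a folder.
        if component in self.known_folders:
            return None
        self.slow_checks += 1
        if os.path.isdir(component):
            self.known_folders.add(component)
            return None
        # The component is a file: ask the (hypothetical) resource lookup
        # whether it holds a module of the requested name.
        if os.path.isfile(component) and has_resource(component, name):
            return component
        return None
```

With a cache like this, only the first import pays the filesystem check for 
each folder on sys.path; later imports fall through the quick exit.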

The fact that the first few calls to PyMac_FindResourceModule take more time 
has been especially noticeable since the new exception code in Python 1.5: 
the interpreter now has to do some imports early in initialization.
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@cwi.nl      | ++++ if you agree copy these lines to your sig ++++
http://www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm