[Python-Dev] urllib2 EP + decr. startup time

Phillip J. Eby pje at telecommunity.com
Fri Feb 16 17:57:18 CET 2007


At 04:38 PM 2/16/2007 +0200, KoDer wrote:
>    The 'strace' command shows the following: the interpreter spends
>most of its startup time trying to find imported modules.
>    Most of those lookups end in a 'not found' error, because
>sys.path is so long.
>    This time will increase in the future, since setuptools adds many
>dirs to the search path
>    using .pth files (to manage installed modules and eggs).

Actually, under normal circumstances, most installed eggs are .zip files, 
whose indexes the interpreter already caches.  Eggs installed as 
directories should be increasing in rarity, except for in-development 
packages installed via the "develop" command.  Also, I plan to make 
setuptools 0.7's "nest" packaging tool support managing packages the "old" 
way, i.e., as a single flat directory structure, using manifest files to 
support uninstallation and the like.  So it should not really be the case 
that this will keep increasing over time.

Also, are you aware that putting a zipped version of the standard library 
on sys.path already speeds up startup considerably?  Python since 2.3 
automatically includes an appropriate entry in sys.path:

Python 2.3.4 (#53, May 25 2004, 21:17:02) [MSC v.1200 32 bit (Intel)] on win32
 >>> import sys
 >>> sys.path
['', 'C:\\WINDOWS\\system32\\python23.zip', '', 'c:\\Python23\\DLLs',
'c:\\Python23\\lib', 'c:\\Python23\\lib\\plat-win', 'c:\\Python23\\lib\\lib-tk',
'c:\\Python23']

Creating the zip file that's already listed in the default sys.path will 
allow most startup imports to be handled without a lot of path checking.
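
As a rough illustration of importing from a zip file on sys.path, here is a minimal sketch for a current Python 3 interpreter (an assumption; the session above is 2.3).  The archive name demo_modules.zip, the module hello_zipped, and all three helper functions are hypothetical and exist only for this example.

```python
import importlib
import os
import sys
import zipfile


def build_demo_zip(directory):
    """Create a zip archive containing one tiny module; return its path."""
    zip_path = os.path.join(directory, "demo_modules.zip")  # hypothetical name
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr(
            "hello_zipped.py",  # hypothetical module, for illustration only
            "def greet():\n    return 'hello from a zip'\n",
        )
    return zip_path


def import_from_zip(zip_path, module_name):
    """Put the archive on sys.path and import a module stored inside it."""
    if zip_path not in sys.path:
        sys.path.insert(0, zip_path)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


def zip_entries(path):
    """Return (entry, exists) pairs for the .zip entries of a search path."""
    return [(entry, os.path.isfile(entry))
            for entry in path if entry.lower().endswith(".zip")]
```

Calling zip_entries(sys.path) should show whether the zip file named in the default path has actually been created on a given installation.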


>    I propose adding something like the .so caching that modern
>*nix systems use to load
>    shared libraries.
>
>    a) Add a --build-modules-index option to the Python interpreter.
>    When Python is given this option, it scans all dirs in the path and
>builds a dictionary {module_name: module_path}.
>    The dict will be saved in an external file (storing only the top
>dir for packages and the path for one-file modules).
>    The file also records the mtime of all .pth files and dirs on the
>path, along with the path variable itself.

Unless you mean something more abstract by "dirs" than just filesystem 
directories, this isn't going to help eggs or other zipped modules any, 
compared to how they are now.
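
To make the point concrete, here is a minimal sketch of what such an index builder might look like if "dirs" means plain filesystem directories.  It is not an existing interpreter feature: build_module_index is a hypothetical name, and the sketch only recognizes .py files and packages containing __init__.py (a deliberate simplification that ignores extension modules and .pyc files).

```python
import os


def build_module_index(search_path):
    """Map top-level module names to paths, scanning directories only.

    Returns (index, mtimes): index maps module_name -> module_path, and
    mtimes records each scanned directory's modification time so that a
    saved index could later be checked for staleness.
    """
    index = {}
    mtimes = {}
    for entry in search_path:
        if not os.path.isdir(entry):
            # Zip files (such as zipped eggs) and missing entries are not
            # directories, so this kind of index never sees them.
            continue
        mtimes[entry] = os.stat(entry).st_mtime
        for name in sorted(os.listdir(entry)):
            full = os.path.join(entry, name)
            if os.path.isdir(full):
                # Packages: store only the top directory.
                if os.path.isfile(os.path.join(full, "__init__.py")):
                    index.setdefault(name, full)
            else:
                base, ext = os.path.splitext(name)
                if ext == ".py":
                    # One-file modules: store the file path.  setdefault
                    # keeps the first match, as earlier path entries win.
                    index.setdefault(base, full)
    return index, mtimes
```

Any zipped egg on the path is skipped by the isdir() check, so lookups for modules inside it would go through the existing zip import machinery exactly as before.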


