What is the reason for _sre.pyd being a separate DLL? On Unix, it is incorporated into the executable by default; regular expressions are central for Python and cannot be omitted. Would anybody object if I change the Windows build process so that it stops having _sre as a separate target? Regards, Martin
What is the reason for _sre.pyd being a separate DLL? On Unix, it is incorporated into the executable by default; regular expressions are central for Python and cannot be omitted.
Would anybody object if I change the Windows build process so that it stops having _sre as a separate target?
Let me turn this around. What advantage do you see to linking it statically? --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum <guido@python.org> writes:
Let me turn this around. What advantage do you see to linking it statically?
The trigger was that it would have simplified the build for me: When converting VC++6 projects to VC.NET, VC.NET forgets to convert the /export: linker options, which means that you had to add them all manually. Mark has fixed this problem differently, by removing the need for /export:. Integrating _sre (and _socket, select, winreg, mmap, perhaps others) into python.dll still simplifies the build process: you don't have to right-click that many subprojects to build them. In addition, it should decrease startup time: Python won't need to locate that many files anymore. It also decreases the total size of the binary distribution slightly. Regards, Martin
Let me turn this around. What advantage do you see to linking it statically?
The trigger was that it would have simplified the build for me: When converting VC++6 projects to VC.NET, VC.NET forgets to convert the /export: linker options, which means that you had to add them all manually. Mark has fixed this problem differently, by removing the need for /export:.
Integrating _sre (and _socket, select, winreg, mmap, perhaps others) into python.dll still simplifies the build process: you don't have to right-click that many subprojects to build them.
I never have to do that; the dependencies in the project file make sure that the extensions are all built when you build the 'python' project.
In addition, it should decrease startup time: Python won't need to locate that many files anymore.
It also decreases the total size of the binary distribution slightly.
Maybe _sre is used by most apps (though I doubt even that). But _socket, select, winreg, mmap and the others are definitely not. On Unix, all extensions are built as shared libraries, except the ones that are needed by setup.py to be able to build extensions; it looks like only posix, errno, _sre and symtable are built statically. I'd say that making more extensions static on Windows would increase start time of modules that don't use those extensions. I'm -0 on doing this for _sre (I think it's a YAGNI); I'm -1 on doing this for other extensions. --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum <guido@python.org> writes:
I never have to do that; the dependencies in the project file make sure that the extensions are all built when you build the 'python' project.
Are you sure? If the python target is up-to-date (i.e. nothing has to be done for python_d.exe), and I delete all generated _sre files (i.e. sre_d.pyd, and the object files), and then ask VC++ 6 to build the python target, nothing is done. Indeed, I cannot find any place where it says that the python target is related to _sre. I can only see dependencies with pythoncore. Can you (or any other regular pcbuild.dsp user) please guess what I'm doing wrong?
Maybe _sre is used by most apps (though I doubt even that). But _socket, select, winreg, mmap and the others are definitely not. On Unix, all extensions are built as shared libraries, except the ones that are needed by setup.py to be able to build extensions; it looks like only posix, errno, _sre and symtable are built statically.
I do believe that is a mistake, as it will increase startup time of applications that need them; applications that don't need them would not be hurt if they were in the python binary.
I'd say that making more extensions static on Windows would increase start time of modules that don't use those extensions.
I guess I have to measure these things. Regards, Martin
I never have to do that; the dependencies in the project file make sure that the extensions are all built when you build the 'python' project.
Are you sure? If the python target is up-to-date (i.e. nothing has to be done for python_d.exe), and I delete all generated _sre files (i.e. sre_d.pyd, and the object files), and then ask VC++ 6 to build the python target, nothing is done.
Indeed, I cannot find any place where it says that the python target is related to _sre. I can only see dependencies with pythoncore.
Can you (or any other regular pcbuild.dsp user) please guess what I'm doing wrong?
I have no idea. It's all magic for me. But I never delete targets manually.
Maybe _sre is used by most apps (though I doubt even that). But _socket, select, winreg, mmap and the others are definitely not. On Unix, all extensions are built as shared libraries, except the ones that are needed by setup.py to be able to build extensions; it looks like only posix, errno, _sre and symtable are built statically.
I do believe that is a mistake, as it will increase startup time of applications that need them; applications that don't need them would not be hurt if they were in the python binary.
But is the startup time of apps that use a lot of stuff the most important thing? I'd say that the startup time of apps that *don't* use a lot of stuff is more important. I'm not sure that making the binary bigger doesn't slow it down.
I'd say that making more extensions static on Windows would increase start time of modules that don't use those extensions.
I guess I have to measure these things.
Yes, please. We switched to building almost all extensions as shared libs when we switched away from Modules/Setup to setup.py. --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum <guido@python.org> writes:
But is the startup time of apps that use a lot of stuff the most important thing? I'd say that the startup time of apps that *don't* use a lot of stuff is more important. I'm not sure that making the binary bigger doesn't slow it down.
I'm pretty sure that it doesn't. On Unix, the system performs a copy-on-write mmap of the executable. No disk access is done until page faults trigger a disk read. I believe Windows uses a similar mechanism. The size of the executable is irrelevant (if you have no relocations); only the part of the executable that is used matters. On the other hand, on my Linux installation, importing a module costs 35 system calls if the module is not found, and no PYTHONPATH is set; every directory in PYTHONPATH adds four additional system calls.
Yes, please. We switched to building almost all extensions as shared libs when we switched away from Modules/Setup to setup.py.
For modules that require configuration, this was a good thing - now setup.py will autoconfigure them. For modules that require no additional libraries, I hope that this decision will be reverted some day. Regards, Martin
But is the startup time of apps that use a lot of stuff the most important thing? I'd say that the startup time of apps that *don't* use a lot of stuff is more important. I'm not sure that making the binary bigger doesn't slow it down.
I'm pretty sure that it doesn't. On Unix, the system performs a copy-on-write mmap of the executable. No disk access is done until page faults trigger a disk read. I believe Windows uses a similar mechanism. The size of the executable is irrelevant (if you have no relocations); only the part of the executable that is used matters.
On the other hand, on my Linux installation, importing a module costs 35 system calls if the module is not found, and no PYTHONPATH is set; every directory in PYTHONPATH adds four additional system calls.
Yes, please. We switched to building almost all extensions as shared libs when we switched away from Modules/Setup to setup.py.
For modules that require configuration, this was a good thing - now setup.py will autoconfigure them. For modules that require no additional libraries, I hope that this decision will be reverted some day.
If other people feel the same way, I won't stop progress here. But I find startup time a rather uninteresting detail, and everything else being the same I would personally prefer to keep the status quo: not because it's better, but because it's the status quo. Why churn? --Guido van Rossum (home page: http://www.python.org/~guido/)
[Neil Schemenauer]
A lot of people care about startup time. I would like to see a few more modules included statically.
If the real goal is to reduce startup time, we should analyze where startup time is being spent; random thrashing "in that general direction" won't satisfy in the end.
On 8 Aug 2002 at 17:51, Tim Peters wrote:
If the real goal is to reduce startup time, we should analyze where startup time is being spent; random thrashing "in that general direction" won't satisfy in the end.
In the 1.5.2 timeframe, most *startup* time was spent figuring out where to root sys.path (looking for the sentinel, deciding if this is a developer build, etc.). In crude experiments on my Linux box, I got rid of a few hundred system calls just by removing most of the intelligence from the getpath stuff. Then there are the things you can do with import (archives, careful crafting of sys.path), but that's harder to do, especially in a way that will satisfy most people / most scripts. So the lowest hanging fruit, I think, is to find some way of telling Python "don't be clever - just start here", and have it fallback to current behavior in the absence of that info. -- Gordon http://www.mcmillan-inc.com/
"Gordo" == Gordon McMillan <gmcm@hypernet.com> writes:
Gordo> In the 1.5.2 timeframe, most *startup* time was
Gordo> spent figuring out where to root sys.path (looking
Gordo> for the sentinel, deciding if this is a developer
Gordo> build, etc.). In crude experiments on my Linux
Gordo> box, I got rid of a few hundred system calls
Gordo> just by removing most of the intelligence from
Gordo> the getpath stuff.

I remember doing some similar testing probably around the Python 2.0 timeframe and found a huge speed up by avoiding the import of site.py for largely the same reasons (avoiding tons of stat calls). It's not always practical to avoid loading site.py, but if you can, you can get a big startup win.

Gordo> So the lowest hanging fruit, I think, is to find some
Gordo> way of telling Python "don't be clever - just start
Gordo> here", and have it fallback to current behavior in
Gordo> the absence of that info.

That's what $PYTHONHOME is supposed to do. It's been a while since I dug around in getpath.c, but setting $PYTHONHOME should set prefix and exec_prefix unconditionally, even in the build directory. (The comments in the file are a bit misleading. Step 1 could be read as implying that $PYTHONHOME isn't consulted when looking for build directory landmarks, but that's not the case: even for a build dir search, $PYTHONHOME is trusted unconditionally.)

-Barry
I remember doing some similar testing probably around the Python 2.0 timeframe and found a huge speed up by avoiding the import of site.py for largely the same reasons (avoiding tons of stat calls). It's not always practical to avoid loading site.py, but if you can, you can get a big startup win.
It's also easy: "python -S" avoids loading site.py. --Guido van Rossum (home page: http://www.python.org/~guido/)
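A rough way to check how much site.py costs on a given machine is to time interpreter startup with and without -S. The following is only a minimal sketch, not something posted in this thread: the helper name startup_seconds is made up for illustration, it uses the subprocess module and time.perf_counter from present-day Python, and it measures whole-process wall-clock time, so the numbers include process creation overhead.

```python
import subprocess
import sys
import time


def startup_seconds(extra_args=(), runs=5):
    """Return the best wall-clock time (seconds) to start Python and run `pass`.

    `startup_seconds` is a hypothetical helper; `extra_args` are interpreter
    options such as ["-S"].
    """
    cmd = [sys.executable, *extra_args, "-c", "pass"]
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        # Keep the fastest run to reduce noise from disk caching and scheduling.
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    print("with site.py: %.4f s" % startup_seconds())
    print("with -S:      %.4f s" % startup_seconds(["-S"]))
```

Taking the minimum over several runs is a deliberate choice: the first run often pays for a cold file cache, which is a different effect from the cost of the import itself.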
"GvR" == Guido van Rossum <guido@python.org> writes:
>> I remember doing some similar testing probably around the
>> Python 2.0 timeframe and found a huge speed up by avoiding the
>> import of site.py for largely the same reasons (avoiding tons
>> of stat calls). It's not always practical to avoid loading
>> site.py, but if you can, you can get a big startup win.

GvR> It's also easy: "python -S" avoids loading site.py.

Yes. The one gotcha is that site-packages is put on sys.path via site.py so using -S means you lose that directory. You can, of course, reinstall it explicitly by something like:

    import os
    import sys
    sitedir = os.path.join(sys.prefix, 'lib', 'python'+sys.version[:3],
                           'site-packages')
    sys.path.append(sitedir)

-Barry
Martin v. Loewis wrote:
On the other hand, on my Linux installation, importing a module costs 35 system calls if the module is not found, and no PYTHONPATH is set; every directory in PYTHONPATH adds four additional system calls.
Why not address this problem instead?

Note that mxCGIPython can help a lot here: it freezes most of the Python std lib into the executable, making imports go really fast (and that's needed if you're doing a lot of CGI scripting).

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/
"M.-A. Lemburg" <mal@lemburg.com> writes:
On the other hand, on my Linux installation, importing a module costs 35 system calls if the module is not found, and no PYTHONPATH is set; every directory in PYTHONPATH adds four additional system calls.
Why not address this problem instead ?
I'm trying to: every module incorporated in config.c won't be searched in PYTHONPATH.
Note that mxCGIPython can help a lot here: it freezes most of the Python std lib into the executable, making imports go really fast (and that's needed if you're doing a lot of CGI scripting).
Indeed, freezing also helps - but is probably only suitable for special-purpose applications. I think people would be surprised if they are told that editing the source of a library module won't have any effect. Regards, Martin
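Which modules are compiled into the interpreter (the ones listed in config.c, and so never searched for on the path) can be inspected from Python through sys.builtin_module_names. This is a small sketch for illustration only: the helper name is_compiled_in is made up, and the result depends entirely on how the particular interpreter was built.

```python
import sys


def is_compiled_in(name):
    """Return True if `name` is linked into the interpreter binary itself.

    `is_compiled_in` is a hypothetical helper; it only consults
    sys.builtin_module_names and does not import anything.
    """
    return name in sys.builtin_module_names


if __name__ == "__main__":
    # The module names below are the ones discussed in this thread.
    for name in ("_sre", "_socket", "select", "mmap"):
        status = "built in" if is_compiled_in(name) else "separate file or absent"
        print("%-8s %s" % (name, status))
```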
Martin v. Loewis wrote:
"M.-A. Lemburg" <mal@lemburg.com> writes:
On the other hand, on my Linux installation, importing a module costs 35 system calls if the module is not found, and no PYTHONPATH is set; every directory in PYTHONPATH adds four additional system calls.
Why not address this problem instead ?
I'm trying to: every module incorporated in config.c won't be searched in PYTHONPATH.
Note that mxCGIPython can help a lot here: it freezes most of the Python std lib into the executable, making imports go really fast (and that's needed if you're doing a lot of CGI scripting).
Indeed, freezing also helps - but is probably only suitable for special-purpose applications. I think people would be surprised if they are told that editing the source of a library module won't have any effect.
They shouldn't edit those anyway :-)

What ever happened to the ZIP archive import that James C. Ahlstrom was working on (I think it was him)? If startup time for the std lib is considered a problem, then people could be directed to a ZIP archive incorporating the complete pure Python std lib.

--
Marc-Andre Lemburg
What ever happened to the ZIP archive import that James C. Ahlstrom was working on (I think it was him)?
python.org/sf/492105 is open for review. --Guido van Rossum (home page: http://www.python.org/~guido/)
[Guido]
I never have to do that; the dependencies in the project file make sure that the extensions are all built when you build the 'python' project.
[MvL]
Are you sure? If the python target is up-to-date (i.e. nothing has to be done for python_d.exe), and I delete all generated _sre files (i.e. sre_d.pyd, and the object files), and then ask VC++ 6 to build the python target, nothing is done.
Right, every project other than pythoncore and w9xpopen depends on the pythoncore project, but that's all. Guido doesn't normally change any code in any other subprojects, so he doesn't notice this viscerally. If you want to be completely safe at all times, do Build -> Batch Build. One step and easy. It won't recompile more than needed, although if the Python DLL changes, it will pee away a little time relinking things against the new core .lib file.
On 08 Aug 2002, Guido van Rossum <guido@python.org> wrote:
In addition, it should decrease startup time: Python won't need to locate that many files anymore.
It also decreases the total size of the binary distribution slightly.
Maybe _sre is used by most apps (though I doubt even that). But _socket, select, winreg, mmap and the others are definitely not. On Unix, all extensions are built as shared libraries, except the ones that are needed by setup.py to be able to build extensions; it looks like only posix, errno, _sre and symtable are built statically.
I'd say that making more extensions static on Windows would increase start time of modules that don't use those extensions.
_sre is used by any application that imports 'os'. That (IMHO) is almost every non-trivial Python program. Of course, we shouldn't be guessing about startup times. Someone should actually try building two versions and comparing them. -- Duncan Booth duncan@rcp.co.uk int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3" "\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?
On Fri, Aug 9 2002 Duncan Booth wrote:
On 08 Aug 2002, Guido van Rossum <guido@python.org> wrote:
In addition, it should decrease startup time: Python won't need to locate that many files anymore.
It also decreases the total size of the binary distribution slightly.
Maybe _sre is used by most apps (though I doubt even that). But _socket, select, winreg, mmap and the others are definitely not. On Unix, all extensions are built as shared libraries, except the ones that are needed by setup.py to be able to build extensions; it looks like only posix, errno, _sre and symtable are built statically.
I'd say that making more extensions static on Windows would increase start time of modules that don't use those extensions.
_sre is used by any application that imports 'os'. That (IMHO) is almost every non-trivial Python program.
Not on my system it isn't! It's true that _sre does get imported whenever I start Python, but that is not because it gets imported by os. There is an import of re in posixpath (imported by os), but that is inside the function expandvars which is not called during import. In my case site.py imports distutils.util because Python decides it is called from the build directory. -- Sjoerd Mullender <sjoerd@acm.org>
Duncan Booth wrote: ...
_sre is used by any application that imports 'os'. That (IMHO) is almost every non-trivial Python program.
Sure? Then try this in a Windows shell:

"""
D:\>\python22\python
hey this is sitepython
Python 2.2.1 (#34, Apr 9 2002, 19:34:33) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> for i in sys.modules: print i
...
stat
__future__
copy_reg
os
signal
site
__builtin__
UserDict
sys
sitecustomize
ntpath
__main__
exceptions
types
nt
os.path
"""
As you can see, os is imported by the startup code, already. (Which I didn't know!) Furthermore, os didn't cause an import of _sre. ciao - chris -- Christian Tismer :^) <mailto:tismer@tismer.com> Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/
On 10 Aug 2002 at 20:38, Christian Tismer wrote:
Duncan Booth wrote: ...
_sre is used by any application that imports 'os'. That (IMHO) is almost every non-trivial Python program.
Sure? Then try this in a Windows shell:
Sjoerd Mullender already pointed out I got this wrong. Unfortunately, for reasons that currently escape me, my response disappeared into a black hole and didn't appear on the mailing list. I jumped to the wrong conclusion because running py2exe on a program that imports os always includes _sre.dll in the files for distribution. This is because the os module does indeed import _sre, but only when the function that uses it is actually called. So any program that imports os includes _sre in the automatically generated list of dependencies, but it may or may not actually import it. -- Duncan Booth duncan@dales.rmplc.co.uk int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3" "\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure? http://dales.rmplc.co.uk/Duncan
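The question of what "import os" really loads can be settled by asking a fresh interpreter, in the spirit of Christian's session above. The sketch below is an illustration under stated assumptions: the helper name modules_after is made up, it relies on subprocess.run with capture_output (Python 3.7 or later), and the answer it gives will differ between Python versions, platforms, and site configurations.

```python
import subprocess
import sys


def modules_after(statement, extra_args=()):
    """Run `statement` in a fresh interpreter and return the names in sys.modules.

    `modules_after` is a hypothetical helper; `extra_args` are interpreter
    options, for example ["-S"] to leave site.py out of the picture.
    """
    code = statement + "\nimport sys\nprint('\\n'.join(sorted(sys.modules)))"
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", code],
        check=True,
        capture_output=True,
        text=True,
    )
    return set(result.stdout.split())


if __name__ == "__main__":
    loaded = modules_after("import os")
    print("_sre loaded after 'import os':", "_sre" in loaded)
    print("_sre loaded with -S:", "_sre" in modules_after("import os", ["-S"]))
```

Running the check in a child process matters: the parent interpreter may already have imported re or _sre for unrelated reasons, which would hide the effect being measured.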
[Martin v. Lowis]
... Integrating _sre (and _socket, select, winreg, mmap, perhaps others) into python.dll still simplifies the build process: you don't have to right-click that many subprojects to build them.
If you're building via right-clicking, you're making life much harder than necessary. You can build from the command line, or do Build -> Batch Build -> Build in the GUI. The latter builds all projects in one gulp, including both Release and Debug versions (well, it actually displays a list of all possible project targets, and lets you select which to build in batch mode; this selection is persistent, so you only need to do it once).
On Thursday, August 8, 2002, at 07:16, Martin v. Löwis wrote:
Guido van Rossum <guido@python.org> writes:
Let me turn this around. What advantage do you see to linking it statically?
The trigger was that it would have simplified the build for me: When converting VC++6 projects to VC.NET, VC.NET forgets to convert the /export: linker options, which means that you had to add them all manually. Mark has fixed this problem differently, by removing the need for /export:.
Integrating _sre (and _socket, select, winreg, mmap, perhaps others) into python.dll still simplifies the build process: you don't have to right-click that many subprojects to build them.
In addition, it should decrease startup time: Python won't need to locate that many files anymore.
It also decreases the total size of the binary distribution slightly.
Note that I went exactly the other way for MacPython over the last year. It used to be so that all "common" modules were included in PythonCore.slb, and I used separate project build files only for Mac-only modules and one or two special cases (Tk, expat). I bit the bullet half a year ago and made PythonCore.slb lean and mean, but still used my own private project build file generator for all extension projects. I bit the bullet again (actually, I bit one of the two remaining half-bullets, I've kept the Mac-specific modules as they are) last month, and MacPython now uses the main setup.py for a large collection of the cross-platform extension modules. This turned out to be only one or two evenings of work. This has immediately resulted in a decrease in my workload: whereas previously whenever someone decided to add the kaboozle module I had to add project files for this, etc etc etc, all that is now often taken care of by distutils and setup.py. -- - Jack Jansen <Jack.Jansen@oratrix.com> http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman -
Jack Jansen <Jack.Jansen@oratrix.com> writes:
This has immediately resulted in a decrease in my workload: whereas previously whenever someone decided to add the kaboozle module I had to add project files for this, etc etc etc, all that is now often taken care of by distutils and setup.py.
Reducing the workload is a good thing, and so is sharing of build processes across many systems; I'm not proposing to give that up. At the moment, I'm really asking about Windows only; I'll ask about adding things back into Setup.dist when I can show what advantages that has. That does not mean that those things would be removed from setup.py - that is smart enough to build only things that haven't already been built. Regards, Martin
participants (14)
- barry@zope.com
- Christian Tismer
- Duncan Booth
- Gordon McMillan
- Guido van Rossum
- Jack Jansen
- loewis@informatik.hu-berlin.de
- M.-A. Lemburg
- Martin v. Löwis
- martin@v.loewis.de
- Neil Schemenauer
- Sjoerd Mullender
- Tim Peters
- Tim Peters