generating pyc and pyo files
When should the .pyc and .pyo files be generated during the install process, in the "build" directory or the "install" one?
I ask because I've been looking into GNU autoconf and automake. They compile emacs lisp code (.el->.elc) in the build dir before the actual installation, and it might be nice to follow their lead on that, and it would ensure that the downloaded python files are compileable before they are installed.
OTOH, I know the normal Python install does a compileall after the .py files have been transferred to the install directory.
Also, looking at compileall, it doesn't give any sort of error status on exit if a file could not be compiled. I would prefer the make process stop if that occurs, so compileall needs to exit with a non-zero value.
Attached is a context diff patch to "compileall" from 1.5.1 which supports this ability. I can send the full file as well if someone wants it.
Andrew
dalke@bioreason.com

*** /usr/local/lib/python1.5/compileall.py	Mon Jul 20 12:33:01 1998
--- compileall.py	Sun Mar 28 17:50:51 1999
***************
*** 34,39 ****
--- 34,40 ----
          print "Can't list", dir
          names = []
      names.sort()
+     success = 1
      for name in names:
          fullname = os.path.join(dir, name)
          if ddir:
***************
*** 54,64 ****
--- 55,67 ----
                      else: exc_type_name = sys.exc_type.__name__
                      print 'Sorry:', exc_type_name + ':',
                      print sys.exc_value
+                     success = 0
          elif maxlevels > 0 and \
               name != os.curdir and name != os.pardir and \
               os.path.isdir(fullname) and \
               not os.path.islink(fullname):
              compile_dir(fullname, maxlevels - 1, dfile)
+     return success
  
  def compile_path(skip_curdir=1, maxlevels=0):
      """Byte-compile all module on sys.path.
***************
*** 69,79 ****
      maxlevels:   max recursion level (default 0)
  
      """
      for dir in sys.path:
          if (not dir or dir == os.curdir) and skip_curdir:
              print 'Skipping current directory'
          else:
!             compile_dir(dir, maxlevels)
  
  def main():
      """Script main program."""
--- 72,84 ----
      maxlevels:   max recursion level (default 0)
  
      """
+     success = 1
      for dir in sys.path:
          if (not dir or dir == os.curdir) and skip_curdir:
              print 'Skipping current directory'
          else:
!             success = success and compile_dir(dir, maxlevels)
!     return success
  
  def main():
      """Script main program."""
***************
*** 98,109 ****
              sys.exit(2)
      try:
          if args:
              for dir in args:
!                 compile_dir(dir, maxlevels, ddir)
          else:
!             compile_path()
      except KeyboardInterrupt:
          print "\n[interrupt]"
  
  if __name__ == '__main__':
!     main()
--- 103,118 ----
              sys.exit(2)
      try:
          if args:
+             success = 1
              for dir in args:
!                 success = success and compile_dir(dir, maxlevels, ddir)
          else:
!             success = compile_path()
      except KeyboardInterrupt:
          print "\n[interrupt]"
+     return success
+ 
  
  if __name__ == '__main__':
!     if not main():
!         sys.exit(1)
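For readers on a current interpreter, here is a minimal sketch of the behaviour the patch is after, written in Python 3 syntax rather than the 1.5.1 syntax of the diff. It assumes the standard library's compileall.compile_dir, which in current Python returns a true value only when every file compiled. The helper name compile_dirs_or_fail is hypothetical and is not part of the patch. One deliberate difference: "success = success and compile_dir(...)" stops calling compile_dir once success is false, because "and" short-circuits, whereas this sketch keeps compiling the remaining directories.

```python
import compileall


def compile_dirs_or_fail(dirs, maxlevels=10):
    """Byte-compile every directory in dirs; return True only if all succeeded.

    Hypothetical helper for illustration, not part of the 1.5.1 patch.
    """
    success = True
    for d in dirs:
        # Call compile_dir first so an earlier failure does not skip this directory.
        ok = compileall.compile_dir(d, maxlevels=maxlevels, quiet=1)
        success = bool(ok) and success
    return success
```

A script driven from make could then end with sys.exit(0 if compile_dirs_or_fail(dirs) else 1), so that the build stops on the first package that does not compile.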
Andrew Dalke wrote:
When should the .pyc and .pyo files be generated during the install process, in the "build" directory or the "install" one?
I ask because I've been looking into GNU autoconf and automake. They compile emacs lisp code (.el->.elc) in the build dir before the actual installation, and it might be nice to follow their lead on that, and it would ensure that the downloaded python files are compileable before they are installed.
OTOH, I know the normal Python install does a compileall after the .py files have been transferred to the install directory.
Compiling in the build area before installation means that you can install them with arbitrary privileges, owner, and group. That gets trickier using compileall.py. For example, let's say that "root" is doing an installation and the target should be owned by bin:bin. Can't do that with compileall.py, AFAIK.
Cheers,
-g
--
Greg Stein, http://www.lyra.org/
Greg Ward says:
Compiling in the build area before installation means that you can install them with arbitrary privileges, owner, and group. That gets trickier using compileall.py.
For example, let's say that "root" is doing an installation and the target should be owned by bin:bin. Can't do that with compileall.py, AFAIK.
Sure, you cannot do that with compileall, but you can in the Makefile. After all, this is pretty much identical to setting the right permissions for compiled emacs files. I'm not sure I see the problem.
The possibility I see is a Makefile like the following (there is no way to tell compileall to compile a given list of files, so it's just an example):

MODULE = SpamSkit
PYTHON_FILES = __init__.py spam.py eggs.py vikings.py
PYTHON_CLEAN = $(PYTHON_FILES:.py=.pyc) $(PYTHON_FILES:.py=.pyo)
PYTHON_INSTALL = $(PYTHON_FILES) $(PYTHON_CLEAN)
PYTHON_SITE = /usr/local/lib/python1.5/site-packages
INSTALL = /usr/bin/install -c
INSTALL_DATA = ${INSTALL} -m 644

all:
	$(compileall) -f $(PYTHON_FILES)

install:
	$(INSTALL_DATA) $(PYTHON_INSTALL) $(PYTHON_SITE)/$(MODULE)

clean:
	rm -f $(PYTHON_CLEAN)

so the .pyc and .pyo files would be installed with the same permissions as the .py files. And if you wanted to be more specific, you could tweak install (or the install-hook for automake) or INSTALL_DATA as needed.
Ummm, after rereading your message I realized I don't understand it as well as I thought I did. It looks like you're suggesting doing the byte compiles locally before the install (which I'm now leaning towards) because you can modify the permissions better that way.
If so, I disagree with that. With a standard unix user account I cannot modify all the permissions bits (like setuid) or the owner/group. I need to be root for that, and normally that only happens before the "make install".
The problem there is that some (many?) places don't NFS export with root write privs to their clients. My home directory is not writeable by root except on the file server. So if I am root and root tries to modify the permissions before the copy, it will fail, while if it copies first and then modifies the permissions, it will work.
So the options I see are:
1) copy .py files to the install directory
   run compileall on that directory (or the equivalent)
   change permissions on the files in the install directory as needed
2) run compileall in the build directory
   copy the .py{,c,o} files to the install directory
   change permissions on the files in the install directory as needed
and #2 is my lean-to, as it were.
It seems that having a "compileall" which takes a list of files rather than directories will be useful. Otherwise, for option (1), installing a single file (not module) to /usr/local/lib/python1.5/site-packages (or site-python) and then doing compileall on that directory may cause all modules to be (re)compiled.
For option (2) you will run into problems with test/development .py files in the developer's build directory which aren't valid python files and hence will cause compileall to fail. This isn't a direct problem for distutils since we can assume that all .py files will be installed, but it is important to bear in mind.
Oh! Plus, what's the policy for installing python files in some place like /usr/local/bin? Must all such scripts be compiled during the distutils install process? If not, then compileall on /usr/local/bin could cause problems.
How about adding an option, like "-f", to compileall to support a list of files which should be compiled?
Andrew
dalke@bioreason.com
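The "list of files" idea can be sketched without touching compileall at all. The following is a minimal illustration in present-day Python 3, assuming the standard py_compile module with its doraise argument; the function name compile_files is hypothetical, and this is not the "-f" option proposed above, only one way the same effect might be obtained.

```python
import py_compile


def compile_files(paths):
    """Byte-compile exactly the files in paths; return the list that failed.

    Hypothetical helper: nothing outside the given list is touched.
    """
    failed = []
    for path in paths:
        try:
            # doraise=True turns a compile error into an exception
            # instead of a message printed on stderr.
            py_compile.compile(path, doraise=True)
        except py_compile.PyCompileError as err:
            print("Sorry:", err.msg.strip())
            failed.append(path)
    return failed
```

Because only the named files are compiled, stray test or development .py files in the same directory cannot break the build, and installing a single file does not trigger a recompile of everything beside it.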
Andrew Dalke wrote:
Greg Ward says:
Actually, it was "Greg Stein" :-)
Compiling in the build area before installation means that you can install them with arbitrary privileges, owner, and group. That gets trickier using compileall.py.
For example, let's say that "root" is doing an installation and the target should be owned by bin:bin. Can't do that with compileall.py, AFAIK.
Sure, you cannot do that with compileall, but you can in the Makefile. After all, this is pretty much identical to setting the right permissions for compiled emacs files.
That was my point. Nothing complicated or devious. Simply that if you do the compile into the local "build" area, then you can use "install" to copy them to the install area with the right permissions (and if you're root, with the right owner/group). You asked which would be best. I suggested "build" area with the above point as a reason. Cheers, -g -- Greg Stein, http://www.lyra.org/
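The build-then-install flow described here might look roughly like the sketch below, written in present-day Python 3. It assumes the standard py_compile, shutil and os modules; the function name build_and_install and the choice to write the bytecode next to the source are illustrative assumptions, not anything the thread specifies. Changing owner and group is left as a comment because it needs root.

```python
import os
import py_compile
import shutil


def build_and_install(build_dir, install_dir, names, mode=0o644):
    """Compile names inside build_dir, then copy sources and bytecode.

    Hypothetical helper; returns the list of installed paths.
    """
    os.makedirs(install_dir, exist_ok=True)
    installed = []
    for name in names:
        src = os.path.join(build_dir, name)
        cfile = src + "c"  # bytecode beside the source, e.g. spam.pyc
        # Any compile error is raised here, before anything is installed
        # for this file.
        py_compile.compile(src, cfile=cfile, doraise=True)
        for path in (src, cfile):
            dest = os.path.join(install_dir, os.path.basename(path))
            shutil.copyfile(path, dest)
            os.chmod(dest, mode)  # owner/group would need os.chown and root
            installed.append(dest)
    return installed
```

The install step is then nothing more than a copy plus a mode change over a known list of files, which is the property argued for above.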
Andrew Dalke writes:
When should the .pyc and .pyo files be generated during the install process, in the "build" directory or the "install" one? ... OTOH, I know the normal Python install does a compileall after the .py files have been transferred to the install directory.
I think the structure of the Python build process may be due to an older behavior in Python which I think has been fixed. Originally, the __file__ name in a module was initialized at compile time, not at import time. I think it is now set in the .pyc/.pyo at compile time and re-set in the module at import time. If you load the .pyc/.pyo without going through the import machinery you should see the name of the file as it was accessed at compile time (which might be relative). In general, the compile-time __file__ value may be invalid in the importing process. I think the .pyc/.pyo files can be built in the work area and then installed using the normal file installation mechanisms. This provides a little more flexibility in the installation machinery as well.
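The distinction between the compile-time name and the import-time __file__ can be observed on a current interpreter with the sketch below. It is written for present-day Python 3 and says nothing about how 1.5 behaves; it assumes py_compile.compile with its dfile argument and importlib.util.spec_from_file_location. The function name load_moved_bytecode and the recorded name "build/spam.py" are made up for the illustration.

```python
import importlib.util
import os
import py_compile
import shutil
import tempfile


def load_moved_bytecode():
    """Compile in a 'build' dir, move the bytecode, load it from the new place.

    Returns (module.__file__, filename recorded in the code object).
    """
    build = tempfile.mkdtemp(prefix="build-")
    install = tempfile.mkdtemp(prefix="install-")
    src = os.path.join(build, "spam.py")
    with open(src, "w") as f:
        f.write("def where():\n    return where.__code__.co_filename\n")
    pyc = os.path.join(build, "spam.pyc")
    # dfile is the (here relative) source name stored in the bytecode.
    py_compile.compile(src, cfile=pyc, dfile="build/spam.py", doraise=True)
    moved = shutil.move(pyc, os.path.join(install, "spam.pyc"))
    # Load the bytecode directly from its new location, with no source file.
    spec = importlib.util.spec_from_file_location("spam", moved)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.__file__, module.where()
```

In this sketch the module's __file__ follows the file to the install location, while the code object still carries the name given at compile time.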
Also, looking at compileall, it doesn't give any sort of error status on exit if a file could not be compiled. I would prefer the make process stop if that occurs, so compileall needs to exit with a non-zero value. Attached is a context diff patch to "compileall" from 1.5.1 which supports this ability. I can send
I will integrate this patch; thanks! -Fred -- Fred L. Drake, Jr. <fdrake@acm.org> Corporation for National Research Initiatives
Fred Drake pointed out:
I think the .pyc/.pyo files can be built in the work area and then installed using the normal file installation mechanisms. This provides a little more flexibility in the installation machinery as well.
Thanks for the pointer on __file__ changes. I'll verify that things work as part of my testing. Andrew
Quoth Andrew Dalke, on 28 March 1999:
When should the .pyc and .pyo files be generated during the install process, in the "build" directory or the "install" one?
Sounds like everyone is in favour of compiling at build time: good. Nobody mentioned my reason for favouring this, which is simple: installation should consist of nothing more than copying files and (possibly) changing modes and ownerships. All files that will be installed should be generated at build time. This makes lots of things easier, notably: installation itself; updating the mythical database of installed files; and creating "built distributions" such as RPM.
Also, if you check the code, you'll note that I don't use the 'compileall' module, but rather explicitly follow the list of modules to build. Being able to catch errors didn't occur to me, but it's one good reason. (And Andrew's patch probably won't make it into versions 1.4 through 1.5.1, which I would still like to support.) I think I just did it that way because I don't like ceding control over which files are processed to an external entity. (You'll note that distutils supplies its own 'copy_tree()' function, for basically the same reason.)
Would anyone interested in error handling care to look into what happens when 'compile' fails? Doesn't look like I've done anything in particular to handle it (see distutils/command/build_py.py, towards the bottom of the 'run()' method) -- I probably blithely assumed that it would raise an exception like most IO routines do.
Wow, a thread where everybody agrees... we must be on to something.
Greg
--
Greg Ward - software developer    gward@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive          voice: +1-703-620-8990 x287
Reston, Virginia, USA 20191-5434  fax: +1-703-620-0913
I just did a scan of Greg's "build_py" which caused me to recheck compileall. Is it true that the only way to generate .pyo files is to rerun python with -O? Looks like things are that way, so I'll need to change things in my Makefiles to generate both sets of compiled files.
Wow, a thread where everybody agrees... we must be on to something.
Umm... Umm... I disagree with that! Phew, the universe has regained some stability :)
Andrew
Andrew Dalke writes:
Is it true that the only way to generate .pyo files is to rerun python with -O? Looks like things are that way, so
There's a global variable in the C code that can be set to enable optimization. When I spoke to Guido about exposing it in the parser module, he objected. His rationale was that the setting would probably change from version to version and so should not be exposed. (My response was that the grammar changed with major revisions anyway, so the parser module already tends to get affected with some regularity.) I'd be happy to expose it somehow in the parser module, but I don't know that Guido won't throw out the change. ;-)
For now, the best way to compile whichever flavor your process doesn't generate is to use a child process with -O set if __debug__ is true.
-Fred
--
Fred L. Drake, Jr. <fdrake@acm.org>
Corporation for National Research Initiatives
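The child-process approach can be sketched as follows in present-day Python 3. It assumes the standard subprocess module and the "python -m py_compile" command-line entry point; the function name compile_other_flavor is hypothetical. Note that current interpreters write the optimized bytecode as an "opt-1" .pyc file under __pycache__ rather than as a .pyo file.

```python
import subprocess
import sys


def compile_other_flavor(path):
    """Byte-compile path in a child interpreter using the opposite -O setting.

    Hypothetical helper; returns the child's exit status (0 on success).
    """
    # __debug__ is true when this process runs without -O, so the child
    # gets -O; otherwise the child runs without it.
    flags = ["-O"] if __debug__ else []
    cmd = [sys.executable] + flags + ["-m", "py_compile", path]
    return subprocess.call(cmd)
```

A build step would compile once in-process and once through this helper to end up with both flavors.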
Quoth Andrew Dalke, on 28 March 1999:
When should the .pyc and .pyo files be generated during the install process, in the "build" directory or the "install" one?
Sounds like everyone is in favour of compiling at build time: good. Nobody mentioned my reason for favouring this, which is simple: installation should consist of nothing more than copying files and (possibly) changing modes and ownerships. All files that will be installed should be generated at build time. This makes lots of things easier, notably: installation itself; updating the mythical database of installed files; and creating "built distributions" such as RPM.
I disagree. I see no reason to double the size of the distribution by shipping redundant files (on average, a .pyc is 78.786% of a .py, based on the Python 1.5.1 distribution; .pyo is 88.148%).
Permissions and ownerships cannot be handled at build time. And as the other thread (about placement of distribution files) has stated, it is something the sys admins and installers will have to handle - and can override. If the person is not the sys admin, s/he will have to talk to the sys admin to open global areas, or deal with creating their own areas.
Making pyc/pyo files is trivial and unnecessary (for shipping). There is already an installation step; it's nothing to add a compileall and chownall/chmodall step too.
[snip]
Wow, a thread where everybody agrees... we must be on to something.
Not quite, I'm just getting my new house straightened out is all.
Greg
-Arcege
--
------------------------------------------------------------------------
| Michael P. Reilly, Release Engineer | Email: arcege@shore.net        |
| Salem, Mass. USA 01970              |                                |
------------------------------------------------------------------------
Michael P. Reilly <arcege@shore.net> said:
I disagree. I see no reason to double the size of the distribution by shipping redundant files (on average, a .pyc is 78.786% of a .py, based on the Python 1.5.1 distribution; .pyo is 88.148%).
I believe you misread the intention. Only the .py files will be shipped. Once downloaded they are unpacked into the "build" directory. The .pyo and .pyc files are generated in the build directory on the local (downloaded) machine. Once these files are compiled locally, they are installed into the install directory.
Permissions and ownerships cannot be handled at build time.
Correct. And that's why they will be handled during the install step. Andrew dalke@bioreason.com
Quoth Andrew Dalke, on 29 March 1999:
Michael P. Reilly <arcege@shore.net> said:
I disagree. I see no reason to double the size of the distribution by shipping redundant files (on average, a .pyc is 78.786% of a .py, based on the Python 1.5.1 distribution; .pyo is 88.148%).
I believe you misread the intention. Only the .py files will be shipped. Once downloaded they are unpacked into the "build" directory. The .pyo and .pyc files are generated in the build directory on the local (downloaded) machine. Once these files are compiled locally, they are installed into the install directory.
Well, actually, you're both right. Compiling .py files at build time will not affect *source* distributions, which is what Andrew is talking about. But it *will* affect the size of *built* distributions, which is what Michael is talking about (I assume). The whole reason I've been calling them "built distributions" instead of "binary distributions" is because of the presumed inclusion of .pyc/.pyo files.
I'll have to play around a bit to see how much including .pyc's in the built distributions affects the final size of the .tar.gz or .zip (or .rpm, or whatever) file. Let's see, tarring and zipping up the current Distutils 'build' directory with .pyc files looks like this:

-rw-r--r--   1 gward    staff       41363 Mar 30 08:12 distutils-bdist-1.tar.gz
-rw-r--r--   1 gward    staff       51629 Mar 30 08:13 distutils-bdist-1.zip

and if I delete the .pyc's and try again:

-rw-r--r--   1 gward    staff       20544 Mar 30 08:13 distutils-bdist-2.tar.gz
-rw-r--r--   1 gward    staff       25447 Mar 30 08:13 distutils-bdist-2.zip

So! Michael was almost exactly right: including the .pyc's really does double the size of the built distribution. .pyc files compress roughly as well as .py files.
Does this seem like a problem to anyone else? I still want to keep installation as simple as possible -- and, more importantly, be able to trivially determine the set of files that will be installed -- but if increasing the size of built distributions really bothers you, speak up!
Greg
--
Greg Ward - software developer    gward@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive          voice: +1-703-620-8990 x287
Reston, Virginia, USA 20191-5434  fax: +1-703-620-0913
Michael P. Reilly <arcege@shore.net> said:
I disagree. I see no reason to double the size of the distribution by shipping redundant files (on average, a .pyc is 78.786% of a .py, based on the Python 1.5.1 distribution; .pyo is 88.148%).
Andrew Dalke, on 29 March 1999, writes:
"build" directory. The .pyo and .pyc files are generated in the build directory on the local (downloaded) machine.
Greg Ward writes:
Well, actually, you're both right. Compiling .py files at build time will not affect *source* distributions, which is what Andrew is talking
And I say Michael's figures are conservative; most .pyc files are larger than the .py files. (I've attached a simple script to compare the sizes; Unix only.)
Does this seem like a problem to anyone else? I still want to keep installation as simple as possible -- and, more importantly, be able to trivially determine the set of files that will be installed -- but if increasing the size of built distributions really bothers you, speak up!
I think a lot of people will be bothered by the increased size, whether or not we are. People with poor connectivity and archive maintainers will want reduced size. Removing the .pyc and .pyo files will help make packages less tied to Python versions as well; these files have often been obsoleted by changes between Python versions. It's simple enough to generate them during installation when we're able to run at that time. For RPMs this should be fine; I'm not sure about other package systems. For some, it may make sense to build them in the installation locations and then chmod them; this may be the case for the Solaris PKG system. (Barry, are you following this?) -Fred -- Fred L. Drake, Jr. <fdrake@acm.org> Corporation for National Research Initiatives
Michael P. Reilly <arcege@shore.net> said:
I disagree. I see no reason to double the size of the distribution by shipping redundant files (on average, a .pyc is 78.786% of a .py, based on the Python 1.5.1 distribution; .pyo is 88.148%).
Andrew Dalke, on 29 March 1999, writes:
"build" directory. The .pyo and .pyc files are generated in the build directory on the local (downloaded) machine.
Greg Ward writes:
Well, actually, you're both right. Compiling .py files at build time will not affect *source* distributions, which is what Andrew is talking
And I say Michael's figures are conservative; most .pyc files are larger than the .py files. (I've attached a simple script to compare the sizes; Unix only.)
They aren't just conservative, they are downright misleading - I mixed the ratios in the email, sorry. The 78.768% is supposed to be the size of .py to .pyc and 88.148% is .py to .pyo, not the other way around. These were averages based on the modules available thru sys.path at home (318 .pyc files, 213 .pyo files); it only included values where there were both a .py and .pyc or both .py/.pyo so the averages weren't thrown off.
-Fred
-Arcege
--
------------------------------------------------------------------------
| Michael P. Reilly, Release Engineer | Email: arcege@shore.net        |
| Salem, Mass. USA 01970              |                                |
------------------------------------------------------------------------
Michael P. Reilly writes:
They aren't just conservative, they are downright misleading - I mixed the ratios in the email, sorry. The 78.768% is supposed to be the size of .py to .pyc and 88.148% is .py to .pyo, not the other way around.
That makes sense.
These were averages based on the modules available thru sys.path at home (318 .pyc files, 213 .pyo files); it only included values where there were both a .py and .pyc or both .py/.pyo so the averages weren't
That's the right approach. Sounds like I should add more summarization to my checkpycs script, to get the ratios out for each .pyc/.pyo and in summary. -Fred -- Fred L. Drake, Jr. <fdrake@acm.org> Corporation for National Research Initiatives
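The checkpycs script itself is not reproduced in this thread, so the following is only a sketch of the kind of summary being discussed, in present-day Python 3. It assumes the 1.5-era layout in which a .pyc or .pyo sits beside its .py file; the function names size_ratios and mean_ratio are hypothetical. As in Michael's figures, the ratio is source size to compiled size, and modules without a compiled sibling are skipped.

```python
import os


def size_ratios(directory, suffix=".pyc"):
    """Return {name.py: source size / compiled size} for files that have both."""
    ratios = {}
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".py"):
            continue
        source = os.path.join(directory, name)
        compiled = os.path.join(directory, name[:-3] + suffix)
        # Skip modules without a compiled sibling so they cannot skew the mean.
        if os.path.isfile(compiled) and os.path.getsize(compiled) > 0:
            ratios[name] = os.path.getsize(source) / os.path.getsize(compiled)
    return ratios


def mean_ratio(ratios):
    """Average of the per-file ratios, or None when nothing was paired."""
    return sum(ratios.values()) / len(ratios) if ratios else None
```

Calling size_ratios(d, ".pyo") gives the same summary for optimized files.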
Greg Ward said:
Compiling .py files at build time will not affect *source* distributions, which is what Andrew is talking about. But it *will* affect the size of *built* distributions, which is what Michael is talking about (I assume). The whole reason I've been calling them "built distributions" instead of "binary distributions" is because of the presumed inclusion of .pyc/.pyo files.
You can tell I'm used to working from source :)
As mentioned before, I'm looking into the autoconf/automake process. They have a concept of DESTDIR which is prefixed to the install path as in $(DESTDIR)$(bindir) (the default value of DESTDIR is ""). In theory, I can do a
make install DESTDIR="blib/"
and it will install my package underneath:
blib/usr/local/lib/python1.5/site-packages/kwyjibo/...
and my executable scripts under
blib/usr/local/bin/melissa
Wouldn't it be possible to have the distribution program figure out what to do based on this tree, including knowing to make .pyo and .pyc files from .py files, if they exist?
Of course, permissions are problematical here as well. The package generation program could have a hook which lets the distributor add some python code to adjust things accordingly. Also, I could use a special INSTALL program which doesn't actually set the permissions but instead logs them to a file for the packager (hook) to use.
Andrew
dalke@bioreason.com
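The DESTDIR staging and the "log the permissions instead of setting them" idea can be sketched together as below, in present-day Python 3. The function names staged_path and record_install, and the shape of the log entries, are assumptions made for illustration; nothing here is an automake or distutils interface.

```python
import os


def staged_path(destdir, install_path):
    """Prefix destdir to an absolute install path, as $(DESTDIR)$(bindir) does."""
    if not destdir:
        # Default DESTDIR is "": install to the real location.
        return install_path
    # os.path.join would discard destdir if the second part stayed absolute,
    # so strip the leading separator first.
    return os.path.join(destdir, install_path.lstrip(os.sep))


def record_install(log, destdir, install_path, mode, owner, group):
    """Log the intended permissions instead of applying them.

    Returns the staged path the file should be copied to.
    """
    log.append((install_path, oct(mode), owner, group))
    return staged_path(destdir, install_path)
```

A packager hook could later read the log and apply the recorded modes and ownerships when the package is installed for real.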
participants (5)
- Andrew Dalke
- Fred L. Drake
- Greg Stein
- Greg Ward
- Michael P. Reilly