[Distutils] Byte-compilation revisited

Greg Ward gward@python.net
Sat Sep 30 16:53:02 2000

Hi all --

based on and inspired by recent patches from Marc-Andre Lemburg and Rene 
Liebscher, I've finally started tackling the byte-compilation problem in 
earnest.  Here's the approach I'm taking:
  * new function 'byte_compile()' in distutils.util: this is the all-
    singing, all-dancing wrapper around py_compile that will do all
    the real work
  * reduce the 'bytecompile()' method in the install_lib command to
    a simple wrapper around 'util.byte_compile()', that does the Right
    Thing with respect to optimization and claimed source filename
    written to the .py{c,o} file
  * add similar functionality to the build_py command, so that you
    may optionally do byte-compilation at build time rather than
    install time.

The first two steps are done and checked in, except that install_lib's
'bytecompile()' method doesn't yet take advantage of the fancy features
in the new 'byte_compile()' -- it doesn't rewrite filenames or do
optimization.

The default will continue to be doing compilation at install time rather
than build time.  I'm still leaning towards build-time compilation, but
it's too late in the Distutils 1.0 release cycle to change things like
this.  However, I want to have the *option* to do compilation at build
time, so people can experiment with it, see if it works, figure out what
other features are needed so it really works, etc.  The idea is that
developers could put settings in their setup.cfg that control when to do
byte-compilation; I suspect developers who want to distribute
closed-source modules will have to do build-time compilation.  Probably
the "install" command will need some sort of "don't install source"
option, or maybe the build command should have a "blow away source after
compiling it" option.

Here's my 'byte_compile()' function: as usual, it works for me.  Please
review it, and if you're following CVS, try it out.  (Should be enough
to install any module distribution containing pure Python modules.)

def byte_compile (py_files,
                  optimize=0, force=0,
                  prefix=None, base_dir=None,
                  verbose=1, dry_run=0,
                  direct=None):
    """Byte-compile a collection of Python source files to either
    .pyc or .pyo files in the same directory.  'optimize' must be
    one of the following:
      0 - don't optimize (generate .pyc)
      1 - normal optimization (like "python -O")
      2 - extra optimization (like "python -OO")
    If 'force' is true, all files are recompiled regardless of
    timestamps.

    The source filename encoded in each bytecode file defaults to the
    filenames listed in 'py_files'; you can modify these with 'prefix' and
    'base_dir'.  'prefix' is a string that will be stripped off of each
    source filename, and 'base_dir' is a directory name that will be
    prepended (after 'prefix' is stripped).  You can supply either or both
    (or neither) of 'prefix' and 'base_dir', as you wish.

    If 'verbose' is true, prints out a report of each file.  If 'dry_run'
    is true, doesn't actually do anything that would affect the filesystem.

    Byte-compilation is either done directly in this interpreter process
    with the standard py_compile module, or indirectly by writing a
    temporary script and executing it.  Normally, you should let
    'byte_compile()' figure out whether to use direct compilation or not
    (see the source for details).  The 'direct' flag is used by the script
    generated in indirect mode; unless you know what you're doing, leave
    it set to None.
    """

    # First, if the caller didn't force us into direct or indirect mode,
    # figure out which mode we should be in.  We take a conservative
    # approach: choose direct mode *only* if the current interpreter is
    # in debug mode and optimize is 0.  If we're not in debug mode (-O
    # or -OO), we don't know which level of optimization this
    # interpreter is running with, so we can't do direct
    # byte-compilation and be certain that it's the right thing.  Thus,
    # always compile indirectly if the current interpreter is in either
    # optimize mode, or if either optimization level was requested by
    # the caller.
    if direct is None:
        direct = (__debug__ and optimize == 0)

    # "Indirect" byte-compilation: write a temporary script and then
    # run it with the appropriate flags.
    if not direct:
        from tempfile import mktemp
        script_name = mktemp(".py")
        if verbose:
            print "writing byte-compilation script '%s'" % script_name
        if not dry_run:
            script = open(script_name, "w")

            script.write("""\
from distutils.util import byte_compile
files = [
""")
            script.write(string.join(map(repr, py_files), ",\n") + "]\n")
            script.write("""
byte_compile(files, optimize=%s, force=%s,
             prefix=%s, base_dir=%s,
             verbose=%s, dry_run=0,
             direct=1)
""" % (`optimize`, `force`, `prefix`, `base_dir`, `verbose`))

            script.close()

        cmd = [sys.executable, script_name]
        if optimize == 1:
            cmd.insert(1, "-O")
        elif optimize == 2:
            cmd.insert(1, "-OO")
        spawn(cmd, verbose=verbose, dry_run=dry_run)

    # "Direct" byte-compilation: use the py_compile module to compile
    # right here, right now.  Note that the script generated in indirect
    # mode simply calls 'byte_compile()' in direct mode, a weird sort of
    # cross-process recursion.  Hey, it works!
    else:
        from py_compile import compile

        for file in py_files:
            if file[-3:] != ".py":
                raise ValueError, \
                      "invalid filename: %s doesn't end with '.py'" % `file`

            # Terminology from the py_compile module:
            #   cfile - byte-compiled file
            #   dfile - purported source filename (same as 'file' by default)
            cfile = file + (__debug__ and "c" or "o")
            dfile = file
            if prefix:
                if file[:len(prefix)] != prefix:
                    raise ValueError, \
                          ("invalid prefix: filename %s doesn't start with %s"
                           % (`file`, `prefix`))
                dfile = dfile[len(prefix):]
            if base_dir:
                dfile = os.path.join(base_dir, dfile)

            cfile_base = os.path.basename(cfile)
            if direct:
                if force or newer(file, cfile):
                    if verbose:
                        print "byte-compiling %s to %s" % (file, cfile_base)
                    if not dry_run:
                        compile(file, cfile, dfile)
                else:
                    if verbose:
                        print "skipping byte-compilation of %s to %s" % \
                              (file, cfile_base)
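
To make the 'prefix' / 'base_dir' rewriting concrete, here is a minimal
standalone sketch of just that step.  It is not part of the checked-in
code: the helper name 'purported_source_name()' and the example paths are
made up for illustration, and it is written in present-day Python syntax
rather than the syntax used in the function above.

```python
import os


def purported_source_name(file, prefix=None, base_dir=None):
    """Return the source filename to record in the bytecode file.

    Hypothetical helper mirroring the 'dfile' logic of byte_compile():
    strip 'prefix' from the front, then prepend 'base_dir'.
    """
    dfile = file
    if prefix:
        if not file.startswith(prefix):
            raise ValueError(
                "invalid prefix: filename %r doesn't start with %r"
                % (file, prefix))
        dfile = dfile[len(prefix):]
    if base_dir:
        dfile = os.path.join(base_dir, dfile)
    return dfile


# A module compiled in the build tree but destined for site-packages
# (both paths are illustrative assumptions).
name = purported_source_name("build/lib/foo/bar.py",
                             prefix="build/lib/",
                             base_dir="/usr/lib/python/site-packages")
print(name)
```

One thing to watch: if the name left after stripping still starts with a
path separator, os.path.join() throws away 'base_dir' altogether, so the
prefix should include its trailing slash.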

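The direct/indirect decision can also be looked at on its own.  The
following is only a sketch under stated assumptions: 'plan_compilation()',
its 'debug' parameter and the default script name are hypothetical, chosen
so the logic can be tried out without spawning anything.  It follows the
same rule as the function above: compile in-process only when the running
interpreter is in debug mode and no optimization was asked for; otherwise
build a command line for a child interpreter with the matching flag.

```python
import sys


def plan_compilation(optimize, debug=__debug__, direct=None,
                     script_name="compile_script.py"):
    """Return (direct, cmd) for a requested optimization level.

    Hypothetical helper: 'cmd' is the child interpreter command line
    for indirect mode, or None when compiling in this process.
    """
    if direct is None:
        # Conservative choice: only trust this interpreter when it is
        # unoptimized and unoptimized bytecode is what the caller wants.
        direct = bool(debug and optimize == 0)
    if direct:
        return True, None

    cmd = [sys.executable, script_name]
    if optimize == 1:
        cmd.insert(1, "-O")
    elif optimize == 2:
        cmd.insert(1, "-OO")
    return False, cmd


for level in (0, 1, 2):
    print(level, plan_compilation(level, debug=True))
```

Note the case of optimize=0 under an optimizing interpreter: the plan is
still indirect, but the child is started with no flag at all, which is how
you get .pyc files out of a "python -O" session.
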
Greg Ward                                      gward@python.net