[Distutils] pythonv, take two

Carl Meyer carl at oddbird.net
Fri Mar 18 19:44:23 CET 2011


Hi Vinay,

Thanks for the additional digging in here. I think your analysis is
right - it actually occurred to me yesterday that this could be the
problem, and I filed a bug to track it here:
https://bitbucket.org/carljm/cpythonv/issue/6/if-binary-is-copied-prefix-finding-could

The issue is finding the initial sys.prefix and thus the standard
library, which happens in the search_for_prefix function in
Modules/getpath.c. The algorithm used is this (presuming no PYTHONHOME):

1. Step up the tree from the (symlinks-dereferenced) binary location
looking for anything that appears to be the standard library.

2. Fall back to the location hardcoded in the binary from the --prefix
option to ./configure.
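
As a rough illustration of those two steps, here is a simplified Python
sketch. It is not the actual C code in getpath.c: the function signature,
the configured_prefix argument, and the exact landmark check
(lib/pythonX.Y/os.py) are assumptions made for the example, and PYTHONHOME
handling is left out entirely.

```python
import os


def search_for_prefix(binary_path, version, configured_prefix):
    """Simplified model of the prefix search (no PYTHONHOME handling)."""
    # Step 1: dereference symlinks, then walk up from the binary's
    # directory looking for something that appears to be the stdlib.
    current = os.path.dirname(os.path.realpath(binary_path))
    while True:
        landmark = os.path.join(current, "lib", "python" + version, "os.py")
        if os.path.isfile(landmark):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    # Step 2: fall back to the prefix hardcoded at ./configure time.
    return configured_prefix
```

In this model a symlinked binary resolves back to the real installation,
while a copy placed elsewhere walks up to the root, finds nothing, and
ends up with the configured prefix.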

With a symlink it's fully reliable, because the symlink is dereferenced
so the stdlib is found just as it normally would be.

With a copy of the binary, there are two situations where it can break:

A. The Python installation we are copying from has an incorrect
hardcoded prefix in the binary, and is relying on the tree-stepping to
find its standard library. In this case, the hardcoded prefix will be
used because the tree-stepping won't find anything. I think this is the
situation you are seeing: I think your binary was compiled with a prefix
of "/usr/local", and so it falls back on that prefix when it can't find
anything by tree-stepping up through /home. If you didn't actually have
a Python 3.3 installation in /usr/local, it would break more loudly due
to not finding any stdlib at all. I think you'll find that it works for
you if you compile with an explicit --prefix and then actually install
to that prefix, then copy that binary and make an env with it.

B. There is _another_ Python standard library (from the same version of
Python, on *nix - the Windows stdlib isn't in a version-specific
directory) up the hierarchy from the location of our virtualenv. In this
case, tree-stepping might find the wrong standard library and use it. I
think this case should be quite rare. It's most problematic on Windows,
but that's also where it shouldn't really happen, since Python is
installed self-contained and I can't imagine why you'd make a virtualenv
for one Python version _inside_ the installation directory of another
Python version. In Linux I guess somebody might try to make a
virtualenv, with a copied binary, somewhere in /usr/local, using some
Python _other_ than the one installed in /usr, but with the same
major/minor version. Very much an edge case, but possible.

So, possibilities I see for addressing this:

1. Decide that in real cases of real Python installations, it's so
unlikely to happen that we won't worry about it. I think it's possible
that this is acceptable; the biggest practical problem is likely to be
people trying to test this out during PEP review from a not-installed
checkout, just like you did. We'd have to be careful to instruct people
that it doesn't work that way, and might also want to add a check in the
env-creation script to verify that the created env works properly and,
if it doesn't, give them some clue why not.

2. Decide that we just don't support copied binaries, only symlinked
ones. Apparently (I am Windows-ignorant) recent Windows versions do
support symlinks? So this might only involve dropping support for older
Windows versions? How important is it for a new feature like this to
fully support all operating systems that Python supports? We could also
not expose the copy-binary option to the user, but fall back to it if
symlinks are unavailable, which ends up being option (1) while narrowing
the potential breakage cases even further.

3. The fully-reliable fix would be to somehow give the copied binary a
hint where to find the right standard library, and this would involve
adding something to the algorithm in getpath.c. The hint could take the
form of a key in the config file, but I'd really like to avoid fully
parsing the config-file in C and then again in Python later on. The hint
could also be some kind of specially-formatted comment line at the top
of the config file, which would require less C code to find and parse?
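
For the sanity check mentioned under option 1, something along these
lines might do. This is only a sketch: check_env, its arguments, and the
wording of the messages are made up for illustration, and it assumes the
env-creation script knows which stdlib directory the env is supposed to
be using.

```python
import os
import subprocess


def check_env(env_python, expected_stdlib):
    """Return None if env_python uses expected_stdlib, else a hint string."""
    # Ask the env's interpreter where it actually loaded os.py from.
    probe = "import os; print(os.path.dirname(os.path.realpath(os.__file__)))"
    try:
        out = subprocess.check_output([env_python, "-c", probe])
    except (OSError, subprocess.CalledProcessError):
        # The binary is missing or exited with an error, for example
        # because it found no standard library at all.
        return "%s could not start or could not find a stdlib" % env_python
    actual = out.decode().strip()
    if os.path.realpath(actual) != os.path.realpath(expected_stdlib):
        return ("%s is using the standard library in %s, not %s; was the "
                "binary copied from a Python that is not installed at its "
                "configured prefix?" % (env_python, actual, expected_stdlib))
    return None
```

The env-creation script could call this right after creating the env and
print the returned message, if any.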

Any thoughts on this (or alternative solutions I haven't thought of) are
most welcome.

Carl

