Questions about naming conventions.
The vast majority of packages, when installed, create two directories in site-packages with names like:

    foobar
    foobar-1.2.3.dist-info (or egg-info)
However PyYAML creates:
and there is also this:
which is not associated with a versioned package.
    import yaml
    import pkg_resources
    print(yaml.__version__)
    print(pkg_resources.__version__)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: module 'pkg_resources' has no attribute '__version__'
So by what method could code working outside of python possibly determine that "yaml" goes with "PyYAML"? Is this a common situation?
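One way code outside Python might recover the "yaml" to "PyYAML" link is to read the top_level.txt file that setuptools-built distributions usually place in their dist-info or egg-info directory. The sketch below assumes that file is present (it is not guaranteed, and the RECORD file in dist-info is another place to look). The function name is made up for illustration.

```python
import os


def import_names_to_distributions(site_packages):
    """Map top-level import names to 'Name-version' strings.

    Reads <something>.dist-info/top_level.txt (or .egg-info) for each
    distribution found directly under site_packages. Distributions
    without a top_level.txt are skipped, so the result may be incomplete.
    """
    mapping = {}
    for entry in sorted(os.listdir(site_packages)):
        if not entry.endswith((".dist-info", ".egg-info")):
            continue
        top_level = os.path.join(site_packages, entry, "top_level.txt")
        if not os.path.isfile(top_level):
            continue  # file is optional; an egg-info may also be a plain file
        dist = entry.rsplit(".", 1)[0]  # strip ".dist-info" / ".egg-info"
        with open(top_level) as fh:
            for line in fh:
                name = line.strip()
                if name:
                    mapping.setdefault(name, []).append(dist)
    return mapping
```

Called on a site-packages directory, this returns a dictionary keyed by import name. Since the metadata is plain text on disk, the same lookup could be done from a shell script.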
Is pkg_resources actually a package? Does it make sense for a common package repository to have a single instance of this directory or should each installed python based program retain its own version of this?
There are some other files that live in site-packages which are not actually packages. The list so far is:
# some dynamic libraries, like
    kiwisolver.cpython-36m-x86_64-linux-gnu.so
# some pth files, but always so far with an explicit version number, like
    sphinxcontrib_applehelp-1.0.2-py3.8-nspkg.pth
# or associated with a package with a version number, like:
    setuptools
    setuptools-46.1.3.dist-info
    setuptools.pth
# some py files, apparently when that package does not make a corresponding
# directory, like:
    zipp-3.1.0.dist-info
    zipp.py
# initialization file "site" as
    site.py
    site.pyc
Any others to look out for? That is, files which might be installed in site-packages but which should not be shared.
Hopefully this next question is appropriate for this list, since the issue arises from how Python loads packages. Is there any way to avoid collisions between Python-based programs other than activating and deactivating their virtualenvs, or redefining PYTHONPATH, before each is used? For programs whose library loading is deterministic (usually the case with C, Fortran, bash scripts, etc.), one can construct a bash script (for instance) which runs 3 programs in order like so:
    prog1
    prog2
    prog3 # spawns subprocesses which run prog2 and prog1
and there are not generally any issues. (Yes, one can create a mess with LD_PRELOAD and the like.) But if those 3 are Python programs, then unless prog1, prog2, and prog3 are all built into the same virtualenv, which usually means they come from the same software distribution, I don't see how to avoid conflicts in the first two cases without activating/deactivating each one, which looks like it might be tricky in the 3rd case.
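For what it is worth, an interpreter inside a virtualenv normally finds that environment's site-packages on its own when it is invoked by its full path, so activation is mostly a PATH convenience. A minimal sketch of a launcher built on that assumption follows. The helper names are hypothetical, and stripping PYTHONPATH and PYTHONHOME is a precaution I am assuming is wanted, not something the thread prescribes.

```python
import os
import subprocess

# Variables that redirect where a Python interpreter looks for modules.
PYTHON_VARS = ("PYTHONPATH", "PYTHONHOME")


def clean_env(env=None):
    """Return a copy of env without the variables listed in PYTHON_VARS."""
    base = dict(os.environ if env is None else env)
    for name in PYTHON_VARS:
        base.pop(name, None)
    return base


def run_isolated(interpreter, script, *args):
    """Run script under one specific interpreter with a cleaned environment.

    interpreter would be something like <venv>/bin/python for the
    environment that the script was installed into.
    """
    return subprocess.run(
        [interpreter, script, *args], env=clean_env(), check=True
    )
```

Because the cleaned environment is passed down to the child, subprocesses started by the script inherit it as well, though each one still has to be launched with the interpreter from its own environment.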
If one has a directory like:
Other than using PYTHONPATH to point to it with an absolute path, is there any way to force prog to import only from that specific site-packages? Let me try that again. Is there a way to tell prog, via an environment variable, to look in "../lib/python3.6/site-packages" (and nowhere else) for imports, with the reference directory being the one where prog is installed, not wherever the process PWD happens to be? If that were possible, it might allow a sort of "set it and forget it" method like
    export PYTHONRELPATHFROMPROG="../lib/python3.6/site-packages"
    prog1 #uses prog1 site-package
    prog2 #uses prog2 site-package
    prog3 #uses prog3 site-package
          # prog1 subprocess #uses prog1 site-package
          # prog2 subprocess #uses prog2 site-package
(None of which would be necessary if python programs could import specific versions reliably from a common directory containing multiple versions of each package.)
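As far as I know there is no such environment variable, but the relative lookup described above can be approximated from inside each program. The sketch below resolves the path against the script's own location and puts it first on sys.path. Both function names are hypothetical. It only gives that directory priority and does not remove the other entries, so it falls short of "nowhere else". Python's -I and -S command-line options (isolated mode, skip the site import) are one way to narrow the search path further.

```python
import os
import site
import sys


def private_site_packages(prog_file, rel="../lib/python3.6/site-packages"):
    """Resolve rel against the directory holding prog_file, not the PWD."""
    prog_dir = os.path.dirname(os.path.realpath(prog_file))
    return os.path.normpath(os.path.join(prog_dir, rel))


def use_private_site_packages(prog_file, rel="../lib/python3.6/site-packages"):
    """Put the program's own site-packages first on sys.path."""
    path = private_site_packages(prog_file, rel)
    site.addsitedir(path)  # appends the directory and processes its .pth files
    if path in sys.path:
        sys.path.remove(path)
    sys.path.insert(0, path)  # move it ahead of every other entry
    return path


# Intended use, as the first lines of prog itself:
#     use_private_site_packages(__file__)
```

Each program then carries its own rule, so a subprocess that runs prog1 or prog2 picks up the matching directory without any shared setting.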
On Thu, Jun 25, 2020 at 10:46 AM David Mathog firstname.lastname@example.org wrote:
On Thu, Jun 25, 2020 at 12:37 AM Paul Moore email@example.com wrote:
I think the key message here is that you won't be *re*-inventing the wheel. This is a wheel that still needs to be invented.
It _was_ invented, but it is off round and gives a rough ride. As noted in the first post this:
    __requires__ = ['scipy <1.3.0,>=1.2.0', 'anndata <0.6.20', 'loompy <3.0.0,>=2.00', 'h5py <2.10']
    import pkg_resources
was able to load the desired set of package-versions for scanpy, but setting a version number constraint on scanpy itself at the end of that list, one which matched the version that the preceding commands successfully loaded, broke it. So it is not reliable.
And the entire __requires__ kludge is only present because for reasons beyond my pay grade this:
    import pkg_resources
    pkg_resources.require("scipy<1.3.0,>=1.2.0;anndata<0.6.20;etc.")
    import scipy
    import anndata
    #etc.
cannot work because by default "import pkg_resources" keeps only the most recent version rather than making up a tree (or list or hash or whatever) and waiting to see if there are any version constraints to be applied at the time of actual package import.
What I'm doing now is basically duct tape and baling wire to work around those deeper issues. In terms of language design, a much better fix would be to modify pkg_resources so that it will always successfully load the required versions from a designated directory which contains multiple versions of packages, and modify the package maintenance tools so that they can maintain such a directory.
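To make the idea concrete, here is a rough sketch of the selection step such a loader would need: given the versions present in the shared directory, pick the newest one that meets a constraint written in the same style as the __requires__ entries. This is not how pkg_resources works internally. The helper names are invented, and the version parser is deliberately naive (dotted integers only, no pre-release or post-release tags).

```python
import operator
import re

_OPS = {
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
    "==": operator.eq, "!=": operator.ne,
}


def parse_version(text):
    """Naive parser: '1.2.0' -> (1, 2). Trailing zeros are dropped so
    that '1.2' and '1.2.0' compare equal."""
    parts = [int(part) for part in text.split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def satisfies(version, spec):
    """Check a version string against a spec such as '<1.3.0,>=1.2.0'."""
    found = parse_version(version)
    for clause in spec.split(","):
        match = re.fullmatch(r"\s*(<=|>=|==|!=|<|>)\s*([\d.]+)\s*", clause)
        if match is None:
            raise ValueError("unsupported clause: %r" % clause)
        op, bound = match.groups()
        if not _OPS[op](found, parse_version(bound)):
            return False
    return True


def pick_version(available, spec):
    """Return the highest available version meeting spec, or None."""
    candidates = [v for v in available if satisfies(v, spec)]
    return max(candidates, key=parse_version) if candidates else None
```

With this, pick_version(["1.1.0", "1.2.1", "1.3.0"], "<1.3.0,>=1.2.0") selects "1.2.1" (the version list is made up). The harder part, which this does not attempt, is deferring that choice until import time and keeping the choices for different packages mutually consistent.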