PEP 376 - from PyPM's point of view

Here are my comments regarding PEP 376 with respect to PyPM (the Python package manager being developed at ActiveState). Multiple versions: I understand that the PEP does not support installation (and thus uninstallation) of multiple versions of the same package. Should this be explicitly mentioned in the PEP, given that the `get_distribution` API accepts only a `name` argument and not a `version` argument?
get_distribution(name) -> Distribution or None. Scans all elements in sys.path and looks for all directories ending with .egg-info. Returns a Distribution corresponding to the .egg-info directory that contains a PKG-INFO that matches name for the name metadata. Notice that there should be at most one result. The first result found is returned. If the directory is not found, returns None.
Some packages have package names with mixed case. Example: ConfigObj, as registered in its setup.py. However, other packages such as turbogears specify "configobj" (lowercase) in their install_requires. Is `get_distribution(name)` supposed to handle mixed case? Will it match both 'ConfigObj' and 'configobj'?
get_installed_files(local=False) -> iterator of (path, md5, size)
Will this also return the directories /created/ during the installation? For example, will it also contain the entry "docutils" along with "docutils/__init__.py"? If not, how is the installer (pip, PyPM, etc.) supposed to know which directories to remove (docutils/) and which directories not to remove (site-packages/, bin/, etc.)?
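(For reference, a minimal sketch of how an installer could produce the (path, md5, size) entries that get_installed_files is described as returning; the helper name record_installed_files is made up for illustration and is not part of the PEP.)

import hashlib
import os

def record_installed_files(paths):
    # Illustration only: yield the (path, md5, size) tuples that PEP 376's
    # get_installed_files() is described as returning, for a list of files
    # an installer has just written.
    for path in paths:
        with open(path, 'rb') as f:
            digest = hashlib.md5(f.read()).hexdigest()
        yield (path, digest, os.path.getsize(path))

# for entry in record_installed_files(['docutils/__init__.py']):
#     print(entry)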
The new version of PEP 345 (XXX work in progress) extends the Metadata standard and fulfills the requirements described in PEP 262, like the REQUIRES section.
Can you tell more about this? I see that PEP 262 allows both distribution names ('docutils') and modules/packages ('roman.py') in the 'Requires:' section. Is this what the new PEP is going to adhere to? Or is it going to adhere to PEP 345's way of allowing *only* modules/packages? In practice, I noticed that packages usually specify distribution names in their 'Requires:' file (or install_requires.txt in the case of setuptools). Hence, PyPM *assumes* the install requirements to be distribution names. But then, most distributions have the same name as their primary module/package. OK, so PEP 345 also specifies the 'Provides:' header. Does easy_install/pip make use of 'Provides:' at all when resolving dependencies? For example, does 'pip install sphinx' go looking for all distributions that 'provide' the 'docutils' provision, or does it simply get the distribution named 'docutils'? -srid

On 14 Jul 2009, at 01:12, "Sridhar Ratnakumar" <sridharr@activestate.com> wrote:
Here are my comments regarding PEP 376 with respect to PyPM (the Python package manager being developed at ActiveState)
Multiple versions: I understand that the PEP does not support installation (and thus uninstallation) of multiple versions of the same package. Should this be explicitly mentioned in the PEP, given that the `get_distribution` API accepts only a `name` argument and not a `version` argument?
get_distribution(name) -> Distribution or None. Scans all elements in sys.path and looks for all directories ending with .egg-info. Returns a Distribution corresponding to the .egg-info directory that contains a PKG-INFO that matches name for the name metadata. Notice that there should be at most one result. The first result found is returned. If the directory is not found, returns None.
Some packages have package names with mixed case. Example: ConfigObj, as registered in its setup.py. However, other packages such as turbogears specify "configobj" (lowercase) in their install_requires.
Is `get_distribution(name)` supposed to handle mixed case? Will it match both 'ConfigObj' and 'configobj'?
An abomination for which I am truly sorry - however to be precise I'm pretty sure the setup.py specifies configobj and it is only registered on PyPI with mixed case (which I don't believe I can change). Michael
get_installed_files(local=False) -> iterator of (path, md5, size)
Will this also return the directories /created/ during the installation? For example, will it also contain the entry "docutils" along with "docutils/__init__.py"?
If not, how is the installer (pip, PyPM, etc.) supposed to know which directories to remove (docutils/) and which directories not to remove (site-packages/, bin/, etc.)?
The new version of PEP 345 (XXX work in progress) extends the Metadata standard and fulfills the requirements described in PEP 262, like the REQUIRES section.
Can you tell more about this?
I see that PEP 262 allows both distribution names ('docutils') and modules/packages ('roman.py') in the 'Requires:' section. Is this what the new PEP is going to adhere to? Or is it going to adhere to PEP 345's way of allowing *only* modules/packages?
In practice, I noticed that packages usually specify distribution names in their 'Requires:' file (or install_requires.txt in the case of setuptools). Hence, PyPM *assumes* the install requirements to be distribution names. But then, most distributions have the same name as their primary module/package.
OK, so PEP 345 also specifies the 'Provides:' header. Does easy_install/pip make use of 'Provides:' at all when resolving dependencies? For example, does 'pip install sphinx' go looking for all distributions that 'provide' the 'docutils' provision, or does it simply get the distribution named 'docutils'?
-srid

On Tue, Jul 14, 2009 at 2:12 AM, Sridhar Ratnakumar<sridharr@activestate.com> wrote:
Here are my comments regarding PEP 376 with respect to PyPM (the Python package manager being developed at ActiveState)
Multiple versions: I understand that the PEP does not support installation (and thus uninstallation) of multiple versions of the same package. Should this be explicitly mentioned in the PEP, given that the `get_distribution` API accepts only a `name` argument and not a `version` argument?
That's another can of worms ;) Before I answer, here's a bit of background; it's a bit long but required, sorry. For people who don't want to read the rest, here's the idea: multiple version support should, imho, be introduced later, if it is to be introduced at all, by extending the PEP 302 protocol.

The long explanation now: given a "foo" package containing a "bar" module, multiple version support implies doing one of these:

1 - a custom PEP 302-like loader/importer that picks a version of "foo" when the code imports the "bar" module. This works if the "foo" package is not directly available in sys.path, and if the custom loader/importer is put in sys.meta_path, for example. If "foo 0.9" is located in /var/packages/foo/0.9 and "foo 1.0" is in /var/packages/foo/1.0, the loader will select the right foo package to load and return it, through a loader that scans /var/packages/foo/*. To make it work, it requires two things: a/ a version comparison system (see PEP 386) that will make the loader pick the "latest" version by default; b/ an API that will force the loader to pick one particular version.

2 - changing the paths in sys.path to include the path containing the right version, and letting the existing importer/loader do the work. That's what setuptools does with its multiple version system: an API called "require" will let you change sys.path on the fly:

>>> from pkg_resources import require
>>> require('docutils==0.4')   # looks for a docutils egg distribution and adds it to the path

So if we support multiple versions in Python (I'd love to), PEP 376 would need to be able to find the various versions of each distribution, not by scanning sys.path but rather by scanning an arbitrary collection of directories, then publishing the right ones in sys.path (with a PEP 302 loader, or, setuptools-style, by inserting them in sys.path). In other words, this would require changing the way the distributions are stored, e.g. in self-contained eggs or in a brand-new storage tree. (I am currently experimenting with this in "virtual site-packages", see http://bitbucket.org/tarek/vsp/src/tip/README.txt)

But as we said earlier, people might want to store their modules anywhere (in a SQL database, on Mars, etc.) and provide a PEP 302-like loader for them. PJE has "eggs", but John Doe might want to store his packages differently and provide an importer/loader for them. So each one of them provides a "package manager", which should be composed of:

A- a loader/importer system
B- an installation system (that is easy_install -m for setuptools)
C- query APIs
D- a version comparison system
E- an uninstaller

So the real solution is to work with PEP 302 importers/loaders (A) (e.g. "package managers"), which means that PEP 302 needs to be changed to become 'distribution-aware' as Paul said, because we would then be able to browse distributions (C) that are not already loaded in sys.path, and so work on two versions of the same distribution.

But some open questions remain. It also implies that each package manager provides its installer (B) and a version comparison system (D). I'm not sure about the way package installers could be declared. Plus, how would people deal with several installers? For the version comparison system I am not sure either, but it would require having one global version comparison system to rule them all, otherwise conflicts may occur.

So there's no plan to support multiple versions yet, because that requires another PEP imho.
Since distutils is a package manager in some ways (it provides an installer, and stores distributions that are made available in sys.path), my feeling is that we first need to finish what's missing to make it fully usable (e.g. query APIs + an uninstaller), and then maybe think about extending PEP 302.
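To make option 1 above concrete, here is a rough sketch of the kind of sys.meta_path finder being described; the /var/packages layout details, the select_version API and the naive version ordering are all assumptions for illustration, not part of any PEP or prototype.

import imp
import os
import sys

PACKAGE_ROOT = '/var/packages'   # assumed layout: /var/packages/foo/0.9/foo/, /var/packages/foo/1.0/foo/
_PINNED = {}                     # distribution name -> version forced by the application

def select_version(name, version):
    # Hypothetical API (item b/ above): force the finder to pick one particular version.
    _PINNED[name] = version

class VersionedFinder(object):
    # PEP 302-like meta_path finder that picks a version of a top-level
    # package at import time (item a/ would use a real PEP 386 comparison).

    def find_module(self, fullname, path=None):
        if '.' in fullname:
            return None          # submodules are found through the parent package's __path__
        versions_dir = os.path.join(PACKAGE_ROOT, fullname)
        if not os.path.isdir(versions_dir):
            return None          # not managed here; fall back to the normal import machinery
        versions = sorted(os.listdir(versions_dir))    # naive sort; PEP 386 ordering belongs here
        version = _PINNED.get(fullname, versions[-1])  # "latest" by default
        self._location = os.path.join(versions_dir, version)
        return self

    def load_module(self, fullname):
        if fullname in sys.modules:
            return sys.modules[fullname]
        file_, pathname, description = imp.find_module(fullname, [self._location])
        try:
            return imp.load_module(fullname, file_, pathname, description)
        finally:
            if file_:
                file_.close()

# sys.meta_path.insert(0, VersionedFinder())
# select_version('foo', '0.9')
# import foo                     # loaded from /var/packages/foo/0.9/foo/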
get_distribution(name) -> Distribution or None. Scans all elements in sys.path and looks for all directories ending with .egg-info. Returns a Distribution corresponding to the .egg-info directory that contains a PKG-INFO that matches name for the name metadata. Notice that there should be at most one result. The first result found is returned. If the directory is not found, returns None.
Some packages have package names with mixed case. Example: ConfigObj, as registered in its setup.py. However, other packages such as turbogears specify "configobj" (lowercase) in their install_requires.
Is `get_distribution(name)` supposed to handle mixed case? Will it match both 'ConfigObj' and 'configobj'?
As PJE said, we need normalization here, yes. Right now PyPI is case-insensitive for its index: http://pypi.python.org/simple/ConfigObj == http://pypi.python.org/simple/configobj. But in the meantime, IIRC, the XML-RPC APIs are case-sensitive, and so is the HTML browsing. easy_install is case-insensitive though, because it uses the index. So we should be case-insensitive everywhere, and so in PEP 376 too.
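A minimal sketch of a case-insensitive get_distribution() scan, following the PEP text quoted above; returning the .egg-info path instead of a Distribution object, and using str.lower() as the normalization rule, are simplifications assumed here for illustration.

import os
import sys

def get_distribution(name):
    # Scan sys.path for .egg-info directories and return the path of the
    # first one whose PKG-INFO "Name:" field matches `name`, comparing
    # case-insensitively (the real API would wrap it in a Distribution).
    wanted = name.lower()
    for entry in sys.path:
        if not os.path.isdir(entry):
            continue
        for item in os.listdir(entry):
            if not item.endswith('.egg-info'):
                continue
            pkg_info = os.path.join(entry, item, 'PKG-INFO')
            if not os.path.isfile(pkg_info):
                continue
            with open(pkg_info) as f:
                for line in f:
                    if line.startswith('Name:'):
                        if line.split(':', 1)[1].strip().lower() == wanted:
                            return os.path.join(entry, item)
                        break
    return None

# get_distribution('ConfigObj') and get_distribution('configobj') would then
# find the same .egg-info directory.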
get_installed_files(local=False) -> iterator of (path, md5, size)
Will this also return the directories /created/ during the installation? For example, will it also contain the entry "docutils" along with "docutils/__init__.py"?
I don't think it's necessary to add "docutils" if "docutils/__init__.py" is present. But for empty directories added during installation, we should add them, I guess. So, I'll add a note.
If not, how is the installer (pip, PyPM, etc.) supposed to know which directories to remove (docutils/) and which directories not to remove (site-packages/, bin/, etc.)?
The new version of PEP 345 (XXX work in progress) extends the Metadata standard and fulfills the requirements described in PEP 262, like the REQUIRES section.
Can you tell more about this?
I see that PEP 262 allows both distribution names ('docutils') and modules/packages ('roman.py') in the 'Requires:' section. Is this what the new PEP is going to adhere to? Or is it going to adhere to PEP 345's way of allowing *only* modules/packages?
The plan is to add what setuptools calls "install_requires", so you can tell which *distributions* should be installed, no matter if they are composed of a single module or of many packages.
In practice, I noticed that packages usually specify distribution names in their 'Requires:' file (or install_requires.txt in the case of setuptools). Hence, PyPM *assumes* the install requirements to be distribution names. But then, most distributions have the same name as their primary module/package.
That's it, yes: it will be distribution-aware. If a module or package has the same name as the distribution name, it will make no difference.
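For illustration, this is how the distribution-level requirement looks with setuptools today, together with a guess at how the in-progress PEP 345 revision might spell the same information (the metadata field name was not settled at this point; all names and versions below are made up):

# Distribution-level requirements as expressed today with setuptools.
from setuptools import setup

setup(
    name='ExampleApp',
    version='0.1',
    install_requires=[
        'docutils>=0.4',   # a *distribution* name, not a module name
        'configobj',
    ],
)

# The revised PEP 345 is expected to carry the same distribution-level
# information in PKG-INFO, presumably along these lines (field name assumed):
#
#   Requires-Dist: docutils (>=0.4)
#   Requires-Dist: configobj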
OK, so PEP 345 also specifies the 'Provides:' header. Does easy_install/pip make use of 'Provides:' at all when resolving dependencies? For example, does 'pip install sphinx' go looking for all distributions that 'provide' the 'docutils' provision, or does it simply get the distribution named 'docutils'?
setuptools doesn't. I don't think pip does. btw: is PyPM a public project? Regards Tarek -- Tarek Ziadé | http://ziade.org

2009/7/15 Tarek Ziadé <ziade.tarek@gmail.com>:
On Tue, Jul 14, 2009 at 2:12 AM, Sridhar Ratnakumar<sridharr@activestate.com> wrote:
Here are my comments regarding PEP 376 with respect to PyPM (the Python package manager being developed at ActiveState)
Multiple versions: I understand that the PEP does not support installation (and thus uninstallation) of multiple versions of the same package. Should this be explicitly mentioned in the PEP, given that the `get_distribution` API accepts only a `name` argument and not a `version` argument?
That's another can of worms ;)
:-)
Before I answer, here's a bit of background; it's a bit long but required, sorry.
For people who don't want to read the rest, here's the idea: multiple version support should, imho, be introduced later, if it is to be introduced at all, by extending the PEP 302 protocol.
Disclaimer: I've only read the short version, so if some of this is covered in the longer explanation, I apologise now.

-1.

In my view, multiple version support is not at all related to PEP 302 - or to core Python in general. The import statement has no concept of versions; any version handling is done by explicit user manipulation of sys.path.

PEP 302 is currently purely an import protocol. As such, it only cares about locating the correct code to run to populate sys.modules['foo']. Once the code has been located, there are a number of other details that might be useful, hence the extensions like get_data, get_filename, etc. But note that these are all *loader* extensions - their starting point is an imported module.

The PEP 376 support I've just added is a *finder* extension, which works alongside the scanning of the container - but rather than looking for modules, it's looking for distributions. Disappointingly (for me), it turned out not to give much opportunity to share code - the finder extensions could just as easily have been a completely independent protocol.

PEP 376 support has added a requirement for 3 additional methods to the existing 1 finder method in PEP 302. That's already a 300% increase in complexity. I'm against adding any further complexity to PEP 302 - in all honesty, I'd rather move towards PEP 376 defining its *own* locator protocol and avoid any extra burden on PEP 302. I'm not sure implementers of PEP 302 importers will even provide the current PEP 376 extensions.

I propose that before the current prototype is turned into a final (spec and) implementation, the PEP 302 extensions are extracted and documented as an independent protocol, purely part of PEP 376. (This *helps* implementers, as they can write support for, for example, eggs, without needing to modify the existing egg importer.) I'll volunteer to do that work - but I won't start until the latest iteration of questions and discussions has settled down and PEP 376 has achieved a stable form with the known issues addressed.

Of course, this is moving more and more towards saying that the design of setuptools, with its generic means for locating distributions, etc., is the right approach. We're reinventing the wheel here. But the problem is that too many people dislike setuptools as it stands for it to gain support. My understanding is that the current set of PEPs were intended to be a stripped-down, more generally acceptable subset of setuptools. Let's keep them that way (and omit the complexities of multi-version support). If you want setuptools, you know where to get it...

Paul.
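To give a concrete shape to the "independent locator protocol" idea, here is a very rough sketch of what such a protocol could look like if it lived purely on the PEP 376 side; every class and method name in it is hypothetical and only illustrates the proposal, it is not the prototype's API.

import os

class Distribution(object):
    # Minimal stand-in for PEP 376's Distribution object (illustration only).
    def __init__(self, egg_info_path):
        self.path = egg_info_path
        # A real implementation would read the Name field from PKG-INFO.
        self.name = os.path.basename(egg_info_path)[:-len('.egg-info')]

class DistributionLocator(object):
    # Hypothetical locator protocol, defined purely on the PEP 376 side.
    # A container of installed code (a directory, an egg, a zip file, a
    # database...) would register one of these instead of adding extra
    # methods to its PEP 302 importer.

    def distributions(self):
        raise NotImplementedError

    def get_distribution(self, name):
        for dist in self.distributions():
            if dist.name.lower() == name.lower():
                return dist
        return None

class DirectoryLocator(DistributionLocator):
    # Locator for plain directories containing .egg-info entries.
    def __init__(self, path):
        self.path = path

    def distributions(self):
        for item in os.listdir(self.path):
            if item.endswith('.egg-info'):
                yield Distribution(os.path.join(self.path, item))

# A registry such as `locators = [DirectoryLocator(p) for p in sys.path if os.path.isdir(p)]`
# would then play the role of the current PEP 302 finder extensions.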

Paul Moore wrote:
2009/7/15 Tarek Ziadé <ziade.tarek@gmail.com>:
On Tue, Jul 14, 2009 at 2:12 AM, Sridhar Ratnakumar<sridharr@activestate.com> wrote:
Here are my comments regarding PEP 376 with respect to PyPM (the Python package manager being developed at ActiveState)
Multiple versions: I understand that the PEP does not support installation (and thus uninstallation) of multiple versions of the same package. Should this be explicitly mentioned in the PEP, given that the `get_distribution` API accepts only a `name` argument and not a `version` argument?
That's another can of worms ;)
:-)
Before I answer, here's a bit of background; it's a bit long but required, sorry.
For people who don't want to read the rest, here's the idea: multiple version support should, imho, be introduced later, if it is to be introduced at all, by extending the PEP 302 protocol.
Disclaimer: I've only read the short version, so if some of this is covered in the longer explanation, I apologise now.
-1.
I agree. People with versioning issues *should* be using virtualenv. Michael Foord
In my view, multiple version support is not at all related to PEP 302 - or to core Python in general. The import statement has no concept of versions, any version handling is done by explicit user manipulation of sys.path.
PEP 302 is currently purely an import protocol. As such, it only cares about locating the correct code to run to populate sys.modules['foo']. Once the code has been located, there are a number of other details that might be useful, hence the extensions like get_data, get_filename, etc. But note that these are all *loader* extensions - their starting point is an imported module.
The PEP 376 support I've just added is a *finder* extension, which is working alongside the scanning of the container - but rather than looking for modules, it's looking for distributions. Disappointingly (for me) it turned out not to give much opportunity to share code - the finder extensions could just as easily have been a completely independent protocol.
PEP 376 support has added a requirement for 3 additional methods to the existing 1 finder method in PEP 302. That's already a 300% increase in complexity. I'm against adding any further complexity to PEP 302 - in all honesty, I'd rather move towards PEP 376 defining its *own* locator protocol and avoid any extra burden on PEP 302. I'm not sure implementers of PEP 302 importers will even provide the current PEP 376 extensions.
I propose that before the current prototype is turned into a final (spec and) implementation, the PEP 302 extensions are extracted and documented as an independent protocol, purely part of PEP 376. (This *helps* implementers, as they can write support for, for example, eggs, without needing to modify the existing egg importer). I'll volunteer to do that work - but I won't start until the latest iteration of questions and discussions has settled down and PEP 376 has achieved a stable form with the known issues addressed.
Of course, this is moving more and more towards saying that the design of setuptools, with its generic means for locating distributions, etc etc, is the right approach. We're reinventing the wheel here. But the problem is that too many people dislike setuptools as it stands for it to gain support. My understanding is that the current set of PEPs were intended to be a stripped down, more generally acceptable subset of setuptools. Let's keep them that way (and omit the complexities of multi-version support).
If you want setuptools, you know where to get it...
Paul.

On Wed, Jul 15, 2009 at 12:17 PM, Michael Foord<fuzzyman@voidspace.org.uk> wrote:
Disclaimer: I've only read the short version, so if some of this is covered in the longer explanation, I apologise now.
-1.
I agree. People with versioning issues *should* be using virtualenv.
Let's remove site-packages from Python then.

2009/7/15 Tarek Ziadé <ziade.tarek@gmail.com>:
On Wed, Jul 15, 2009 at 12:17 PM, Michael Foord<fuzzyman@voidspace.org.uk> wrote:
Disclaimer: I've only read the short version, so if some of this is covered in the longer explanation, I apologise now.
-1.
I agree. People with versioning issues *should* be using virtualenv.
Let's remove site-packages from Python then.
If virtualenv/py2exe/cx_Freeze/py2app don't offer a solution, then maybe you're right. For me, py2exe (and a clean virtual machine if I require an absolutely pristine environment; I guess virtualenv would help here too) does what I need for application packaging. But I'll freely admit that my needs are minimal.

Bluntly, as Python stands, import and sys.path do not offer any core support for multiple versions. Custom solutions can be built on top of that - that's what setuptools does. But they are precisely that - custom solutions - and should be supported as such, outside the core (and stdlib).

If standard Python support for multi-version imports is required (it's not by me, but I accept that some people want it), then it should be designed in throughout:

- how do I import version 0.7.1 of package foo, rather than 0.7.2?
- how do I use foo 0.8 at my interactive prompt, and import bar 0.2 which relies on foo 0.7.1?
- what happens if I import foo 2.0 (which requires baz 0.1) and bar 1.5 (which requires baz 0.2)?
- what does "import foo" (without a version number) mean? Is it different if it's at the command line or in bar 0.5 (which explicitly declares a dependency on foo 0.7 in its setup.py)? Does the answer to that mean that imports always need to read dependency information?
- does your head hurt yet? I could go on...

Any PEP dealing with multiple versions should address these issues. It's a big area, and I have no interest in it myself, but I do have an interest in avoiding partial solutions which only look at some of the questions that might arise.

Look - I really, really don't mind if people use setuptools. Honest. But I do mind if core Python gets changed to support little bits of what setuptools does, adding complexity to deal with issues that setuptools handles, but without making it possible to avoid using setuptools. Where's the benefit to anyone then?

Paul.

On Wed, Jul 15, 2009 at 5:14 PM, Paul Moore<p.f.moore@gmail.com> wrote:
2009/7/15 Tarek Ziadé <ziade.tarek@gmail.com>:
On Wed, Jul 15, 2009 at 12:17 PM, Michael Foord<fuzzyman@voidspace.org.uk> wrote:
Disclaimer: I've only read the short version, so if some of this is covered in the longer explanation, I apologise now.
-1.
I agree. People with versioning issues *should* be using virtualenv.
Let's remove site-packages from Python then.
If virtualenv/py2exe/cx_Freeze/py2app don't offer a solution, then maybe you're right.
They do offer a solution, but these solutions are hard to maintain from an OS packager's point of view. In any case I don't see any use case for keeping a "site-packages" in Python itself.
If standard Python support for multi-version imports is required (it's not by me, but I accept that some people want it), then it should be designed in throughout: [..]
Any PEP dealing with multi versions should address these issues. It's a big area, and I have no interest in it myself, but I do have an interest in avoiding partial solutions which only look at some of the questions that might arise.
Look - I really, really don't mind if people use setuptools. Honest. But I do mind if core python gets changed to support little bits of what setuptools does, adding complexity to deal with issues that setuptools handles, but without making it possible to avoid using setuptools. Where's the benefit to anyone then?
I totally agree. But I think that the boundary between what Python+stdlib should provide feature-wise and what third-party packages provide is still fuzzy and should be made clearer.

At some point, third-party projects are trying hard to isolate dependencies because they can't do it with the execution environment Python sets up by default when you launch a script or an interpreter. site.py loads site-packages and user site-packages at startup, basically, and you can use PYTHONPATH / sys.path to add more, but that's partially shared by all apps. To address this issue, a project like zc.buildout will create a local repository of distributions and change sys.path on the fly so the local repository is used. virtualenv, on its side, creates an isolated Python installation for your application.

So if the stdlib doesn't provide a standard protocol that goes further than that, people that have this need will continue to use zc.buildout or virtualenv and install the same libs many times on a system. It's perfectly fine from an application developer's PoV, but it makes site-packages obsolete and every application installed that way looks like a black box to OS packagers. They don't want that. At the end, application developers have to care about the way their applications' dependencies are installed, where they should just give the list of those dependencies and let any package manager project install them.

If these matters are not addressed by the Python stdlib, then we should remove the loading of site-packages at Python startup completely, and make it crystal clear that it's not the core/stdlib's problem. Otherwise we should think hard about how to let OS packagers manage a single repository of Python distributions (and multiple versions) and how application developers could use them from within their applications. This protocol imho could be in the stdlib even if the implementation is outside, but that's just me.

In any case, these issues can be postponed until after PEP 376, because that's another (bigger) part of the puzzle.

Regards Tarek

On Wed, 15 Jul 2009 at 16:14, Paul Moore wrote:
Bluntly, as Python stands, import and sys.path do not offer any core support for multiple versions. Custom solutions can be built on top of that - that's what setuptools does. But they are precisely that - custom solutions, and should be supported as such, outside the core (and stdlib).
If standard Python support for multi-version imports is required (it's not by me, but I accept that some people want it), then it should be designed in thoughout:
Isn't this problem directly analogous to the '.so version' (*) problem? Can we learn anything from the solution to that problem? Does not the fact that a (standard) solution to that problem was required indicate that a similar solution needs to be provided by core Python? (And yes, it's out of scope for PEP 376). --David (*) or, for our Windows users, DLL Hell and the Side By Side Component Sharing solution...

On Wed, Jul 15, 2009 at 12:10 PM, Paul Moore<p.f.moore@gmail.com> wrote:
Disclaimer: I've only read the short version, so if some of this is covered in the longer explanation, I apologise now.
Next time I won't put a short version ;)
PEP 376 support has added a requirement for 3 additional methods to the existing 1 finder method in PEP 302. That's already a 300% increase in complexity. I'm against adding any further complexity to PEP 302 - in all honesty, I'd rather move towards PEP 376 defining its *own* locator protocol and avoid any extra burden on PEP 302. I'm not sure implementers of PEP 302 importers will even provide the current PEP 376 extensions.
I propose that before the current prototype is turned into a final (spec and) implementation, the PEP 302 extensions are extracted and documented as an independent protocol, purely part of PEP 376. (This *helps* implementers, as they can write support for, for example, eggs, without needing to modify the existing egg importer). I'll volunteer to do that work - but I won't start until the latest iteration of questions and discussions has settled down and PEP 376 has achieved a stable form with the known issues addressed.
Sure, that makes sense. I am all for having these 302 extensions flipped onto the PEP 376 side, then thinking about the "locator" protocol. I am lagging a bit in the discussions (I have 10 messages or so left to read), but the known issues I've listed so far are about the RECORD file and absolute paths; I am waiting for PJE's example of the syntax he proposed for prefixes, using the docutils example.
Of course, this is moving more and more towards saying that the design of setuptools, with its generic means for locating distributions, etc etc, is the right approach. We're reinventing the wheel here. But the problem is that too many people dislike setuptools as it stands for it to gain support.
I don't think it's about setuptools' design. I think it's more likely to be about the fact that there's no way in Python to install two different versions of the same distribution without "hiding" one from the other, using setuptools, virtualenv or zc.buildout. "Installing" a distribution in Python means that it is activated globally, whereas people need it locally at the application level.
My understanding is that the current set of PEPs were intended to be a stripped down, more generally acceptable subset of setuptools. Let's keep them that way (and omit the complexities of multi-version support).
If you want setuptools, you know where to get it...
Sure, but let's not forget that the multiple-version issue is a global issue OS packagers also meet (setuptools is not the problem):

- application Foo uses docutils 0.4 and doesn't work with docutils 0.5
- application Bar uses docutils 0.5

If docutils 0.5 is installed, Foo is broken, unless docutils 0.4 is shipped with it. So right now application developers must use tools to isolate their dependencies from the rest of the system, using virtualenv or zc.buildout, and ship the dependencies with them, so they are sure that their applications are not broken by the system. This is not optimal of course, because you sometimes end up with several occurrences of the same versions, and it can be a nightmare for OS packagers and maintainers. And virtualenv and such tools are now required when you develop applications (mid-size and large ones), and the "good practice" is to avoid any installation of any distributions in Python itself.

So basically "site-packages" is a distribution location that is avoided by everyone because it doesn't know how to handle multiple versions. If we had a multi-versions support protocol, that would help OS packagers and application developers to be friends again imho ;)

Regards Tarek -- Tarek Ziadé | http://ziade.org

So basically "site-packages" is a distribution location that is avoided by everyone because it doesn't know how to handle multiple versions. I think you overrate the importance of having multiple versions of a
If we had a multi-versions support protocol, that would help os packagers and application developers to be friends again imho ;)
Let's remove site-packages from Python then. The _one_ site-packages folder stands for _one_ python interpreter. All
Tarek Ziadé wrote: package available for the same python interpreter. If you have m different versions of n packages then you could have n**m different combinations for an application so you need a possiblilty to select one combination from n**m possible ones at application startup time. Is this really worth it? the clever efforts to provide a set of package versions at runtime to an application (that uses the singleton python interpreter) do logically create a new python interpreter with a site-packages folder that contains just the versions of the packages the application needs, unfortunately by mucking with PYTHONPATH and <package>.pth, site.py etc making it very difficult to understand what is happening for the joe average python developer.

On Wed, Jul 15, 2009 at 5:16 PM, Joachim König<him@online.de> wrote:
Tarek Ziadé wrote:
So basically "site-packages" is a distribution location that is avoided by everyone because it doesn't know how to handle multiple versions.
I think you overrate the importance of having multiple versions of a package available for the same python interpreter. If you have m different versions of each of n packages then you could have m**n different combinations for an application, so you need a possibility to select one combination from the m**n possible ones at application startup time. Is this really worth it?
When you create an application that uses several libs, and when you create your distribution, you end up pinning a version for each one of your dependencies to avoid any problems. While it's workable in a small application to list the dependencies in a text file and let your users install them manually, it's impossible in bigger applications. Applications based on Zope, for example, have **hundreds** of dependencies you need to have installed, in very specific versions. If you install a Zope 2 app on your system, and a Zope 3 one, you have a good chance of breaking them because the "zope.schema" distribution is incompatible. So either each application ships its own collection of dependencies and ignores site-packages (that's what Zope-based applications do with zc.buildout), or they have a way to pick the right version of each package.
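For illustration, this is roughly what a pinned application bootstrap boils down to with today's tools (the scripts generated by zc.buildout or easy_install -m arrange something equivalent); the package names and version numbers are made up.

# Hypothetical application bootstrap: pin the exact versions this
# application was tested with before importing anything.
import pkg_resources

PINNED = [
    'zope.schema==3.5.0',   # made-up versions, for illustration only
    'docutils==0.4',
]

for requirement in PINNED:
    pkg_resources.require(requirement)

import docutils   # now guaranteed to be 0.4, or require() raised an error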
If we had a multi-versions support protocol, that would help os packagers and application developers to be friends again imho ;) Let's remove site-packages from Python then.
The _one_ site-packages folder stands for _one_ python interpreter. All the clever efforts to provide a set of package versions at runtime to an application (that uses the singleton python interpreter) do logically create a new python interpreter with a site-packages folder that contains just the versions of the packages the application needs, unfortunately by mucking with PYTHONPATH and <package>.pth, site.py etc making it very difficult to understand what is happening for the joe average python developer.
That's what we have nowadays: Python packages installed in different places, and scripts tweaking the path to launch an application with them.

At 05:16 PM 7/15/2009 +0200, Joachim König wrote:
If you have m different versions of each of n packages then you could have m**n different combinations for an application, so you need a possibility to select one combination from the m**n possible ones at application startup time. Is this really worth it?
Obviously yes, as neither buildout nor setuptools would exist otherwise. ;-) Nor would Fedora be packaging certain library versions as eggs specifically to get certain multi-version scenarios to work. The specific solutions for handling n*m problems aren't fantastic, but they are clearly needed. (Buildout, btw, actually hardwires the n*m choice at install time, which is probably better for production situations than setuptools' dynamic approach.)

P.J. Eby wrote:
At 05:16 PM 7/15/2009 +0200, Joachim König wrote:
If you have m different versions of each of n packages then you could have m**n different combinations for an application, so you need a possibility to select one combination from the m**n possible ones at application startup time. Is this really worth it?
Obviously yes, as neither buildout nor setuptools would exist otherwise. ;-) Nor would Fedora be packaging certain library versions as eggs specifically to get certain multi-version scenarios to work.
The specific solutions for handling n*m problems aren't fantastic, but they are clearly needed. I still do not see the need.
IMO the whole obfuscation comes from the fact that all versions of all packages are installed into one location where Python automatically looks for packages, and then with a lot of magic the packages are hidden from the interpreter and only specific requested versions are made "visible" to the interpreter at runtime. Why do the packages have to be installed there in the first place? For an application it would be enough to have an additional directory on its PYTHONPATH where the packages required for this application would be installed. So a package could be installed either to the common directory ("site-packages") or to an application-specific directory (e.g. something like "app-packages/<appname>/"). This approach has been used by Zope 2 with its "private" lib/python directory for years. So one would have to set up the application-specific packages before running the application, but the whole clutter of uncounted versions of the same package in one directory could go away. The "drawback" of this approach would be that the same version of a package would have to be installed multiple times if needed by different applications.
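A tiny sketch of that idea, assuming a hypothetical app-packages/<appname>/ layout (the paths and package names are made up): the application's launcher simply puts its private directory at the front of sys.path before importing anything.

import os
import site
import sys

APP_PACKAGES = '/usr/lib/app-packages/myapp'   # assumed layout, not an existing convention

if os.path.isdir(APP_PACKAGES):
    site.addsitedir(APP_PACKAGES)        # also processes .pth files, like site-packages does
    sys.path.remove(APP_PACKAGES)        # then make sure the private directory
    sys.path.insert(0, APP_PACKAGES)     # wins over the shared site-packages

import myapp                             # 'myapp' is a made-up package name
myapp.main()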

Joachim König wrote:
So one would have to set up the application specific packages before running the application, but the whole clutter of uncounted versions of the same package in one directory could go away. The "drawback" of this approach would be, that the same version of a package would have to be installed multiple times if needed by different applications.
While this is a very common practice in the Windows world, it is far less common in the *nix world of vendor-managed packaging systems.

As for why it can be a problem: it (bundling libraries with applications) makes security vulnerability management a *lot* more difficult for system administrators. If a bug is found in a key library (e.g. openssl), a dependency-based system just needs to update the single shared copy of that library. In a bundling system, you first have to work out which of your applications contain an instance of that library and then see if the application vendors have provided a security patch. If any one of them hasn't released a patch and you can't patch it yourself, then you either have to stop using that application or else accept remaining exposed to the vulnerability.

The bundling approach also leads to applications being much bigger than they need to be. That isn't much of a problem for desktop or server systems these days, but can still be an issue in the embedded world.

As far as the idea of making bundling easier goes, we already implemented that in 2.6 and 3.0. It's the whole reason that zip files and directories are directly executable now: the named zip file or directory itself is automatically added to sys.path, so the top-level "__main__.py" in that location can freely import any other co-located modules and packages.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
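As a concrete example of the bundling support mentioned above (executable zip files with a top-level __main__.py, available since Python 2.6), here is a sketch that builds such an archive; all file names are made up for illustration.

# Since Python 2.6, `python myapp.zip` puts the zip itself on sys.path and
# runs its top-level __main__.py.
import zipfile

archive = zipfile.ZipFile('myapp.zip', 'w', zipfile.ZIP_DEFLATED)
archive.write('main.py', '__main__.py')      # entry point, stored as __main__.py
archive.write('myapp/__init__.py')           # the application's own package
archive.write('docutils/__init__.py')        # a bundled dependency (plus the rest of it)
archive.close()

# $ python myapp.zip
# __main__.py runs, and can simply `import myapp` and `import docutils`.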

On Wed, Jul 15, 2009 at 11:00 PM, Tarek Ziadé<ziade.tarek@gmail.com> wrote:
On Wed, Jul 15, 2009 at 12:10 PM, Paul Moore<p.f.moore@gmail.com> wrote:
Disclaimer: I've only read the short version, so if some of this is covered in the longer explanation, I apologise now.
Next time I won't put a short version ;)
PEP 376 support has added a requirement for 3 additional methods to the existing 1 finder method in PEP 302. That's already a 300% increase in complexity. I'm against adding any further complexity to PEP 302 - in all honesty, I'd rather move towards PEP 376 defining its *own* locator protocol and avoid any extra burden on PEP 302. I'm not sure implementers of PEP 302 importers will even provide the current PEP 376 extensions.
I propose that before the current prototype is turned into a final (spec and) implementation, the PEP 302 extensions are extracted and documented as an independent protocol, purely part of PEP 376. (This *helps* implementers, as they can write support for, for example, eggs, without needing to modify the existing egg importer). I'll volunteer to do that work - but I won't start until the latest iteration of questions and discussions has settled down and PEP 376 has achieved a stable form with the known issues addressed.
Sure, that makes sense. I am all for having these 302 extensions flipped onto the PEP 376 side, then thinking about the "locator" protocol.
I am lagging a bit in the discussions (I have 10 messages or so left to read), but the known issues I've listed so far are about the RECORD file and absolute paths; I am waiting for PJE's example of the syntax he proposed for prefixes, using the docutils example.
Of course, this is moving more and more towards saying that the design of setuptools, with its generic means for locating distributions, etc etc, is the right approach. We're reinventing the wheel here. But the problem is that too many people dislike setuptools as it stands for it to gain support.
I don't think it's about setuptools' design. I think it's more likely to be about the fact that there's no way in Python to install two different versions of the same distribution without "hiding" one from the other, using setuptools, virtualenv or zc.buildout.
"Installing" a distribution in Python means that it is activated globally, whereas people need it locally at the application level.
My understanding is that the current set of PEPs were intended to be a stripped down, more generally acceptable subset of setuptools. Let's keep them that way (and omit the complexities of multi-version support).
If you want setuptools, you know where to get it...
Sure, but let's not forget that the multiple-version issue is a global issue OS packagers also meet. (setuptools is not the problem) :
- application Foo uses docutils 0.4 and doesn't work with docutils 0.5
- application Bar uses docutils 0.5
if docutils 0.5 is installed, Foo is broken, unless docutils 0.4 is shipped with it.
As was stated by Debian packagers on the distutils ML, the problem is that docutils 0.5 breaks packages which work with docutils 0.4 in the first place.
http://www.mail-archive.com/distutils-sig@python.org/msg05775.html
And the current hacks to work around the lack of explicit version handling for module import are a maintenance burden:
http://www.mail-archive.com/distutils-sig@python.org/msg05742.html
setuptools has given the incentive to use versioning as a workaround for API/ABI compatibility problems. That's the core problem, and most problems brought by setuptools (sys.path and .pth hacks, with the unreliability which ensued) are consequences of this. I don't see how virtualenv solves anything in that regard for deployment issues. I doubt using things like virtualenv will make OS packagers happy.
David

2009/7/15 David Cournapeau <cournape@gmail.com>:
As was stated by Debian packagers on the distutils ML, the problem is that docutils 0.5 breaks packages which work with docutils 0.4 in the first place.
http://www.mail-archive.com/distutils-sig@python.org/msg05775.html
And the current hacks to work around the lack of explicit version handling for module import are a maintenance burden:
http://www.mail-archive.com/distutils-sig@python.org/msg05742.html
setuptools has given the incentive to use versioning as a workaround for API/ABI compatibility. That's the core problem, and most problems brought by setuptools (sys.path and .pth hacks with the unreliability which ensued) are consequences of this. I don't see how virtualenv solves anything in that regard for deployment issues. I doubt using things like virtualenv will make OS packagers happy.
So, I think what you're saying is:

- The real issue is packages not maintaining backward compatibility (I agree)
- Setuptools is a workaround (I agree, at least it's *one* workaround)
- Virtualenv isn't a workaround (I don't know virtualenv, I'll take your word for it)

Three points:

- When building *applications*, bundling everything (py2exe-style) is an alternative workaround - universal on Windows, apparently uncommon on Unix/Linux.
- When managing multiple packages in a "toolkit"-style interactive Python installation, I'm not aware of a good solution (other than avoiding code that hits this issue in the first place).
- I do not believe that it's clear that sanctioning the setuptools workaround as the "right" approach, by building it into the Python core/stdlib, is the right thing to do.

Paul.

On Wed, 15 Jul 2009 08:22:03 -0700, David Cournapeau <cournape@gmail.com> wrote:
if docutils 0.5 is installed, Foo is broken, unless docutils 0.4 is shipped with it. As was stated by Debian packagers on the distutils ML, the problem is that docutils 0.5 breaks packages which work with docutils 0.4 in the first place.
Thus I am -1 on multi-version support, and would rather have Python developers make their packages backward compatible, just like what Armin did with Jinja and Jinja2. Debian at the moment has only one package, "python-docutils", which is 0.5. How is a Debian application supposed to depend upon 0.4? With Jinja there is no problem: there are 'python-jinja' and 'python-jinja2'. Similarly, CherryPy should have gone with renaming their package names to CherryPy2 and CherryPy3. -srid PS: Quoting from a fellow developer:
[...] it sounds like CherryPy 3.x is really an incompatible module that just kept the old name. That is rather unfortunate, but can sometimes happen. I have never seen a Python package changing its name (when significantly changing API, design, etc..). The only exception that comes to my mind is Jinja2 (renamed from 'Jinja'): [Armin] (...) Because we love backwards compatibility these changes will go into a package called "jinja2" <http://lucumr.pocoo.org/2008/4/13/jinja2-making-things-awesome>
Well, congrats to the Jinja team then! The others will eventually learn... Mixing incompatible APIs in a single namespace and using a non-standardized version numbering system to keep things apart creates a world of pain!
The challenge however is in compartmentalizing versions according to their major release numbers. Versioning in the Python world is already a mess which we are beginning to sort out: http://wiki.python.org/moin/Distutils/VersionComparison (PyPM relies on this standard and the ongoing implementation - verlib.py)
How embarrassing for a cult that prides itself on having only one way for everything they do... CPAN version numbers are just as much if not more of a mess, but at least you can argue that it is the price for there being "more than one way to do it"!
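Since the PEP 386 versioning work and its verlib.py reference implementation are mentioned above, here is a small sketch of how that comparison is meant to be used; the NormalizedVersion / suggest_normalized_version API is assumed from verlib, which was still evolving at the time, so treat this as illustrative only.

# Sketch of PEP 386-style version comparison, assuming verlib's API.
from verlib import NormalizedVersion, suggest_normalized_version

assert NormalizedVersion('0.4') < NormalizedVersion('0.5')
assert NormalizedVersion('1.0c1') < NormalizedVersion('1.0')

# Non-conforming legacy versions can first be run through the suggester:
print(suggest_normalized_version('1.0alpha2'))   # expected: '1.0a2'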

On Wed, 15 Jul 2009 02:01:24 -0700, Tarek Ziadé <ziade.tarek@gmail.com> wrote:
get_installed_files(local=False) -> iterator of (path, md5, size)
Will this also return the directories /created/ during the installation? For example, will it also contain the entry "docutils" .. along with "docutils/__init__.py"?
I don't think it's necessary to add "docutils" if "docutils/__init__.py" is present
But for empty directories added during installation, we should add them, I guess.
So, I'll add a note.
It seems that you overlooked the paragraph below.
If not, how is the installer (pip, pypm, etc..) supposed to know which directories to remove (docutils/) and which directories not to remove (site-packages/, bin/, etc..)?
Quoting from the PEP: "(...) uninstall uses the APIs described earlier and remove all unique files, as long as their hash didn't change. Then it removes empty directories left behind."

Let's assume that site-packages/ contained only one package, 'Foo'. Will uninstall('Foo') remove the site-packages/ directory just because it turned out to be empty after removing 'Foo'? To explain, let's assume the RECORD of 'Foo' contains:

$ cat RECORD
Foo/__init__.py
Foo/bar/__init__.py
Foo/bar/test.py

and according to what you wrote in the PEP ("it removes empty directories left behind"):

$ python -m distutils.util.uninstall Foo
rm /.../site-packages/Foo/__init__.py
rm /.../site-packages/Foo/bar/__init__.py
rm /.../site-packages/Foo/bar/test.py
rm empty dir /.../site-packages/Foo/bar
rm empty dir /.../site-packages/Foo/
rm empty dir /.../site-packages/     # !!!!!

it also removes the site-packages directory! Then there is ~/python26/bin, ~/python26/include, ~/python26/etc, etc. Do you see my point?
btw: is PyPM a public project ?
If by 'public', you meant open source, then no. -srid

2009/7/15 Sridhar Ratnakumar <SridharR@activestate.com>:
On Wed, 15 Jul 2009 02:01:24 -0700, Tarek Ziadé <ziade.tarek@gmail.com> wrote:
get_installed_files(local=False) -> iterator of (path, md5, size)
Will this also return the directories /created/ during the installation? For example, will it also contain the entry "docutils" .. along with "docutils/__init__.py"?
I don't think it's necessary to add "docutils" if "docutils/__init__.py" is present
But for empty directories added during installation we should add the I guess.
So, I'll add a note.
It seems that you overlooked the paragraph below.
If not, how is the installer (pip, pypm, etc..) supposed to know which directories to remove (docutils/) and which directories not to remove (site-packages/, bin/, etc..)?
Quoting from the PEP:
[quote]'(...)uninstall uses the APIs described earlier and remove all unique files, as long as their hash didn't change. Then it removes empty directories left behind.'[endquote]
Let's assume that site-packages/ contained only one package 'Foo'. Will uninstall('Foo') remove the site-packages/ directory just because it turned out to be empty after removing 'Foo'? To explain, let's assume the RECORD of 'Foo' contains:
$ cat RECORD
Foo/__init__.py
Foo/bar/__init__.py
Foo/bar/test.py
and according to what you wrote in the PEP ("it removes empty directories left behind"):
$ python -m distutils.util.uninstall Foo
rm /.../site-packages/Foo/__init__.py
rm /.../site-packages/Foo/bar/__init__.py
rm /.../site-packages/Foo/bar/test.py
rm empty dir /.../site-packages/Foo/bar
rm empty dir /.../site-packages/Foo/
rm empty dir /.../site-packages/     # !!!!!
it also removes the site-packages directory!
Then there is ~/python26/bin, ~/python26/include, ~/python26/etc, etc.. Do you see my point?
I didn't mean that, of course. While we can avoid your example for the code, by removing only package directories that are fully emptied and that sit alongside the egg-info directory, we might not be able to do it properly for the data. So let's add the directories as well.
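To make that rule concrete, here is a rough sketch of an uninstaller that removes the files listed in RECORD (when their hash still matches) and then prunes only directories that became empty and sit below the distribution's install root, so shared locations like site-packages/ itself or bin/ are never touched; the helper name and signature are assumptions, not the PEP's API.

import hashlib
import os

def uninstall(record_entries, egg_info_dir):
    # record_entries: iterable of (path, md5, size) as in PEP 376's RECORD
    # (absolute paths assumed here). egg_info_dir is the distribution's
    # .egg-info directory; its parent is treated as the install root.
    install_root = os.path.dirname(egg_info_dir)
    touched_dirs = set()

    for path, md5sum, size in record_entries:
        if not os.path.exists(path):
            continue
        with open(path, 'rb') as f:
            if hashlib.md5(f.read()).hexdigest() != md5sum:
                continue                      # file was modified: leave it alone
        os.remove(path)
        touched_dirs.add(os.path.dirname(path))

    # Prune directories that became empty, deepest first, but never climb
    # out of the install root (so site-packages/, bin/, etc. are kept).
    for directory in sorted(touched_dirs, key=len, reverse=True):
        while (os.path.isdir(directory)
               and directory.startswith(install_root + os.sep)
               and not os.listdir(directory)):
            os.rmdir(directory)
            directory = os.path.dirname(directory)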