Add a site.cfg to keep a persistent list of paths

Hello

There's one feature I want to add in distutils2: the develop command setuptools provides. Basically it adds a "link" file into site-packages, and does some magic at startup to load the path that is contained in the link file. The use case is to be able to have a project added in the Python path without installing it.

I am not a huge fan of adding files in site-packages for this though, nor of the magic it implies. I thought of another mechanism: a persistent list of paths that site.py would load. So the idea is to have two files:

- a site.cfg at the Python level, with a persistent list of paths
- a .local/site.cfg at the user level, for user-defined paths

Then distutils2 would add/remove paths in these files in its develop command. This file could contain paths and also possibly sitedirs.

Does this sound crazy?

Tarek
-- Tarek Ziadé | http://ziade.org

On Tue, Oct 19, 2010 at 4:26 PM, Tarek Ziadé <ziade.tarek@gmail.com> wrote:
The link file is a red herring -- setuptools adds an entry to easy-install.pth that points to the directory. It would work equally well to add a .pth file for the specific package (though .pth files append to the path, so if you already have a package installed and then add a .pth file pointing to a development version, it won't work as expected, hence the magic in easy-install.pth). -- Ian Bicking | http://blog.ianbicking.org
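A minimal sketch of the "append" behaviour mentioned above, using only the standard library's site.addsitedir(). The directory names and the develop.pth file name are made up for the illustration; this is not what setuptools itself does.

```python
import os
import site
import sys
import tempfile

# Hypothetical layout: a throwaway "site dir" holding one .pth file that
# points at a throwaway "development checkout".
site_dir = tempfile.mkdtemp()
dev_dir = tempfile.mkdtemp()

with open(os.path.join(site_dir, "develop.pth"), "w") as f:
    f.write(dev_dir + "\n")

before = list(sys.path)

# addsitedir() adds site_dir itself and then processes each *.pth file in it.
site.addsitedir(site_dir)

# The new entries land at the end of sys.path, so anything already importable
# from an earlier entry (e.g. an installed copy) still wins.
added = [p for p in sys.path if p not in before]
print(added)
```

Because the entries are appended, an installed copy of the same package found earlier in sys.path would shadow the development copy, which is the problem described above.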

On Wed, Oct 20, 2010 at 12:03 AM, Ian Bicking <ianb@colorstudy.com> wrote:
Yes, or a develop.pth file containing those paths, like Carl proposed on IRC. A .cfg does not really help, indeed. But we would need to have the metadata built and stored somewhere. Maybe a specific directory for them.
-- Ian Bicking | http://blog.ianbicking.org
-- Tarek Ziadé | http://ziade.org

On 19 October 2010 22:26, Tarek Ziadé <ziade.tarek@gmail.com> wrote:
Can you explain the requirement in more detail? I don't use the setuptools develop command, so I don't have the background, but it seems to me that what you're proposing can be done simply by adding the relevant directory to PYTHONPATH. That's all I ever do when developing (but my needs are pretty simple, so there may well be subtle problems with that approach). Paul

On Wed, Oct 20, 2010 at 11:57 AM, Paul Moore <p.f.moore@gmail.com> wrote:
Sorry, that was vague indeed. It goes a little bit further than that: the project packages and modules have to be found in the path, but we also need to publish the project metadata that would be installed in a normal installation, so our browsing/query APIs can find the project.

So, if a project 'Boo' has two packages 'foo' and 'bar' and a module 'baz.py', we need those in the path but also the Boo.dist-info directory that is created at installation time (see PEP 376). Setuptools' metadata directory is called Boo.egg-info, and distutils 1 has a file called Boo.egg-info since Python 2.5.

And since a Python project can publish several top-level directories, all of them need to be added in the path. So adding the current dir to PYTHONPATH will not work in every case, even if the metadata are built and dropped there.

I am not sure what would be the best way to handle this; maybe having these metadata built in place, then listing all the paths that need to be included and writing them to a .pth file that Distutils2 manages. So:

0. have a distutils2.pth file installed with distutils2

Then, to add the project in the path:

1. build the project metadata in-place
2. get the project paths by listing its packages and directories (by invoking a pseudo-install command)
3. inject these paths in distutils2.pth

To remove it:

1. get the project paths by listing its packages and directories
2. remove these paths from distutils2.pth

Another problem I see is that any module or package that is not listed by the project and that would not be installed in the site-packages might be added in the path, but that's probably not a huge issue.

The goal is to be able to avoid re-installing a project you are working on to try it, every time you make a change. This is used a lot, and in particular with virtualenv.

So in any case, it turns out .pth files are a good way to do this, so I guess this thread does not belong to python-ideas anymore. Cross-posting to the D2 mailing list to move it there!
Tarek -- Tarek Ziadé | http://ziade.org
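The add/remove steps on distutils2.pth can be sketched as follows. This is only an illustration under stated assumptions: add_paths(), remove_paths() and the helper names are hypothetical (they are not distutils2 APIs), and the project paths are passed in directly rather than obtained from a pseudo-install command.

```python
import os
import tempfile


def read_paths(pth_file):
    """Return the paths stored in pth_file, or an empty list if it is missing."""
    if not os.path.exists(pth_file):
        return []
    with open(pth_file) as f:
        return [line.strip() for line in f if line.strip()]


def write_paths(pth_file, paths):
    """Rewrite pth_file with one path per line."""
    with open(pth_file, "w") as f:
        f.writelines(p + "\n" for p in paths)


def add_paths(pth_file, new_paths):
    """Inject new_paths into pth_file, skipping entries already present."""
    paths = read_paths(pth_file)
    for p in new_paths:
        if p not in paths:
            paths.append(p)
    write_paths(pth_file, paths)


def remove_paths(pth_file, old_paths):
    """Drop old_paths from pth_file, leaving other entries untouched."""
    kept = [p for p in read_paths(pth_file) if p not in old_paths]
    write_paths(pth_file, kept)


# Hypothetical usage with a throwaway file standing in for distutils2.pth.
pth = os.path.join(tempfile.mkdtemp(), "distutils2.pth")
add_paths(pth, ["/work/Boo", "/work/Boo/lib"])
add_paths(pth, ["/work/Boo"])          # adding twice does not duplicate
after_add = read_paths(pth)
remove_paths(pth, ["/work/Boo/lib"])
after_remove = read_paths(pth)
```

Keeping one path per line matches the format site.py already reads from .pth files, so no extra startup code would be needed for the paths themselves.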

On 20 October 2010 14:36, Tarek Ziadé <ziade.tarek@gmail.com> wrote:
Maybe I'm still missing something, but are you saying that the metadata query APIs don't respect PYTHONPATH? Is there a reason why they can't?
... and I'd expect the dist-info directory to be located by searching PYTHONPATH
So, project Foo publishes packages bar and baz.

MyDir
    Foo
        __init__.py
        bar
            __init__.py
        baz
            __init__.py
    Foo-N.M-pyx.y.dist-info

(Is that right? I'm rusty on the structure. That's how it looks in Python 2.7)

So the directory MyDir is on PYTHONPATH. Then Foo.bar and Foo.baz are visible, and the dist-info file is on PYTHONPATH for introspection.

If you're saying that Foo *isn't* a package itself, so Foo/__init__.py doesn't exist, and bar and baz should be visible unqualified, then I begin to see your issue (although my first reaction is to say "don't do that, then" :-)). But don't you then just need to search *parents* of elements of PYTHONPATH as well for the metadata search? If that's an issue then doesn't that mean you've got other problems with how people structure their directories?

Actually, I suspect my picture above is wrong, as I can't honestly see that mandating that the dist-info file be a *sibling* (in an arbitrarily cluttered directory) of the project directory is sensible...

But I'm probably not seeing the real issues here. All I would say is, don't let the needs of more unusual configurations over-complicate basic usage.

Paul.
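A small sketch of the metadata search being discussed: look for *.dist-info directories directly inside each path entry. find_dist_info() is a hypothetical helper written for this illustration, not the PEP 376 API, and the Foo-1.0.dist-info name is invented.

```python
import os
import tempfile


def find_dist_info(paths):
    """Return the *.dist-info directories found directly inside each path entry."""
    found = []
    for entry in paths:
        if not os.path.isdir(entry):
            continue
        for name in sorted(os.listdir(entry)):
            full = os.path.join(entry, name)
            if name.endswith(".dist-info") and os.path.isdir(full):
                found.append(full)
    return found


# Build the layout from the message above in a throwaway directory.
my_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(my_dir, "Foo", "bar"))
os.makedirs(os.path.join(my_dir, "Foo", "baz"))
os.makedirs(os.path.join(my_dir, "Foo-1.0.dist-info"))

# With MyDir on the path, the metadata is a direct child and is found.
on_parent = find_dist_info([my_dir])

# With only MyDir/Foo on the path (bar and baz importable unqualified),
# the dist-info directory is a sibling of the entry and is missed.
on_project = find_dist_info([os.path.join(my_dir, "Foo")])
```

The second call is the case where searching the parents of path entries, or listing the paths explicitly, would be needed.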

On Wed, Oct 20, 2010 at 4:00 PM, Paul Moore <p.f.moore@gmail.com> wrote: ...
Yeah, that's the main issue: we can't make assumptions on how the source tree looks in the project, so adding the root path will not work all the time. Some people even have two separate root packages, which is not a good layout, but allowed. In Zope, I think the convention is to use a src/ directory, so that's another level. Since distutils1 and distutils2 let you provide a list of packages and modules in their options, I think it's the only sane way to get a list of paths we can then add in the path.
The trouble is: adding the root of your project's source to PYTHONPATH can give a different result from what it would be once installed in Python. Now the question is: if 90% of the projects out there would work by adding the root, then this might be overkill. I am afraid it's way less though... Tarek -- Tarek Ziadé | http://ziade.org

On Wed, Oct 20, 2010 at 9:27 AM, Tarek Ziadé <ziade.tarek@gmail.com> wrote:
Setuptools puts the files in the src/ directory in that case. More complicated layouts simply aren't supported, and generally no one complains as more complicated layouts are uncommon and a sign someone's head is somewhere very different than where they would be if they were using setup.py develop. -- Ian Bicking | http://blog.ianbicking.org

[sorry, forgot to include the list address before]

On 20 October 2010 15:27, Tarek Ziadé <ziade.tarek@gmail.com> wrote:
Hi

I've read your and Ian's responses and still don't understand what setup.py develop brings to the party which can't be done with simple PYTHONPATH. Excuse me if I also completely misunderstand what develop does, but it sounds like it's going to add an in-development version of a project on a user's sys.path (at the front?) until it's undone again somehow (is there a "setup.py undevelop"?). This just seems dangerous to me since it will affect all Python programs run by that user.

If I understand correctly, this whole "develop" dance is for when you have two interdependent packages in development at the same time. If manually setting PYTHONPATH correctly in this situation is too complicated, then my feeling is there's nothing wrong with some sort of helper which manipulates PYTHONPATH for you, something like spawning a new shell and setting the environment in that correctly. But placing things in files makes this permanent for the user and just seems the wrong way to go to me.

Again, apologies if I understand the problem wrongly. But I too am worried about too many complexities and "magic". One of my main issues with setuptools is that it tries to handle my Python environment (sys.path) outside of normally expected Python mechanisms by modifying various custom files. I would hate to see distutils2 repeat this.

Regards
Floris
-- Debian GNU/Linux -- The Power of Freedom www.debian.org | www.gnu.org | www.kernel.org

On 21 October 2010 00:35, Floris Bruynooghe <flub@devork.be> wrote:
I'm glad it's not just me!
I think that the key issue here is that PEP 376 introduces the idea of a "distribution" which is a somewhat vaguely defined concept, which can contain one or more packages or modules. Distributions don't have a well-defined directory structure, and don't participate properly in Python's standard import mechanism (PEP 302, PYTHONPATH, all that stuff). The distribution metadata (dist-info directory) is not package-based, and so doesn't fit the model.

Suggestions:

1. PEP 376 should clearly define what a "distribution" (installed or otherwise) is, in terms of directory structure, whether/how it supports PEP302-style non-filesystem access, etc. I don't see a reason here why we can't mandate some structure, rather than leaving things as a "free for all" like the current setuptools/adhoc approach.
2. Mechanisms for dealing with distributions should *only* be discussed in terms of the PEP 376 definitions, so we have a common understanding.

As a first cut, I'd say that a distribution is defined purely in terms of its metadata (dist-info directory). On that basis, there should be a definition of where dist-info directories are searched for. PEP 376 seems to state that this is only in site-packages ("This PEP proposes an installation format inspired by one of the options in the EggFormats standard, the one that uses a distinct directory located in the site-packages directory."). And yet, this whole "develop" discussion seems to be about locating dist-info directories located elsewhere.

Having said that, PEP 376 later states:

get_distributions() -> iterator of Distribution instances. Provides an iterator that looks for .dist-info directories in sys.path and returns Distribution instances for each one of them.

This implies dist-info directories are searched for in sys.path. OK, fine. That's broader than just site-packages, but still well-defined and acceptable. And that's where I get my expectations that manipulating PYTHONPATH should work.
So what's this directory structure we're talking about with Foo containing two packages, and Foo.dist-info being alongside Foo? Foo itself isn't on PYTHONPATH, so why should Foo.dist-info be found at all? Based on PEP 376, it's not meant to be found. Maybe if this *is* a requirement, it needs a change to PEP 376, which I guess means the PEP discussion and approval process needs to be gone through again. I, for one, would be OK with that, as I remain to be convinced that the complexity and confusion is worth it. Paul.

On Oct 21, 2010, at 7:21 AM, Paul Moore wrote:
Using develop does more than just modify the import path. It also generates the metadata, such as entry points, and re-generates any console scripts defined by my setup.py so that they point to the version of code in the sandbox. After I run develop, any Python process on the system using the same Python interpreter will run the code in my sandbox instead of the version "installed" in site-packages. That includes any of the command line programs or plugins defined in my setup.py, and even applies to processes that don't run as my user. I use these features every day, since our application depends on a few daemons that run as root (it's a system management app, so it needs root privileges to do almost anything interesting). Doug

On 21 October 2010 13:14, Doug Hellmann <doug.hellmann@gmail.com> wrote:
Note - my understanding is that this discussion is about metadata discovery for distutils2, *not* about setuptools' develop feature (which AIUI does far more than is being proposed at the moment). Specifically, I thought we were just talking about metadata here. As far as this discussion goes, entry points and console scripts aren't included. That's not to say they aren't useful, just that they are a separate discussion. In case it's not obvious, I'm a strong -1 on simply importing setuptools functionality into distutils2 wholesale, without discussion/review. Paul.

On Wed, Oct 20, 2010 at 6:35 PM, Floris Bruynooghe <flub@devork.be> wrote:
pip uninstall would unlink it (pip install -e calls setup.py develop as well). setup.py develop is persistent unlike PYTHONPATH.
Hence virtualenv, which solves your other concerns.
Note if you use pip, it uses setuptools in a way where only setup.py develop uses .pth files, and otherwise the path is similar to how it is with distutils alone (except with that extra metadata, as Doug mentions). -- Ian Bicking | http://blog.ianbicking.org

On Wed, Oct 20, 2010 at 8:36 AM, Tarek Ziadé <ziade.tarek@gmail.com> wrote:
So do it the same way as Setuptools -- setup.py egg_info writes the info to the root of the packages (which might be src/ for some libraries) and when that is added to the path, then the directory will be scanned and the metadata found. And setup.py develop calls egg_info. Replace egg with dist and it's all good, right? -- Ian Bicking | http://blog.ianbicking.org

On Wed, Oct 20, 2010 at 6:02 PM, Ian Bicking <ianb@colorstudy.com> wrote:
Not quite, since packages can be located in other (and several) places than directly there. (See my answer to Paul.) So I am trying to write this options_to_paths() code to see how things can work.
-- Tarek Ziadé | http://ziade.org
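One possible shape for such an options_to_paths() helper, as a rough sketch only. The signature and behaviour are assumptions made for illustration (the thread does not show the real code); it takes the packages, py_modules and package_dir options in the form distutils accepts them and returns the directories that would have to be on the path.

```python
import os


def options_to_paths(project_root, packages=(), py_modules=(), package_dir=None):
    """Return the sorted directories to add to sys.path for a project.

    Hypothetical helper. It assumes each mapped directory in package_dir is
    named after its package (e.g. {"bar": "lib/bar"}); a directory with a
    different name could not be exposed by a path entry alone.
    """
    package_dir = package_dir or {}
    base = package_dir.get("", "")  # root directory for unmapped names
    paths = set()
    for name in packages:
        top = name.split(".")[0]  # only top-level packages need a path entry
        if top in package_dir:
            parent = os.path.dirname(package_dir[top])
        else:
            parent = base
        paths.add(os.path.normpath(os.path.join(project_root, parent)))
    for name in py_modules:
        if "." not in name:  # dotted modules live inside a listed package
            paths.add(os.path.normpath(os.path.join(project_root, base)))
    return sorted(paths)


# Hypothetical project: most code under src/, one package kept under lib/.
example = options_to_paths(
    "/proj",
    packages=["foo", "foo.sub", "bar"],
    py_modules=["baz"],
    package_dir={"": "src", "bar": "lib/bar"},
)
```

For a flat project with no package_dir, the result is just the project root, which is the case where plain PYTHONPATH already works.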

On Wed, Oct 20, 2010 at 7:57 PM, Paul Moore <p.f.moore@gmail.com> wrote:
A different idea along these lines that I've been pondering is an actual -p path option for the interpreter command line, that allowed a sequence of directories to be provided that would be prepended to PYTHONPATH (and hence included in sys.path).

So if you're wanting to test two different versions of a module (from a parent directory containing the two versions in separate subdirectories):

python -p versionA run_tests.py
python -p versionB run_tests.py

For more permanent additions to sys.path, PYTHONPATH (possibly in conjunction with virtualenv) is a reasonable answer. Zipfile and directory execution covers execution of more complex applications containing multiple files as if they were simple scripts.

The main piece I see missing from the puzzle is the ability to easily switch back and forth between multiple versions of a support package or library without mucking with persistent state like the environment variables or the filesystem.

Cheers,
Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
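The -p option above is only a proposal, but its effect can be approximated today by setting PYTHONPATH for a single child process. A minimal sketch, assuming a hypothetical run_with_path() helper and an invented demo_lib module:

```python
import os
import subprocess
import sys
import tempfile


def run_with_path(extra_paths, args):
    """Run the current interpreter with extra_paths prepended to PYTHONPATH.

    Only the child process sees the change; the parent environment and the
    filesystem are left untouched.
    """
    env = dict(os.environ)
    parts = list(extra_paths)
    if env.get("PYTHONPATH"):
        parts.append(env["PYTHONPATH"])
    env["PYTHONPATH"] = os.pathsep.join(parts)
    return subprocess.run(
        [sys.executable] + list(args),
        env=env,
        capture_output=True,
        text=True,
    )


# Two throwaway "versions" of the same hypothetical module.
root = tempfile.mkdtemp()
for version in ("A", "B"):
    version_dir = os.path.join(root, "version" + version)
    os.makedirs(version_dir)
    with open(os.path.join(version_dir, "demo_lib.py"), "w") as f:
        f.write("VERSION = %r\n" % version)

snippet = "import demo_lib; print(demo_lib.VERSION)"
result_a = run_with_path([os.path.join(root, "versionA")], ["-c", snippet])
result_b = run_with_path([os.path.join(root, "versionB")], ["-c", snippet])
```

Each call picks up a different copy of demo_lib without any persistent state, which is the switching behaviour the message asks for.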

participants (8)
- Antoine Pitrou
- Doug Hellmann
- Floris Bruynooghe
- Ian Bicking
- Nick Coghlan
- Paul Moore
- Ron Adam
- Tarek Ziadé