Some clarifications and/or corrections to PEP 376
I've been looking at the details of PEP 376 (Database of Installed Python Distributions) and there are a couple of aspects that I don't think work properly alongside PEP 370 (Per-user site-packages directory).
1. The dist-info directory for a distribution is stated as being "located in the site-packages directory". It's not clear how this is intended to work in a PEP 370 world with multiple site-packages.
What I propose is that the description be changed to be worded in terms of sysconfig-style locations: the dist-info directory is located in whichever of purelib or platlib is used by the distribution. When the distribution uses both, purelib is preferred; when it uses neither (!), purelib is used.
In nearly all cases, this is the same as currently. The exceptions are posix_home (where the directory name isn't "site-packages" but its function is the same), posix_prefix (where purelib and platlib differ, and PEP 376 is currently ambiguous as to which is implied), and any custom schemes that might be created (where PEP 376 is silent, and this proposal has the benefit of at least being specific). I do not believe this changes any actual practice - as far as I have been able to determine, any code using dist-info at the moment follows this proposal in the corner cases where it differs from PEP 376.
2. File paths in RECORD are stated as being relative to the "base location" as long as the files sit under that location, or under the "install prefix" where that is above the base location. That's a messy but precise way of saying you can go up a bit as long as you stay within sys.prefix. (Other paths are platform-format absolute paths, which is an ugly discrepancy but the only really viable option.)
Again, this doesn't work as I'd expect it should in a PEP 370 user site-packages. In this case, the answer isn't quite as clear.
I would suggest, looking at the schemes defined in sysconfig, allowing any files located in one of the defined sysconfig paths (for the current scheme) to be recorded as relative paths. The paths should be relative to whichever of purelib and platlib was chosen in (1) above. The exception is that where pureXXX and platXXX do not use the same base (posix_prefix and possibly custom schemes), a relative path from a pureXXX to a platXXX or vice versa should not be allowed.
That equates to something slightly stricter than the current scheme (although not in a way that's likely to be used in practice), but using the appropriate one of {base} or {userbase} rather than sys.prefix. As noted, the two exceptions are custom schemes (where PEP 376 is silent and would probably end up effectively mandating platform absolute paths throughout) and posix_prefix, where platlib/platinclude/platstdlib are located under {platbase} rather than {base} and the current proposal would probably allow relative paths to be used a little more often than PEP 376 allows.
Current practice that I've seen here is not as clear cut, but mostly in the sense that code doesn't consider the corner cases at all, and probably only follows PEP 376 by luck, if at all...
Comments would be appreciated. I'm pretty happy that my proposal for (1) is an improvement. With (2) I'm concerned about the posix_prefix case, particularly as I'm a Windows developer and my understanding of Python's layout on POSIX is limited. But I honestly don't think it's possible to do much better than my proposal, and I do think that PEP 376 as it stands now is distinctly worse because it doesn't consider PEP 370 or the posix_prefix case.
Also, while these proposals are not 100% backward compatible, I don't believe they change behaviour in the most common scenarios, and given that PEP 376 currently has limited adoption (mainly in distutils2, I believe) I'd suggest that a minor compatibility break now is better than keeping the current system until it's more widely adopted and there are more users to impact.
Obviously I'd be looking for views from Tarek and Éric in particular here, as it's their code in distutils2 that would be directly impacted.
Paul.
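As a rough illustration of the rule proposed in (1), the choice of directory can be sketched with the standard sysconfig module. Only sysconfig.get_path is a real API here; the helper name dist_info_parent and its two boolean arguments are hypothetical and are not taken from PEP 376 or from any existing tool.

```python
import sysconfig


def dist_info_parent(uses_purelib, uses_platlib, scheme=None):
    """Return the directory that would hold the .dist-info directory.

    Hypothetical helper that only illustrates the proposed rule.
    """
    # platlib is chosen only when the distribution uses platlib and does
    # not use purelib; "both" and "neither" fall back to purelib.
    key = "platlib" if (uses_platlib and not uses_purelib) else "purelib"
    if scheme is None:
        # No scheme given: use the interpreter's default install scheme.
        return sysconfig.get_path(key)
    return sysconfig.get_path(key, scheme)


# A distribution with pure-Python modules and C extensions (uses both).
print(dist_info_parent(uses_purelib=True, uses_platlib=True))
```

Passing an explicit scheme name such as "posix_home" would show where the result differs from a directory literally called "site-packages".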
I prefer that paths in record are always relative to the parent directory of record (usually site-packages) unless a relative path would not work (drive letter boundaries). Consider installing and then chroot.
On 6 September 2012 22:18, Daniel Holth <dholth@gmail.com> wrote:
I prefer that paths in record are always relative to the parent directory of record (usually site-packages) unless a relative path would not work (drive letter boundaries). Consider installing and then chroot.
That's a much simpler rule and I agree in principle that it's preferable. The problems are:
1. It's much further away from what PEP 376 specifies. I will accept this happily if there is general agreement that it's OK, but I preferred to start with a more conservative suggestion :-)
2. Do you really want long strings of ../../.. if a distribution specifies a file to be installed in an absolute location (possible, although probably not well supported by current tools)? Consider a package that installs something to /var/python (I'm not a Unix user, so this may be an unconvincing example, but I understand that similar things *are* possible). If Python is in /usr, you'd have RECORD with something like ../../../../var/python. I don't have enough Unix experience to know if anyone would care about this. Of course what I know about chroot implies this would break in that scenario anyway...
As I say, if the Unix people are OK with it, I'm happy to go this way.
Of course, I'd be happy to mandate that files in a distribution should never be installed anywhere that isn't defined as one of the sysconfig defined paths - but suggesting that would be sure to start a much bigger debate that I don't want to get into as I don't have the expertise to referee it.
Paul.
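To make the "always relative unless a relative path would not work" rule concrete, here is a minimal sketch. It uses posixpath and ntpath directly so that it behaves the same on any platform; the helper name record_entry and the sample paths are assumptions made up for illustration.

```python
import ntpath
import posixpath


def record_entry(target, base, pathmod=posixpath):
    """Path to store in RECORD for `target`, relative to `base` if possible.

    `base` stands for the directory that contains the .dist-info
    directory. Hypothetical helper, for illustration only.
    """
    try:
        rel = pathmod.relpath(target, base)
    except ValueError:
        # ntpath.relpath raises ValueError when the two paths are on
        # different drives, so fall back to the absolute path.
        return pathmod.normpath(target)
    # Write relative entries with '/' whatever the platform separator is.
    return rel.replace(pathmod.sep, "/")


# The /var/python case: four levels up from site-packages.
print(record_entry("/var/python/data.txt",
                   "/usr/lib/python2.7/site-packages"))
# A Windows file on another drive cannot be expressed relatively.
print(record_entry(r"D:\data\tool.ps1",
                   r"C:\Python27\Lib\site-packages", ntpath))
```

The first call shows the long run of ../ components mentioned above; the second shows the drive-letter case where only an absolute path is possible.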
On Sep 6, 2012 5:33 PM, "Paul Moore" <p.f.moore@gmail.com> wrote:
On 6 September 2012 22:18, Daniel Holth <dholth@gmail.com> wrote:
I prefer that paths in record are always relative to the parent directory of record (usually site-packages) unless a relative path would not work (drive letter boundaries). Consider installing and then chroot.
That's a much simpler rule and I agree in principle that it's preferable. The problems are:
1. It's much further away from what PEP 376 specifies. I will accept this happily if there is general agreement that it's OK, but I preferred to start with a more conservative suggestion :-) 2. Do you really want long strings of ../../.. if a distribution specifies a file to be installed in an absolute location (possible, although probably not well supported by current tools). Consider a package that installs something to /var/python (I'm not a Unix user, so this may be an unconvincing example, but I understand that similar things *are* possible). If Python is in /usr, you'd have RECORD with something like ../../../../var/python. I don't have enough Unix experience to know if anyone would care about this. Of course what I know about chroot implies this would break in that scenario anyway...
FYI, in pip all the paths in installed-files.txt are relative to installed-files.txt itself, which means even more dots.
As I say, if the Unix people are OK with it, I'm happy to go this way.
Of course, I'd be happy to mandate that files in a distribution should never be installed anywhere that isn't defined as one of the sysconfig defined paths - but suggesting that would be sure to start a much bigger debate that I don't want to get into as I don't have the expertise to referee it.
Paul.
Hi, I was just trying to implement PEP 376 and saw the same problems.
At Thu, 6 Sep 2012 22:33:53 +0100, Paul Moore wrote:
On 6 September 2012 22:18, Daniel Holth <dholth@gmail.com> wrote:
I prefer that paths in record are always relative to the parent directory of record (usually site-packages) unless a relative path would not work (drive letter boundaries). Consider installing and then chroot.
That's a much simpler rule and I agree in principle that it's preferable. The problems are:
1. It's much further away from what PEP 376 specifies. I will accept this happily if there is general agreement that it's OK, but I preferred to start with a more conservative suggestion :-)
What PEP 376 currently specifies doesn't really work (or I don't understand it correctly). It specifies that the .dist-info directory should be installed in site-packages, but the file paths in RECORD are relative to the "base location". The "base location" is defined by the --install-lib option, which only defaults to the site-packages directory. So if I change the "base location" to something else using --install-lib, the file paths in RECORD will be relative to that directory (if it's also under prefix), but there is no way to figure out what that directory is from the information in the .dist-info directory.
The example RECORD file in PEP 376 also seems wrong to me, because the paths look relative to sys.prefix instead of relative to the site-packages directory (and the docutils-0.5.dist-info is also missing a lib/). The example of dist.get_installed_files that is supposed to return the contents of RECORD also shows different paths than the first example, but those are also not relative to the site-packages directory.
I agree that having the paths relative to the parent directory of the .dist-info directory is preferable. It's easy to implement and I don't really see any downsides at the moment.
2. Do you really want long strings of ../../.. if a distribution specifies a file to be installed in an absolute location (possible, although probably not well supported by current tools). Consider a package that installs something to /var/python (I'm not a Unix user, so this may be an unconvincing example, but I understand that similar things *are* possible). If Python is in /usr, you'd have RECORD with something like ../../../../var/python. I don't have enough Unix experience to know if anyone would care about this. Of course what I know about chroot implies this would break in that scenario anyway...
As I say, if the Unix people are OK with it, I'm happy to go this way.
I think it is very rare that a distribution would need to install to an absolute path. What might be more common is having to refer to the /usr/lib and /usr/bin directories from /usr/share, but I don't see any problems with using relative paths for that.
Of course, I'd be happy to mandate that files in a distribution should never be installed anywhere that isn't defined as one of the sysconfig defined paths - but suggesting that would be sure to start a much bigger debate that I don't want to get into as I don't have the expertise to referee it.
I think it's a pretty good suggestion to have packages only install files in a set of predefined locations, but we would have to make sure that all cases are covered by the sysconfig defined paths. Support for things like manpages seems to be missing, for example.
Kind regards,
Jeroen Dekkers
Jeroen Dekkers <jeroen <at> dekkers.ch> writes:
I agree that having the paths relative to the parent directory of the .dist-info directory is preferable. It's easy to implement and I don't really see any downsides at the moment.
Perhaps not on POSIX, but on Windows things don't fit nicely into an FHS-like scheme. For example, if you need to install PowerShell scripts, they can't be shoe-horned into somewhere under site-packages, as PowerShell looks in specific (other) places.
I agree that PEP 376 isn't ideal in what it specifies in this area. The simplest solution would surely be absolute paths only. What are the downsides apart from the disk space used for the extra lengths of the filenames?
Regards,
Vinay Sajip
At Mon, 31 Dec 2012 18:59:19 +0000 (UTC), Vinay Sajip wrote:
Jeroen Dekkers <jeroen <at> dekkers.ch> writes:
I agree that having the paths relative to the parent directory of the .dist-info directory is preferable. It's easy to implement and I don't really see any downsides at the moment.
Perhaps not on POSIX, but on Windows things don't fit nicely in FHS-like schema. For example, if you need to install PowerShell scripts, they will not be able to be shoe-horned into somewhere under site-packages, as PowerShell looks in specific (other) places.
That is also the case for POSIX, where the scripts should be installed to /usr/bin and the site-packages directory is something like /usr/lib/python2.7/site-packages. But I forgot that you can't use relative paths if things are on a different drive on Windows, so I agree that using only relative paths isn't a good idea.
I agree that PEP 376 isn't ideal in what it specifies in this area. The simplest solution would surely be absolute paths only. What are the downsides apart from the disk space used for the extra lengths of the filenames?
I can see the merits of using relative paths in some use cases, but I also see the problems of using them in other cases. Maybe we should just allow both.
We can specify that paths in RECORD can be either relative to the parent directory of the .dist-info directory or absolute, and that both must be supported by installation tools. Whether relative or absolute paths are used is decided by the tool that creates/modifies the RECORD file.
That seems easy enough to implement, while flexible enough to support the different use cases.
Kind regards,
Jeroen Dekkers
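A reader that accepts both forms could look roughly like the sketch below. It assumes RECORD is CSV with the path in the first column and that relative entries use '/'; the function name resolve_record_paths and all sample entries are made up for illustration and do not come from distutils2 or any other tool.

```python
import csv
import io
import ntpath
import posixpath


def resolve_record_paths(record_text, dist_info_dir, pathmod=posixpath):
    """Yield an absolute path for each row of a RECORD file.

    Relative entries are resolved against the parent of the .dist-info
    directory; absolute entries are kept as they are. Hypothetical helper.
    """
    base = pathmod.dirname(dist_info_dir)
    for row in csv.reader(io.StringIO(record_text)):
        if not row:
            continue  # skip blank lines
        path = row[0]
        if not pathmod.isabs(path):
            # Relative entries are written with '/', so split on it.
            path = pathmod.join(base, *path.split("/"))
        yield pathmod.normpath(path)


record = (
    "docutils/__init__.py,,\n"
    "../../../bin/rst2html.py,,\n"
    "/etc/docutils.conf,,\n"
)
dist_info = "/usr/lib/python2.7/site-packages/docutils-0.5.dist-info"
for path in resolve_record_paths(record, dist_info):
    print(path)

# The same reader with Windows path rules.
win_dist_info = r"C:\Python27\Lib\site-packages\docutils-0.5.dist-info"
print(list(resolve_record_paths("docutils/__init__.py,,\n",
                                win_dist_info, ntpath)))
```

Because the base is derived from the location of the .dist-info directory itself, nothing outside that directory is needed to interpret the entries.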
Jeroen Dekkers <jeroen <at> dekkers.ch> writes:
We can specify that paths in RECORDS can be relative to the parent directory of the .dist-info directory or absolute and both must be supported by installation tools. Whether relative or absolute paths are used is decided by the tool that creates/modifies the RECORDS file.
That seems easy enough to implement, while flexible enough to support the different use cases.
Yes, and I've now implemented this in distlib. Tests are still to be added to cover all cases, but I am now able to upgrade / uninstall pip-installed dists, which seems promising.
Regards,
Vinay Sajip
On 7 January 2013 14:40, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
Jeroen Dekkers <jeroen <at> dekkers.ch> writes:
We can specify that paths in RECORDS can be relative to the parent directory of the .dist-info directory or absolute and both must be supported by installation tools. Whether relative or absolute paths are used is decided by the tool that creates/modifies the RECORDS file.
That seems easy enough to implement, while flexible enough to support the different use cases.
Yes, and I've now implemented this in distlib. Tests are still to be added to cover all cases, but I am now able to upgrade / uninstall pip-installed dists, which seems promising.
OK, that sounds like a good approach then. It might be worth noting that relative paths are to be preferred wherever sensible (but that's a recommendation only, not a requirement).
But as I noted at the start of this thread, it is *not* what PEP 376 states at present. Do we need an update to the PEP if that is to become the "de facto" standard?
Also, the first question I posed in my initial posting in this thread appears to remain unanswered:
"""
1. The dist-info directory for a distribution is stated as being "located in the site-packages directory". It's not clear how this is intended to work in a PEP 370 world with multiple site-packages. What I propose is that the description be changed to be worded in terms of sysconfig-style locations: the dist-info directory is located in whichever of purelib or platlib is used by the distribution. When the distribution uses both, purelib is preferred, when it uses neither (!) purelib is used. In nearly all cases, this is the same as currently. The exceptions are posix_home (where the directory name isn't "site-packages" but its function is the same), posix_prefix (where purelib and platlib differ, and PEP 376 is currently ambiguous as to which is implied), and any custom schemes that might be created (where PEP 376 is silent, and this proposal has the benefit of at least being specific). I do not believe this changes any actual practice - as far as I have been able to determine any code using dist-info at the moment follows this proposal in the corner cases where it differs from PEP 376.
"""
Paul.
From: Paul Moore <p.f.moore@gmail.com>
OK, that sounds like a good approach then. It might be worth noting that relative paths are to be preferred wherever sensible (but that's a recommendation only, not a requirement). But as I noted at the start of this thread, it is *not* what PEP 376 states at present. Do we need an update to the PEP if that is to become the "de facto" standard?
PEP 376 states that the path is either absolute, using os.sep, or relative to the "base location", using '/'. The base location is site-packages, or wherever was specified as --install-lib. The example extract from RECORD is also misleading, since the relative paths in there are not relative to site-packages. The location of .dist-info is mentioned as being site-packages, but that didn't exactly jump out, so it could be clearer.
Also, the first question I posed in my initial posting in this thread appears to remain unanswered:
""" 1. The dist-info directory for a distribution is stated as being "located in the site-packages directory".
It's not clear how this is intended to work in a PEP 370 world with multiple site-packages. What I propose is that the description be
Presumably you'd select the one based on the (current user and the) version of the running Python.
changed to be worded in terms of sysconfig-style locations: the dist-info directory is located in whichever of purelib or platlib is used by the distribution. When the distribution uses both, purelib is preferred, when it uses neither (!) purelib is used. In nearly all
What kind of distributions would use neither purelib nor platlib?
cases, this is the same as currently. The exceptions are posix_home (where the directory name isn't "site-packages" but its function is the same), posix_prefix (where purelib and platlib differ, and PEP 376 is currently ambiguous as to which is implied), and any custom schemes that might be created (where PEP 376 is silent, and this proposal has the benefit of at least being specific). I do not believe this changes any actual practice - as far as I have been able to determine any code using dist-info at the moment follows this proposal in the corner cases where it differs from PEP 376. """
On what basis would you pick anything other than purelib to install your distribution into? Just that your distribution contains C extensions, or other additional criteria?
Regards,
Vinay Sajip
On 7 January 2013 16:37, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
changed to be worded in terms of sysconfig-style locations: the dist-info directory is located in whichever of purelib or platlib is used by the distribution. When the distribution uses both, purelib is preferred, when it uses neither (!) purelib is used. In nearly all
What kind of distributions would use neither purelib nor platlib?
I'm not aware of any, and I think it would be pretty bizarre - that was the point of the "(!)". I only mentioned it for completeness.
cases, this is the same as currently. The exceptions are posix_home (where the directory name isn't "site-packages" but its function is the same), posix_prefix (where purelib and platlib differ, and PEP 376 is currently ambiguous as to which is implied), and any custom schemes that might be created (where PEP 376 is silent, and this proposal has the benefit of at least being specific). I do not believe this changes any actual practice - as far as I have been able to determine any code using dist-info at the moment follows this proposal in the corner cases where it differs from PEP 376. """
On what basis would you pick anything other than purelib to install your distribution into? Just that your distribution contains C extensions, or other additional criteria?
I believe "it uses C extensions" is the criterion, but it's something that is decided internally by distutils, as far as I am aware, and I have never braved the code to determine the exact rules.
As a package consumer, and in particular one working on a platform where purelib=platlib, I only care insofar as I want a well-defined rule that doesn't break regardless of which is used, and can be accepted by people who work in environments where it *does* matter.
Paul.
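For anyone wanting to check which environments have purelib and platlib in different places, the following sketch lists both directories for every install scheme the running interpreter knows about. It relies only on sysconfig.get_scheme_names and sysconfig.get_paths; the function name lib_dirs_by_scheme is hypothetical, and the output depends on the platform and Python build.

```python
import sysconfig


def lib_dirs_by_scheme():
    """Map each install scheme name to its (purelib, platlib) pair."""
    result = {}
    for scheme in sysconfig.get_scheme_names():
        paths = sysconfig.get_paths(scheme)
        result[scheme] = (paths["purelib"], paths["platlib"])
    return result


for scheme, (purelib, platlib) in sorted(lib_dirs_by_scheme().items()):
    # "same" means the choice between purelib and platlib cannot matter
    # for where the .dist-info directory ends up under this scheme.
    status = "same" if purelib == platlib else "different"
    print(f"{scheme:20} {status}")
```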
participants (4): Daniel Holth, Jeroen Dekkers, Paul Moore, Vinay Sajip