Re: [Distutils] PEP 376 - site-directories and site.addsitedir
On Thu, May 14, 2009 at 12:33 PM, Noah Gift <noah.gift@gmail.com> wrote:
One thing that isn't clear to me is whether this could then lead to a potentially new third party packaging system, or library, that will just do the same thing again, and create a linear scanning algorithm. Is there a way to architect things in a way that will never increase the number of stats to the filesystem or limit them?
I'd like to distinguish between what is "importable" and what is "installed" in Python. For me, what's "installed" is what is in a site-packages directory that contains egg-info directories.

At startup, you have to visit every site-packages directory at least once; that's linear and it won't go away. But that's OK in the end: a cache or an in-memory index can make the lookup code non-linear after Python has started, *as long as sys.path is not involved*.

So restricting this API to site-packages (i.e. "installed") rather than sys.path (i.e. "importable") is just a way of limiting the scope to the specific places where you are supposed to install packages (and where egg-info files are located).

When a program is launched, the site-packages places should be limited to:

- the Python site-packages (whether it's central or local, using virtualenv)
- the per-user site-packages (PEP 370)
- maybe a third place added manually using site.addsitedir()
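The "scan each site-packages directory once, then answer lookups from memory" idea above could be sketched roughly as follows. This is only an illustration on a modern Python, not the PEP 376 API: `candidate_site_dirs` and `build_index` are hypothetical names, and the `Name-Version.egg-info` naming convention is assumed.

```python
import os
import site


def candidate_site_dirs():
    """Return existing site-packages style directories, without walking sys.path."""
    dirs = []
    # site.getsitepackages() can be missing in some older virtualenv setups.
    getter = getattr(site, "getsitepackages", None)
    if getter is not None:
        dirs.extend(getter())
    user_site = site.getusersitepackages()
    if user_site:
        dirs.append(user_site)
    seen = set()
    result = []
    for d in dirs:
        if d not in seen and os.path.isdir(d):
            seen.add(d)
            result.append(d)
    return result


def build_index(site_dirs):
    """Scan each directory once and map a project name to its metadata path."""
    index = {}
    for site_dir in site_dirs:
        for entry in os.listdir(site_dir):
            if entry.endswith(".egg-info"):
                # "Foo-1.0.egg-info" -> "foo"; the naming convention is assumed.
                name = entry[: -len(".egg-info")].split("-")[0]
                index.setdefault(name.lower(), os.path.join(site_dir, entry))
    return index
```

Building the index is still linear in the number of directory entries, but later lookups such as `build_index(candidate_site_dirs()).get("foo")` are plain dictionary accesses and never touch sys.path.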
One other thing that hopefully isn't off topic is the idea of scanning for imports in general. Does it at all seem logical to have an option for python to have zero scanning to load a module? After all if you really know the path to a module, why do you need to scan anything first to find it?
Well, if it's not loaded already, you have to find it, don't you? And you don't know where it is; you have to locate it (see importlib and PEP 302). But for me that's a different story from locating the metadata of an installed package.

Tarek
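For readers on a current Python, the "you have to locate it" step can be observed without importing anything. A minimal sketch, assuming `importlib.util.find_spec` is available (it postdates this thread); `locate` is a hypothetical helper name:

```python
import importlib.util


def locate(module_name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin
```

Calling `locate("json")` asks the import system's finders where the module lives, which is a separate question from where an installed project keeps its metadata.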
2009/5/14 Tarek Ziadé <ziade.tarek@gmail.com>:
I'd like to make a difference between what is "importable" and what is "installed" in Python.
what's "installed" for me is what is in a site-packages directory that contains egg-info directories.
I'm not sure how well-defined that is. Can you precisely define what a "site-packages directory" is? Specifically, what about the Windows registry items under HKLM\Software\Python\PythonCore\x.y\PythonPath? There may be other "special" locations, I can't recall for certain. Paul.
On Thu, 14 May 2009 14:38:07 +0100, Paul Moore <p.f.moore@gmail.com> wrote:
2009/5/14 Tarek Ziadé <ziade.tarek@gmail.com>:
I'm not sure how well-defined that is. Can you precisely define what a "site-packages directory" is?
Site-packages is referenced from the "original" site.py code. From my reading, it is the place where all packages common to the Python installation are stored. It is the site repository for all extra packages. It's specifically hardcoded with an os.path.join(python-path,"lib","site-packages") command.
Specifically, what about the Windows registry items under HKLM\Software\Python\PythonCore\x.y\PythonPath? There may be other "special" locations, I can't recall for certain.
There's a user-packages location under Python 2.6, but its use hasn't been picked up. If it was under Linux then the equivalent would possibly be ~/.user-packages.

David
2009/5/14 David Lyon <david.lyon@preisshare.net>:
On Thu, 14 May 2009 14:38:07 +0100, Paul Moore <p.f.moore@gmail.com> wrote:
2009/5/14 Tarek Ziadé <ziade.tarek@gmail.com>:
I'm not sure how well-defined that is. Can you precisely define what a "site-packages directory" is?
Site-packages is referenced from the "original" site.py code.
It is a place from my reading where all packages common for the python installation are stored. It is the site repository for all extra packages.
It's specifically hardcoded with an os.path.join(python-path,"lib","site-packages") command.
Specifically, what about the Windows registry items under HKLM\Software\Python\PythonCore\x.y\PythonPath? There may be other "special" locations, I can't recall for certain.
There's a user-packages location under Python 2.6, but its use hasn't been picked up. If it was under Linux then the equivalent would possibly be ~/.user-packages.
No, the registry stuff is completely different - it has been in Windows Python for a long time (probably back as far as 1.5 or earlier). It's not often used, but it's in addition to PYTHONPATH. I think older versions of pywin32 used it but newer versions don't seem to. I suspect it's not used much - but it's bound to be used in some installations... The main point here is to emphasize that there are some fairly obscure ways for directories to end up in sys.path, even if we exclude user code. The proposal needs to make a statement on how all such cases are handled - I don't have any particular opinion on *what* should happen, just that it gets documented. Paul.
On Thu, 14 May 2009 16:01:55 +0100, Paul Moore <p.f.moore@gmail.com> wrote:
The main point here is to emphasize that there are some fairly obscure ways for directories to end up in sys.path, even if we exclude user code. The proposal needs to make a statement on how all such cases are handled - I don't have any particular opinion on *what* should happen, just that it gets documented.
My take on it is that sys.path gets messed with too easily. Any item added to sys.path causes a full directory search to happen, which is fairly time-consuming. The best solution for performance is one where sys.path is limited to a handful of entries (say ./, user-packages, site-packages) and any packages that are stored are referenced in .pth files along those three paths. Of course, I'm being way too simplistic here... and in the real world many systems aren't like that... yet... lol, but who knows what the future can bring...

David
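As a rough illustration of how a .pth file feeds sys.path, the sketch below builds a throwaway site directory and registers it with site.addsitedir(). The directory and file names (`extra_pkgs`, `demo.pth`) and the helper `demo_addsitedir` are made up for the example.

```python
import os
import site
import sys
import tempfile


def demo_addsitedir():
    """Create a throwaway site directory with one .pth file and register it."""
    site_dir = tempfile.mkdtemp()
    extra = os.path.join(site_dir, "extra_pkgs")  # hypothetical package location
    os.mkdir(extra)
    # Each non-comment line of a .pth file names a directory to add to sys.path.
    with open(os.path.join(site_dir, "demo.pth"), "w") as fh:
        fh.write(extra + "\n")
    # addsitedir() adds site_dir itself and then processes its .pth files.
    site.addsitedir(site_dir)
    return site_dir, extra
```

After calling `demo_addsitedir()`, both returned paths should appear in sys.path, which shows that every .pth entry still ends up as one more sys.path item to search.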
On Thu, May 14, 2009 at 02:38:07PM +0100, Paul Moore wrote:
2009/5/14 Tarek Ziadé <ziade.tarek@gmail.com>:
I'd like to make a difference between what is "importable" and what is "installed" in Python.
what's "installed" for me is what is in a site-packages directory that contains egg-info directories.
I'm not sure how well-defined that is. Can you precisely define what a "site-packages directory" is? Specifically, what about the Windows registry items under HKLM\Software\Python\PythonCore\x.y\PythonPath? There may be other "special" locations, I can't recall for certain.
I've treated those as an equivalent of the PYTHONPATH environment variable (but one that doesn't require rebooting if you add to it during a system-wide installation). So they are not site-packages directories.

Regards,
Floris
On Thu, May 14, 2009 at 3:38 PM, Paul Moore <p.f.moore@gmail.com> wrote:
2009/5/14 Tarek Ziadé <ziade.tarek@gmail.com>:
I'd like to make a difference between what is "importable" and what is "installed" in Python.
what's "installed" for me is what is in a site-packages directory that contains egg-info directories.
I'm not sure how well-defined that is. Can you precisely define what a "site-packages directory" is? Specifically, what about the Windows registry items under HKLM\Software\Python\PythonCore\x.y\PythonPath? There may be other "special" locations, I can't recall for certain.
I need to check that, I wasn't aware of it
Paul.
-- Tarek Ziadé | http://ziade.org
Hi all,

I've just had a moment to read PEP 376. I seriously question the need for a new .EGG_INFO directory, given that a typical site-packages directory doesn't have many files in it anyway. Typically, it will contain 20+ project/package subdirectories or .EGG files/directories.

Why not keep the .EGG_INFO files in the site-packages directory? It seems so much simpler to use site-packages, and it doesn't add any extra OS overhead (yes, more directories do slow a system down, slowly but surely).
From a pure python perspective.. another directory isn't needed.
I think we could do a lot more to manage what we already have... before making new holes for mess to go in...

David
At 08:34 PM 5/18/2009 -0400, David Lyon wrote:
Why not keep the .EGG_INFO files in the site-packages directory?
That's where they go. Each installed project has its own .egg-info subdirectory containing the listed files. See the EggFormats documentation for details. PEP 376 is just adding stdlib support for the .egg-info format defined by setuptools, and adding a new RECORD file to it.
On Mon, 18 May 2009 20:59:22 -0400, "P.J. Eby" <pje@telecommunity.com> wrote:
At 08:34 PM 5/18/2009 -0400, David Lyon wrote:
Why not keep the .EGG_INFO files in the site-packages directory?
That's where they go. Each installed project has its own .egg-info subdirectory containing the listed files. See the EggFormats documentation for details. PEP 376 is just adding stdlib support for the .egg-info format defined by setuptools, and adding a new RECORD file to it.
Let me quote....

.egg-info becomes a directory
=============================

The first change would be to make `.egg-info` a directory and let it hold the `PKG-INFO` file built by the `write_pkg_file` method of the `Distribution` class in Distutils. This change will not impact Python itself, because `egg-info` files are not used anywhere yet in the standard library besides Distutils.
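To make the layout in the quoted passage concrete: a `PKG-INFO` file uses RFC 822 style headers, so it can be read with the standard email parser. The sketch below is illustrative only; `read_pkg_info` and `make_sample_egg_info` are hypothetical helpers, and the "Demo" project is made up.

```python
import os
import tempfile
from email.parser import Parser


def read_pkg_info(egg_info_dir):
    """Parse the PKG-INFO file inside an .egg-info directory."""
    with open(os.path.join(egg_info_dir, "PKG-INFO"), encoding="utf-8") as fh:
        return Parser().parse(fh)


def make_sample_egg_info(parent):
    """Write a tiny, made-up .egg-info directory for demonstration."""
    path = os.path.join(parent, "Demo-0.1.egg-info")
    os.mkdir(path)
    with open(os.path.join(path, "PKG-INFO"), "w", encoding="utf-8") as fh:
        fh.write("Metadata-Version: 1.0\nName: Demo\nVersion: 0.1\n")
    return path
```

With a sample created by `make_sample_egg_info(tempfile.mkdtemp())`, the parsed message supports lookups such as `msg["Name"]` and `msg["Version"]`.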
On Tue, May 19, 2009 at 12:59 PM, P.J. Eby <pje@telecommunity.com> wrote:
At 08:34 PM 5/18/2009 -0400, David Lyon wrote:
Why not keep the .EGG_INFO files in the site-packages directory?
That's where they go. Each installed project has its own .egg-info subdirectory containing the listed files. See the EggFormats documentation for details. PEP 376 is just adding stdlib support for the .egg-info format defined by setuptools, and adding a new RECORD file to it.
But if this implementation is the same as eggs, then each egg directory is then scanned and imported into sys.path, unlike normal packages which are called via simple namespaces.
-- Cheers, Noah
David Lyon wrote:
I seriously question the need for a new .EGG_INFO directory...
Given that a typical site-packages directory doesn't have many files in it anyway.
Typically, it will contain 20+ project/package subdirectories or .EGG files/directories.
;-) Mine has 213 files/directories, and I use virtualenv and buildout to keep the number down. I also use Gentoo Linux which has distro-packaged lots of Python-based apps, leading to more entries in site-packages than might otherwise be expected in a virtualenv/buildout environment. -Jeff
On Mon, 18 May 2009 21:22:38 -0500, Jeff Rush <jeff@taupro.com> wrote:
;-) Mine has 213 files/directories, and I use virtualenv and buildout to keep the number down.
OK, that sounds like it is "in the normal range". How many subdirectories there are in site-packages is not so important if they are packages. How many regular files? 213 directories and 213 .EGG_INFO files doesn't sound like too much to me. Somebody should actually run some performance tests to examine the performance impact of any of this...

David
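A performance test of the kind suggested above could start from something like the sketch below. It only times a flat directory listing on a synthetic tree, so it says nothing about import cost. The helper names and the `pkgN-1.0.egg-info` layout are assumptions for the experiment.

```python
import os
import tempfile
import time


def make_fake_site_packages(n):
    """Create a temp dir holding n package dirs and n .egg-info dirs."""
    root = tempfile.mkdtemp()
    for i in range(n):
        os.mkdir(os.path.join(root, "pkg%d" % i))
        os.mkdir(os.path.join(root, "pkg%d-1.0.egg-info" % i))
    return root


def time_scan(root, repeat=100):
    """Return (average seconds per scan, number of .egg-info entries found)."""
    found = []
    start = time.perf_counter()
    for _ in range(repeat):
        found = [e for e in os.listdir(root) if e.endswith(".egg-info")]
    elapsed = time.perf_counter() - start
    return elapsed / repeat, len(found)
```

Running `time_scan(make_fake_site_packages(213))` gives an average per-scan time for a directory of the size Jeff mentions. Results will depend heavily on the filesystem and on caching.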
On Thu, 14 May 2009 14:38:33 +0200, Tarek Ziadé <ziade.tarek@gmail.com> wrote:
When a program is launched, the site-packages places should be limited to:
- the Python site-packages (whether it's central or local, using virtualenv)
- the per-user site-packages (PEP 370)
- maybe a third place added manually using site.addsitedir()
Yes... To make it even simpler again....

1) project-packages (the current directory)
2) user-packages
3) site-packages

This would make things simpler and a lot faster. BTW, this is how I am building package installation into my package manager GUI.

Regards,
David
participants (7): David Lyon, Floris Bruynooghe, Jeff Rush, Noah Gift, P.J. Eby, Paul Moore, Tarek Ziadé