I propose the following PEP for inclusion to Python 3.1. Please comment.

Regards,
Martin

Abstract
========

Namespace packages are a mechanism for splitting a single Python package across multiple directories on disk. In current Python versions, an algorithm to compute the package's __path__ must be formulated. With the enhancement proposed here, the import machinery itself will construct the list of directories that make up the package.

Terminology
===========

Within this PEP, the term package refers to Python packages as defined by Python's import statement. The term distribution refers to separately installable sets of Python modules as stored in the Python package index, and installed by distutils or setuptools. The term vendor package refers to groups of files installed by an operating system's packaging mechanism (e.g. Debian or Red Hat packages installed on Linux systems). The term portion refers to a set of files in a single directory (possibly stored in a zip file) that contribute to a namespace package.

Namespace packages today
========================

Python currently provides the pkgutil.extend_path function to denote a package as a namespace package. The recommended way of using it is to put::

    from pkgutil import extend_path
    __path__ = extend_path(__path__, __name__)

in the package's ``__init__.py``. Every distribution needs to provide the same contents in its ``__init__.py``, so that extend_path is invoked independent of which portion of the package gets imported first. As a consequence, the package's ``__init__.py`` cannot practically define any names, since which portion's ``__init__.py`` gets imported first depends on the order of the package fragments on sys.path. As a special feature, extend_path reads files named ``*.pkg`` which allow declaring additional portions.

setuptools provides a similar function, pkg_resources.declare_namespace, that is used in the form::

    import pkg_resources
    pkg_resources.declare_namespace(__name__)

In the portion's __init__.py, no assignment to __path__ is necessary, as declare_namespace modifies the package __path__ through sys.modules. As a special feature, declare_namespace also supports zip files, and registers the package name internally so that future additions to sys.path by setuptools can properly add additional portions to each package.

setuptools allows declaring namespace packages in a distribution's setup.py, so that distribution developers don't need to put the magic __path__ modification into __init__.py themselves.

Rationale
=========

The current imperative approach to namespace packages has led to multiple slightly-incompatible mechanisms for providing namespace packages. For example, pkgutil supports ``*.pkg`` files; setuptools doesn't. Likewise, setuptools supports inspecting zip files, and supports adding portions to its _namespace_packages variable, whereas pkgutil doesn't.

In addition, the current approach causes problems for system vendors. Vendor packages typically must not provide overlapping files, and an attempt to install a vendor package that has a file already on disk will fail or cause unpredictable behavior. As vendors might choose to package distributions such that they will all end up in a single directory for the namespace package, all portions would contribute conflicting __init__.py files.

Specification
=============

Rather than using an imperative mechanism for importing packages, a declarative approach is proposed here, as an extension to the existing ``*.pkg`` mechanism.
The import statement is extended so that it directly considers ``*.pkg`` files during import; a directory is considered a package if it either contains a file named __init__.py, or a file whose name ends with ".pkg".

In addition, the format of the ``*.pkg`` file is extended: a line with the single character ``*`` indicates that the entire sys.path will be searched for portions of the namespace package at the time the namespace package is imported.

Importing a package will immediately compute the package's __path__; the ``*.pkg`` files are not considered anymore after the initial import. If a ``*.pkg`` file contains an asterisk, this asterisk is prepended to the package's __path__ to indicate that the package is a namespace package (and that thus further extensions to sys.path might also want to extend __path__). At most one such asterisk gets prepended to the path.

extend_path will be extended to recognize namespace packages according to this PEP, and avoid adding directories twice to __path__.

No other change to the importing mechanism is made; searching for modules (including __init__.py) will continue to stop at the first module encountered.

Discussion
==========

With the addition of ``*.pkg`` files to the import mechanism, distributions can stop filling the namespace package's __init__.py with magic code. As a consequence, extend_path and declare_namespace become obsolete.

It is recommended that distributions put a file <distribution>.pkg into their namespace packages, with a single asterisk. This allows vendor packages to install multiple portions of a namespace package into a single directory, with no risk of overlapping files.

Namespace packages can start providing non-trivial __init__.py implementations; to do so, it is recommended that a single distribution provides a portion with just the namespace package's __init__.py (and potentially other modules that belong to the namespace package proper).

The mechanism is mostly compatible with the existing namespace mechanisms. extend_path will be adjusted to this specification; any other mechanism might cause portions to get added twice to __path__.

Copyright
=========

This document has been placed in the public domain.
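P.S. To make the proposal concrete, here is a hypothetical two-portion layout (all directory, file, and distribution names here are invented for illustration; this example is not part of the PEP's normative text)::

    site-packages-a/
        zope/
            zope.interface.pkg      (contains the single line "*")
            interface/
                __init__.py
    site-packages-b/
        zope/
            zope.component.pkg      (contains the single line "*")
            component/
                __init__.py

On ``import zope``, the asterisk in the first ``.pkg`` file found triggers a scan of the entire sys.path, so ``zope.__path__`` would end up as ``['*', 'site-packages-a/zope', 'site-packages-b/zope']``, and both ``zope.interface`` and ``zope.component`` become importable even though they were installed by separate distributions.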
At 10:32 AM 4/2/2009 -0500, Martin v. Löwis wrote:
I propose the following PEP for inclusion to Python 3.1. Please comment.
An excellent idea. One thing I am not 100% clear on is how to get additions to sys.path to work correctly with this. Currently, when pkg_resources adds a new egg to sys.path, it uses its existing registry of namespace packages in order to locate which packages need __path__ fixups. It seems under this proposal that it would have to scan sys.modules for objects with __path__ attributes that are lists beginning with a '*' instead... which is a bit troubling, because sys.modules doesn't always contain only module objects. Many major frameworks place lazy module objects, module proxies, or wrappers of various sorts in there, so scanning through it arbitrarily is not really a good idea.

Perhaps we could add something like a sys.namespace_packages that would be updated by this mechanism? Then, pkg_resources could check both that and its internal registry to be both backward and forward compatible.

Apart from that, this mechanism sounds great! I only wish there was a way to backport it all the way to 2.3 so I could drop the messy bits from setuptools.
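A sketch of the forward-compatible check pkg_resources could then perform (sys.namespace_packages is the proposed, not-yet-existing registry; the helper and registry names below are invented for illustration):

    import sys

    _own_registry = set()  # stand-in for pkg_resources' private namespace registry

    def packages_needing_fixup():
        # Prefer the proposed interpreter-maintained registry if it exists,
        # and merge in the internal one for backward compatibility.
        names = set(getattr(sys, 'namespace_packages', ()))
        names.update(_own_registry)
        return [sys.modules[n] for n in names if n in sys.modules]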
Martin v. Löwis wrote:
I propose the following PEP for inclusion to Python 3.1. Please comment.
Would this support the following case: I have a package called mortar, which defines useful stuff:

    from mortar import content, ...

I now want to distribute large optional chunks separately, but ideally so that the following will work:

    from mortar.rbd import ...
    from mortar.zodb import ...
    from mortar.wsgi import ...

Does the PEP support this? The only way I can currently think of to do this would result in:

    from mortar import content, ...
    from mortar_rbd import ...
    from mortar_zodb import ...
    from mortar_wsgi import ...

...which looks a bit unsightly to me.

cheers,
Chris
P.J. Eby wrote:
Apart from that, this mechanism sounds great! I only wish there was a way to backport it all the way to 2.3 so I could drop the messy bits from setuptools.
Maybe we could? :-)

Chris
On 2009-04-02 17:32, Martin v. Löwis wrote:
I propose the following PEP for inclusion to Python 3.1.
Thanks for picking this up. I'd like to extend the proposal to Python 2.7 and later.
Please comment.
Regards, Martin
Specification =============
Rather than using an imperative mechanism for importing packages, a declarative approach is proposed here, as an extension to the existing ``*.pkg`` mechanism.
The import statement is extended so that it directly considers ``*.pkg`` files during import; a directory is considered a package if it either contains a file named __init__.py, or a file whose name ends with ".pkg".
That's going to slow down Python package detection a lot - you'd replace an O(1) test with an O(n) scan.

Alternative Approach:
---------------------

Wouldn't it be better to stick with a simpler approach and look for "__pkg__.py" files to detect namespace packages using that O(1) check?

This would also avoid any issues you'd otherwise run into if you want to maintain this scheme in an importer that doesn't have access to a list of files in a package directory, but is well capable of checking the existence of a file.

Mechanism:
----------

If the import mechanism finds a matching namespace package (a directory with a __pkg__.py file), it then goes into namespace package scan mode and scans the complete sys.path for more occurrences of the same namespace package.

The import loads all __pkg__.py files of matching namespace packages having the same package name during the search.

One of the namespace packages, the defining namespace package, will have to include a __init__.py file.

After having scanned all matching namespace packages and loaded the __pkg__.py files in the order of the search, the import mechanism then sets the package's __path__ attribute to include all namespace package directories found on sys.path and finally executes the __init__.py file.

(Please let me know if the above is not clear, I will then try to follow up on it.)

Discussion:
-----------

The above mechanism allows the same kind of flexibility we already have with the existing normal __init__.py mechanism.

* It doesn't add yet another .pth-style sys.path extension (which are difficult to manage in installations).

* It always uses the same naive sys.path search strategy. The strategy is not determined by some file contents.

* The search is only done once - on the first import of the package.

* It's possible to have a defining package dir and add-on package dirs.

* Namespace packages are easy to recognize by testing for a single resource.

* Namespace __pkg__.py modules can provide extra meta-information, logging, etc. to simplify debugging namespace package setups.

* It's possible to freeze such setups, to put them into ZIP files, or to have only parts of them in a ZIP file and the other parts in the file system.

Caveats:

* Changes to sys.path will not result in an automatic rescan for additional namespace packages, if the package was already loaded. However, we could have a function to make such a rescan explicit.

-- 
Marc-Andre Lemburg
eGenix.com
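A minimal sketch of the detection and scan described above, assuming the hypothetical __pkg__.py marker (illustrative only, not a proposed implementation):

    import os
    import sys

    def scan_namespace_portions(name):
        # O(1) check per sys.path entry: does <dir>/<name>/__pkg__.py exist?
        portions = []
        for entry in sys.path:
            candidate = os.path.join(entry, name)
            if os.path.isfile(os.path.join(candidate, "__pkg__.py")):
                portions.append(candidate)
        # The defining portion must supply the package's __init__.py somewhere.
        if portions and not any(os.path.isfile(os.path.join(p, "__init__.py"))
                                for p in portions):
            raise ImportError("no defining portion found for %r" % name)
        return portions  # becomes the package's __path__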
At 10:33 PM 4/2/2009 +0200, M.-A. Lemburg wrote:
That's going to slow down Python package detection a lot - you'd replace an O(1) test with an O(n) scan.
I thought about this too, but it's pretty trivial: the scan only takes effect when a directory name matches the name you're importing, and it will only happen once for that directory (unless there is no package on sys.path with that name, and the program tries to import the package multiple times). In other words, the overhead isn't likely to be much, compared to the time needed to, say, open and marshal even a trivial __init__.py file.
Alternative Approach: ---------------------
Wouldn't it be better to stick with a simpler approach and look for "__pkg__.py" files to detect namespace packages using that O(1) check?
I thought the same thing (or more precisely, a single .pkg file), but when I got lower in the PEP I saw the reason was to support system packages not having overlapping filenames. The PEP could probably be a little clearer about the connection between needing *.pkg and the system-package use case.
One of the namespace packages, the defining namespace package, will have to include a __init__.py file.
Note that there is no such thing as a "defining namespace package" -- namespace package contents are symmetrical peers.
The above mechanism allows the same kind of flexibility we already have with the existing normal __init__.py mechanism.
* It doesn't add yet another .pth-style sys.path extension (which are difficult to manage in installations).
* It always uses the same naive sys.path search strategy. The strategy is not determined by some file contents.
The above are also true for using only a '*' in .pkg files -- in that event there are no sys.path changes. (Frankly, I'm doubtful that anybody is using extend_path and .pkg files to begin with, so I'd be fine with a proposal that instead used something like '.nsp' files that didn't even need to be opened and read -- which would let the directory scan stop at the first .nsp file found.)
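A sketch of that cheaper check, assuming a hypothetical '.nsp' marker extension (illustrative only):

    import os

    def is_namespace_portion(directory):
        # The marker's presence alone is enough; the file is never opened.
        for entry in os.listdir(directory):
            if entry.endswith(".nsp"):
                return True
        return False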
* The search is only done once - on the first import of the package.
I believe the PEP does this as well, IIUC.
* It's possible to have a defining package dir and add-on package dirs.
Also possible in the PEP, although the __init__.py must be in the first such directory on sys.path. (However, such "defining" packages are not that common now, due to tool limitations.)
Martin v. Löwis schrieb:
I propose the following PEP for inclusion to Python 3.1. Please comment.
Regards, Martin
Abstract ========
Namespace packages are a mechanism for splitting a single Python package across multiple directories on disk. In current Python versions, an algorithm to compute the package's __path__ must be formulated. With the enhancement proposed here, the import machinery itself will construct the list of directories that make up the package.
+1

Speaking as a downstream packager of Python for Debian/Ubuntu, I welcome this approach. The current practice of shipping the very same file (__init__.py) in different packages leads to conflicts for the installation of these packages (this is not specific to dpkg, but is true for rpm packaging as well).

Current practice of packaging (for downstreams) so-called "name space packages" is:

- either to split out the namespace __init__.py into a separate (linux distribution) package (needing manual packaging effort for each name space package)

- using downstream-specific packaging techniques to handle conflicting files (diversions)

- replicating the current behaviour of setuptools, simply overwriting the conflicting files.

Following this proposal, (downstream) packaging of namespace packages becomes possible independent of any manual or downstream-specific packaging decisions.

Matthias
At 03:21 AM 4/3/2009 +0200, Matthias Klose wrote:
+1

Speaking as a downstream packager of Python for Debian/Ubuntu, I welcome this approach. The current practice of shipping the very same file (__init__.py) in different packages leads to conflicts for the installation of these packages (this is not specific to dpkg, but is true for rpm packaging as well).

Current practice of packaging (for downstreams) so-called "name space packages" is:

- either to split out the namespace __init__.py into a separate (linux distribution) package (needing manual packaging effort for each name space package)

- using downstream-specific packaging techniques to handle conflicting files (diversions)

- replicating the current behaviour of setuptools, simply overwriting the conflicting files.

Following this proposal, (downstream) packaging of namespace packages becomes possible independent of any manual or downstream-specific packaging decisions.
A clarification: setuptools does not currently install the __init__.py file when installing in --single-version-externally-managed or --root mode. Instead, it uses a project-version-nspkg.pth file that essentially simulates a variation of Martin's .pkg proposal, by abusing .pth file support.

If this PEP is adopted, setuptools would replace its nspkg.pth file with a .pkg file on Python versions that provide native support for .pkg imports, keeping the .pth file only for older Pythons. (.egg files and directories will not be affected by the change, unless the zipimport module also supports .pkg files... and again, only for Python versions that support the new approach.)
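For readers unfamiliar with the trick being replaced: site.py executes any .pth line that begins with "import", which is what the generated nspkg.pth files exploit to stuff a synthetic package module into sys.modules. A heavily simplified, purely illustrative line (the real generated code is longer and also computes and appends the portion's directory to __path__; 'mortar' is just a placeholder name) might look like:

    import sys, types; m = sys.modules.setdefault('mortar', types.ModuleType('mortar')); m.__path__ = getattr(m, '__path__', [])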
Perhaps we could add something like a sys.namespace_packages that would be updated by this mechanism? Then, pkg_resources could check both that and its internal registry to be both backward and forward compatible.
I could see no problem with that, so I have added this to the PEP. Thanks for the feedback, Martin
Chris Withers wrote:
Martin v. Löwis wrote:
I propose the following PEP for inclusion to Python 3.1. Please comment.
Would this support the following case:
I have a package called mortar, which defines useful stuff:
from mortar import content, ...
I now want to distribute large optional chunks separately, but ideally so that the following will work:
from mortar.rbd import ...
from mortar.zodb import ...
from mortar.wsgi import ...
Does the PEP support this?
That's the primary purpose of the PEP. You can do this today already (see the zope package, and the reference to current techniques in the PEP), but the PEP provides a cleaner way.

In each chunk (which the PEP calls a portion), you would have a structure like this:

    mortar/
    mortar/rbd.pkg      (contains just "*")
    mortar/rbd.py

or

    mortar/
    mortar/zodb.pkg
    mortar/zodb/
    mortar/zodb/__init__.py
    mortar/zodb/backends.py

As a side effect, you can also do "import mortar", but that would just give you the (nearly) empty namespace package, whose only significant content is the variable __path__.

Regards,
Martin
I'd like to extend the proposal to Python 2.7 and later.
I don't object, but I also don't want to propose this, so I added it to the discussion. My (and perhaps other people's) concern is that 2.7 might well be the last release of the 2.x series. If so, adding this feature to it would make 2.7 an odd special case for users and providers of third party tools.
That's going to slow down Python package detection a lot - you'd replace an O(1) test with an O(n) scan.
I question that claim. In traditional Unix systems, the file system driver performs a linear search of the directory, so it's rather O(n)-in-kernel vs. O(n)-in-Python. Even for advanced file systems, you need at least O(log n) to determine whether a specific file is in a directory. For all practical purposes, the package directory will fit in a single disk block (containing a single .pkg file, and one or few subpackages), making listdir complete as fast as stat.
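As a rough illustration of that comparison (results will of course vary by OS and filesystem; this is a quick sketch, not a rigorous benchmark):

    import os
    import tempfile
    import timeit

    # Build a tiny package directory like the one described above.
    pkg = tempfile.mkdtemp()
    open(os.path.join(pkg, "__init__.py"), "w").close()
    open(os.path.join(pkg, "example.pkg"), "w").close()

    # O(1)-style existence test vs. a full listing of the small directory.
    stat_time = timeit.timeit(
        lambda: os.path.exists(os.path.join(pkg, "__init__.py")), number=100000)
    scan_time = timeit.timeit(
        lambda: [f for f in os.listdir(pkg) if f.endswith(".pkg")], number=100000)
    print("exists:  %.3fs" % stat_time)
    print("listdir: %.3fs" % scan_time)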
Wouldn't it be better to stick with a simpler approach and look for "__pkg__.py" files to detect namespace packages using that O(1) check?
Again - this wouldn't be O(1). More importantly, it breaks system packages, which now again have to deal with the conflicting file names if they want to install all portions into a single location.
This would also avoid any issues you'd otherwise run into if you want to maintain this scheme in an importer that doesn't have access to a list of files in a package directory, but is well capable of checking the existence of a file.
Do you have a specific mechanism in mind? Regards, Martin
Note that there is no such thing as a "defining namespace package" -- namespace package contents are symmetrical peers.
With the PEP, a "defining package" becomes possible - at most one portion can define an __init__.py. I know that the current mechanisms don't support it, and it might not be useful in general, but now there is a clean way of doing it, so I wouldn't exclude it. Distribution-wise, all distributions relying on the defining package would need to require (or install_requires, or depend on) it.
The above are also true for using only a '*' in .pkg files -- in that event there are no sys.path changes. (Frankly, I'm doubtful that anybody is using extend_path and .pkg files to begin with, so I'd be fine with a proposal that instead used something like '.nsp' files that didn't even need to be opened and read -- which would let the directory scan stop at the first .nsp file found.
That would work for me as well. Nobody at PyCon could remember where .pkg files came from.
I believe the PEP does this as well, IIUC.
Correct.
* It's possible to have a defining package dir and add-on package dirs.
Also possible in the PEP, although the __init__.py must be in the first such directory on sys.path.
I should make it clear that this is not the case. I envision it to work this way:

import zope
- searches sys.path, until finding either a directory zope, or a file zope.{py,pyc,pyd,...}
- if it is a directory, it checks for .pkg files. If it finds any, it processes them, extending __path__.
- it *then* checks for __init__.py, taking the first hit anywhere on __path__ (just like any module import would)
- if no .pkg was found, nor an __init__.py, it proceeds with the next sys.path item (skipping the directory entirely)

Regards,
Martin
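In rough Python pseudocode, that lookup might be sketched as follows (illustrative only - not the actual import machinery - and it simplifies '*' handling by treating any .pkg file as containing just an asterisk):

    import os
    import sys

    def compute_path(name):
        path = []
        namespace = False
        for entry in sys.path:
            candidate = os.path.join(entry, name)
            if not os.path.isdir(candidate):
                continue
            has_pkg = any(f.endswith(".pkg") for f in os.listdir(candidate))
            has_init = os.path.isfile(os.path.join(candidate, "__init__.py"))
            if not (has_pkg or has_init):
                continue   # neither marker: skip this directory entirely
            path.append(candidate)
            namespace = namespace or has_pkg
            if not namespace:
                break      # ordinary package: stop at the first hit
        return path        # __init__.py is then taken from the first hit on this path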
On 08:15 pm, martin@v.loewis.de wrote:
Note that there is no such thing as a "defining namespace package" -- namespace package contents are symmetrical peers.
With the PEP, a "defining package" becomes possible - at most one portion can define an __init__.py.
For what it's worth, this is a _super_ useful feature for Twisted. We have one "defining package" for the "twisted" package (Twisted core) and then a bunch of other things which want to put things into twisted.* (twisted.web, twisted.conch, et al.). For Debian we already have separate packages, but such a definition of namespace packages would allow us to actually have things separated out on the cheeseshop as well.
At 10:15 PM 4/3/2009 +0200, Martin v. Löwis wrote:
I should make it clear that this is not the case. I envision it to work this way:

import zope
- searches sys.path, until finding either a directory zope, or a file zope.{py,pyc,pyd,...}
- if it is a directory, it checks for .pkg files. If it finds any, it processes them, extending __path__.
- it *then* checks for __init__.py, taking the first hit anywhere on __path__ (just like any module import would)
- if no .pkg was found, nor an __init__.py, it proceeds with the next sys.path item (skipping the directory entirely)
Ah, I missed that. Maybe the above should be added to the PEP to clarify.
On Fri, Apr 3, 2009 at 13:15, "Martin v. Löwis" <martin@v.loewis.de> wrote:
Note that there is no such thing as a "defining namespace package" -- namespace package contents are symmetrical peers.
With the PEP, a "defining package" becomes possible - at most one portion can define an __init__.py.
I know that the current mechanisms don't support it, and it might not be useful in general, but now there is a clean way of doing it, so I wouldn't exclude it. Distribution-wise, all distributions relying on the defining package would need to require (or install_requires, or depend on) it.
The above are also true for using only a '*' in .pkg files -- in that event there are no sys.path changes. (Frankly, I'm doubtful that anybody is using extend_path and .pkg files to begin with, so I'd be fine with a proposal that instead used something like '.nsp' files that didn't even need to be opened and read -- which would let the directory scan stop at the first .nsp file found.
That would work for me as well. Nobody at PyCon could remember where .pkg files came from.
I believe the PEP does this as well, IIUC.
Correct.
* It's possible to have a defining package dir and add-on package dirs.
Also possible in the PEP, although the __init__.py must be in the first such directory on sys.path.
I should make it clear that this is not the case. I envision it to work this way:

import zope
- searches sys.path, until finding either a directory zope, or a file zope.{py,pyc,pyd,...}
- if it is a directory, it checks for .pkg files. If it finds any, it processes them, extending __path__.
- it *then* checks for __init__.py, taking the first hit anywhere on __path__ (just like any module import would)
Just so people know: one way this __init__ search could be done, such that __path__ is set from the .pkg files, is to treat it as a reload (assuming .pkg files can only be found off of sys.path). -Brett
- if no .pkg was found, nor an __init__.py, it proceeds with the next sys.path item (skipping the directory entirely)
Martin v. Löwis wrote:
Chris Withers wrote:
Martin v. Löwis wrote:
I propose the following PEP for inclusion to Python 3.1. Please comment.

Would this support the following case:
I have a package called mortar, which defines useful stuff:
from mortar import content, ...
I now want to distribute large optional chunks separately, but ideally so that the following will work:
from mortar.rbd import ...
from mortar.zodb import ...
from mortar.wsgi import ...
Does the PEP support this?
That's the primary purpose of the PEP.
Are you sure? Does the PEP really allow for:

    from mortar import content
    from mortar.rdb import something

...where 'content' is a function defined in mortar/__init__.py and 'something' is a function defined in mortar/rdb/__init__.py *and* the following are separate distributions on PyPI:

- mortar
- mortar.rdb

...where 'mortar' does not contain 'mortar.rdb'.
You can do this today already (see the zope package,
No, they have nothing but a (functionally) empty __init__.py in the zope package.

cheers,
Chris
On Thu, Apr 2, 2009 at 4:33 PM, M.-A. Lemburg <mal@egenix.com> wrote:
On 2009-04-02 17:32, Martin v. Löwis wrote:
I propose the following PEP for inclusion to Python 3.1.
Thanks for picking this up.
I'd like to extend the proposal to Python 2.7 and later.
-1 to adding it to the 2.x series. There was much discussion around adding features to 2.x *and* 3.0, and the consensus seemed to *not* add new features to 2.x and use those new features as carrots to help lead people into 3.0. jesse
On Apr 6, 2009, at 9:21 AM, Jesse Noller wrote:
On Thu, Apr 2, 2009 at 4:33 PM, M.-A. Lemburg <mal@egenix.com> wrote:
On 2009-04-02 17:32, Martin v. Löwis wrote:
I propose the following PEP for inclusion to Python 3.1.
Thanks for picking this up.
I'd like to extend the proposal to Python 2.7 and later.
-1 to adding it to the 2.x series. There was much discussion around adding features to 2.x *and* 3.0, and the consensus seemed to *not* add new features to 2.x and use those new features as carrots to help lead people into 3.0.
Actually, isn't the policy just that nothing can go into 2.7 that isn't backported from 3.1? Whether the actual backport happens or not is up to the developer though. OTOH, we talked about a lot of things and my recollection is probably fuzzy.

Barry
On Apr 6, 2009, at 9:21 AM, Jesse Noller wrote:
On Thu, Apr 2, 2009 at 4:33 PM, M.-A. Lemburg <mal@egenix.com> wrote:
On 2009-04-02 17:32, Martin v. Löwis wrote:
I propose the following PEP for inclusion to Python 3.1.
Thanks for picking this up.
I'd like to extend the proposal to Python 2.7 and later.
-1 to adding it to the 2.x series. There was much discussion around adding features to 2.x *and* 3.0, and the consensus seemed to *not* add new features to 2.x and use those new features as carrots to help lead people into 3.0.
Actually, isn't the policy just that nothing can go into 2.7 that isn't backported from 3.1? Whether the actual backport happens or not is up to the developer though. OTOH, we talked about a lot of things and my recollection is probably fuzzy.
I believe Barry is correct. The official policy is "no features in 2.7 that aren't also in 3.1". Personally, I don't think I'm going to put anything else in 2.7, specifically the ',' formatter stuff from PEP 378. 3.1 has diverged too far from 2.7 in this regard to make the backport easy to do. But this decision is left up to the individual committer.
At 02:00 PM 4/6/2009 +0100, Chris Withers wrote:
Martin v. Löwis wrote:
Chris Withers wrote:
Would this support the following case:
I have a package called mortar, which defines useful stuff:
from mortar import content, ...
I now want to distribute large optional chunks separately, but ideally so that the following will work:
from mortar.rbd import ...
from mortar.zodb import ...
from mortar.wsgi import ...
Does the PEP support this?

That's the primary purpose of the PEP.
Are you sure?
Does the pep really allow for:
from mortar import content
from mortar.rdb import something
...where 'content' is a function defined in mortar/__init__.py and 'something' is a function defined in mortar/rdb/__init__.py *and* the following are separate distributions on PyPI:
- mortar
- mortar.rdb
...where 'mortar' does not contain 'mortar.rdb'.
See the third paragraph of http://www.python.org/dev/peps/pep-0382/#discussion
P.J. Eby wrote:
See the third paragraph of http://www.python.org/dev/peps/pep-0382/#discussion
Indeed, I guess the PEP could be made more explanatory then, 'cos as a packager I don't see what I'd put in the various setup.py and __init__.py files to make this work...

That said, I'm delighted to hear it's going to be possible, and I wholeheartedly support the PEP and its backporting to 2.7 as a result...

cheers,
Chris
On Mon, Apr 6, 2009 at 9:26 AM, Barry Warsaw <barry@python.org> wrote:
On Apr 6, 2009, at 9:21 AM, Jesse Noller wrote:
On Thu, Apr 2, 2009 at 4:33 PM, M.-A. Lemburg <mal@egenix.com> wrote:
On 2009-04-02 17:32, Martin v. Löwis wrote:
I propose the following PEP for inclusion to Python 3.1.
Thanks for picking this up.
I'd like to extend the proposal to Python 2.7 and later.
-1 to adding it to the 2.x series. There was much discussion around adding features to 2.x *and* 3.0, and the consensus seemed to *not* add new features to 2.x and use those new features as carrots to help lead people into 3.0.
Actually, isn't the policy just that nothing can go into 2.7 that isn't backported from 3.1? Whether the actual backport happens or not is up to the developer though. OTOH, we talked about a lot of things and my recollection is probably fuzzy.
Barry
That *is* the official policy, but there were discussions around no further backporting of features from 3.1 into 2.x, thereby providing more of an upgrade incentive.
On Mon, 6 Apr 2009 at 12:00, Jesse Noller wrote:
On Mon, Apr 6, 2009 at 9:26 AM, Barry Warsaw <barry@python.org> wrote:
On Apr 6, 2009, at 9:21 AM, Jesse Noller wrote:
On Thu, Apr 2, 2009 at 4:33 PM, M.-A. Lemburg <mal@egenix.com> wrote:
On 2009-04-02 17:32, Martin v. Löwis wrote:
I propose the following PEP for inclusion to Python 3.1.
Thanks for picking this up.
I'd like to extend the proposal to Python 2.7 and later.
-1 to adding it to the 2.x series. There was much discussion around adding features to 2.x *and* 3.0, and the consensus seemed to *not* add new features to 2.x and use those new features as carrots to help lead people into 3.0.
Actually, isn't the policy just that nothing can go into 2.7 that isn't backported from 3.1? Whether the actual backport happens or not is up to the developer though. OTOH, we talked about a lot of things and my recollection is probably fuzzy.
Barry
That *is* the official policy, but there were discussions around no further backporting of features from 3.1 into 2.x, thereby providing more of an upgrade incentive.
My sense was that this wasn't proposed as a hard and fast rule, more as a strongly suggested guideline. And in this case, I think you could argue that the PEP is actually fixing a bug in the current namespace packaging system. Some projects, especially the large ones where this matters most, are going to have to maintain backward compatibility for 2.x for a long time even as 3.x adoption accelerates. It seems a shame to require packagers to continue to deal with the problems caused by the current system even after all the platforms have made it to 2.7+. --David
On Mon, Apr 6, 2009 at 12:28 PM, R. David Murray <rdmurray@bitdance.com> wrote:
On Mon, 6 Apr 2009 at 12:00, Jesse Noller wrote:
On Mon, Apr 6, 2009 at 9:26 AM, Barry Warsaw <barry@python.org> wrote:
On Apr 6, 2009, at 9:21 AM, Jesse Noller wrote:
On Thu, Apr 2, 2009 at 4:33 PM, M.-A. Lemburg <mal@egenix.com> wrote:
On 2009-04-02 17:32, Martin v. Löwis wrote:
I propose the following PEP for inclusion to Python 3.1.
Thanks for picking this up.
I'd like to extend the proposal to Python 2.7 and later.
-1 to adding it to the 2.x series. There was much discussion around adding features to 2.x *and* 3.0, and the consensus seemed to *not* add new features to 2.x and use those new features as carrots to help lead people into 3.0.
Actually, isn't the policy just that nothing can go into 2.7 that isn't backported from 3.1? Whether the actual backport happens or not is up to the developer though. OTOH, we talked about a lot of things and my recollection is probably fuzzy.
Barry
That *is* the official policy, but there were discussions around no further backporting of features from 3.1 into 2.x, thereby providing more of an upgrade incentive.
My sense was that this wasn't proposed as a hard and fast rule, more as a strongly suggested guideline.
And in this case, I think you could argue that the PEP is actually fixing a bug in the current namespace packaging system.
Some projects, especially the large ones where this matters most, are going to have to maintain backward compatibility for 2.x for a long time even as 3.x adoption accelerates. It seems a shame to require packagers to continue to deal with the problems caused by the current system even after all the platforms have made it to 2.7+.
--David
I know it wasn't a hard and fast rule; also, with 3to2 already being worked on, the barrier to maintenance and backporting is going to be lowered.
Jesse Noller wrote:
On Mon, Apr 6, 2009 at 12:28 PM, R. David Murray <rdmurray@bitdance.com> wrote:
On Mon, 6 Apr 2009 at 12:00, Jesse Noller wrote:
On Mon, Apr 6, 2009 at 9:26 AM, Barry Warsaw <barry@python.org> wrote:
On Apr 6, 2009, at 9:21 AM, Jesse Noller wrote:
On Thu, Apr 2, 2009 at 4:33 PM, M.-A. Lemburg <mal@egenix.com> wrote:
On 2009-04-02 17:32, Martin v. Löwis wrote:
> I propose the following PEP for inclusion to Python 3.1.
Thanks for picking this up.
I'd like to extend the proposal to Python 2.7 and later.
-1 to adding it to the 2.x series. There was much discussion around adding features to 2.x *and* 3.0, and the consensus seemed to *not* add new features to 2.x and use those new features as carrots to help lead people into 3.0.

Actually, isn't the policy just that nothing can go into 2.7 that isn't backported from 3.1? Whether the actual backport happens or not is up to the developer though. OTOH, we talked about a lot of things and my recollection is probably fuzzy.
Barry

That *is* the official policy, but there were discussions around no further backporting of features from 3.1 into 2.x, thereby providing more of an upgrade incentive.

My sense was that this wasn't proposed as a hard and fast rule, more as a strongly suggested guideline.
And in this case, I think you could argue that the PEP is actually fixing a bug in the current namespace packaging system.
Some projects, especially the large ones where this matters most, are going to have to maintain backward compatibility for 2.x for a long time even as 3.x adoption accelerates. It seems a shame to require packagers to continue to deal with the problems caused by the current system even after all the platforms have made it to 2.7+.
--David
I know it wasn't a hard and fast rule; also, with 3to2 already being worked on, the barrier to maintenance and backporting is going to be lowered.
My understanding from the summit is that the only point in a 2.7 release at all is to lower the "speed bumps" which make porting from 2.x to 3.x hard for large codebases. In this case, having a consistent spelling for namespace packages between 2.7 and 3.1 would give those applications / frameworks / libraries an incentive to move to 2.7, and therefore ease getting them to 3.1.

Tres.
On 2009-04-03 02:44, P.J. Eby wrote:
At 10:33 PM 4/2/2009 +0200, M.-A. Lemburg wrote:
Alternative Approach: ---------------------
Wouldn't it be better to stick with a simpler approach and look for "__pkg__.py" files to detect namespace packages using that O(1) check?
One of the namespace packages, the defining namespace package, will have to include a __init__.py file.
Note that there is no such thing as a "defining namespace package" -- namespace package contents are symmetrical peers.
That was a definition :-)

Definition: defining namespace package := the namespace package having the __init__.py file

This is useful to have, since packages allowing integration of other sub-packages typically come as a base package with some basic infrastructure in place which is required by all other namespace packages.

If the __init__.py file is not found among the namespace directories, the importer will have to raise an exception, since the result would not be a proper Python package.
* It's possible to have a defining package dir and add-on package dirs.
Also possible in the PEP, although the __init__.py must be in the first such directory on sys.path. (However, such "defining" packages are not that common now, due to tool limitations.)
That's a strange limitation of the PEP. Why should the location of the __init__.py file depend on the order of sys.path?

-- 
Marc-Andre Lemburg
eGenix.com
On 2009-04-06 15:21, Jesse Noller wrote:
On Thu, Apr 2, 2009 at 4:33 PM, M.-A. Lemburg <mal@egenix.com> wrote:
On 2009-04-02 17:32, Martin v. Löwis wrote:
I propose the following PEP for inclusion to Python 3.1.
Thanks for picking this up.
I'd like to extend the proposal to Python 2.7 and later.
-1 to adding it to the 2.x series. There was much discussion around adding features to 2.x *and* 3.0, and the consensus seemed to *not* add new features to 2.x and use those new features as carrots to help lead people into 3.0.
I must have missed that discussion :-)

Where's the PEP pinning this down?

The Python 2.x user base is huge and the number of installed applications even larger.

Cutting these users and application developers off from important new features added to Python 3 is only going to work as a "carrot" for those developers who:

* have enough resources (time, money, manpower) to port their existing application to Python 3

* can persuade their users to switch to Python 3

* don't rely much on 3rd party libraries (the bread and butter of Python applications)

Realistically, such a porting effort is not likely going to happen for any decent-sized application, except perhaps a few open source ones.

Such a policy would then translate into a dead end for Python 2.x based applications.

-- 
Marc-Andre Lemburg
eGenix.com
[Resent due to a python.org mail server problem] On 2009-04-03 22:07, Martin v. Löwis wrote:
I'd like to extend the proposal to Python 2.7 and later.
I don't object, but I also don't want to propose this, so I added it to the discussion.
My (and perhaps other people's) concern is that 2.7 might well be the last release of the 2.x series. If so, adding this feature to it would make 2.7 an odd special case for users and providers of third party tools.
I certainly hope that we'll see more useful features backported from 3.x to the 2.x series, or forward-ported from 2.x to 3.x (depending on what the core developer preferences are). Regarding this particular PEP, it is quite possible to implement an importer that provides the functionality for Python versions 2.3-2.7, so it doesn't have to be an odd special case.
That's going to slow down Python package detection a lot - you'd replace an O(1) test with an O(n) scan.
I question that claim. In traditional Unix systems, the file system driver performs a linear search of the directory, so it's rather O(n)-in-kernel vs. O(n)-in-Python. Even for advanced file systems, you need at least O(log n) to determine whether a specific file is in a directory. For all practical purposes, the package directory will fit in a single disk block (containing a single .pkg file, and one or few subpackages), making listdir complete as fast as stat.
On second thought, you're right, it won't be that costly. It requires an os.listdir() scan due to the wildcard approach, and in some cases such a scan may not be possible, e.g. when using frozen packages. Indeed, the freeze mechanism would not even add the .pkg files - it only handles .py file content. The same is true for distutils, MANIFEST generators and other installer mechanisms - they would have to learn to package the .pkg files along with the Python files. Another problem with the .pkg file approach is that the file extension is already in use, e.g. for Mac OS X installers. You don't have those issues with the __pkg__.py file approach I suggested.
Wouldn't it be better to stick with a simpler approach and look for "__pkg__.py" files to detect namespace packages using that O(1) check?
Again - this wouldn't be O(1). More importantly, it breaks system packages, which now again have to deal with the conflicting file names if they want to install all portions into a single location.
True, but since that means changing the package infrastructure, I think it's fair to ask distributors who want to use that approach to also take care of looking into the __pkg__.py files and merging them if necessary. Most of the time the __pkg__.py files will be empty, so that's not really much to ask for.
This would also avoid any issues you'd otherwise run into if you want to maintain this scheme in an importer that doesn't have access to a list of files in a package directory, but is well capable of checking the existence of a file.
Do you have a specific mechanism in mind?
Yes: frozen modules and imports straight from a web resource. The .pkg file approach requires a directory scan and additional support from all importers. The __pkg__.py approach I suggested can use existing importers without modification, by checking for the existence of such a Python module in an importer-managed resource.

-- 
Marc-Andre Lemburg
eGenix.com
At 02:30 PM 4/7/2009 +0200, M.-A. Lemburg wrote:
Wouldn't it be better to stick with a simpler approach and look for "__pkg__.py" files to detect namespace packages using that O(1) check?
Again - this wouldn't be O(1). More importantly, it breaks system packages, which now again have to deal with the conflicting file names if they want to install all portions into a single location.
True, but since that means changing the package infrastructure, I think it's fair to ask distributors who want to use that approach to also take care of looking into the __pkg__.py files and merging them if necessary.
Most of the time the __pkg__.py files will be empty, so that's not really much to ask for.
This means your proposal actually doesn't add any benefit over the status quo, where you can have an __init__.py that does nothing but declare the package a namespace. We already have that now, and it doesn't need a new filename. Why would we expect OS vendors to start supporting it, just because we name it __pkg__.py instead of __init__.py?
On 2009-04-07 16:05, P.J. Eby wrote:
At 02:30 PM 4/7/2009 +0200, M.-A. Lemburg wrote:
Wouldn't it be better to stick with a simpler approach and look for "__pkg__.py" files to detect namespace packages using that O(1) check?
Again - this wouldn't be O(1). More importantly, it breaks system packages, which now again have to deal with the conflicting file names if they want to install all portions into a single location.
True, but since that means changing the package infrastructure, I think it's fair to ask distributors who want to use that approach to also take care of looking into the __pkg__.py files and merging them if necessary.
Most of the time the __pkg__.py files will be empty, so that's not really much to ask for.
This means your proposal actually doesn't add any benefit over the status quo, where you can have an __init__.py that does nothing but declare the package a namespace. We already have that now, and it doesn't need a new filename. Why would we expect OS vendors to start supporting it, just because we name it __pkg__.py instead of __init__.py?
I lost you there.

Since when do we support namespace packages in core Python without the need to add some form of magic support code to __init__.py?

My suggestion basically builds on the same idea as Martin's PEP, but uses a single __pkg__.py file as opposed to some non-Python file yaddayadda.pkg.

Here's a copy of the proposal, with some additional discussion bullets added:

"""
Alternative Approach:
---------------------

Wouldn't it be better to stick with a simpler approach and look for "__pkg__.py" files to detect namespace packages using that O(1) check?

This would also avoid any issues you'd otherwise run into if you want to maintain this scheme in an importer that doesn't have access to a list of files in a package directory, but is well capable of checking the existence of a file.

Mechanism:
----------

If the import mechanism finds a matching namespace package (a directory with a __pkg__.py file), it then goes into namespace package scan mode and scans the complete sys.path for more occurrences of the same namespace package.

The import loads all __pkg__.py files of matching namespace packages having the same package name during the search.

One of the namespace packages, the defining namespace package, will have to include a __init__.py file.

After having scanned all matching namespace packages and loaded the __pkg__.py files in the order of the search, the import mechanism then sets the package's __path__ attribute to include all namespace package directories found on sys.path and finally executes the __init__.py file.

(Please let me know if the above is not clear, I will then try to follow up on it.)

Discussion:
-----------

The above mechanism allows the same kind of flexibility we already have with the existing normal __init__.py mechanism.

* It doesn't add yet another .pth-style sys.path extension (which are difficult to manage in installations).

* It always uses the same naive sys.path search strategy. The strategy is not determined by some file contents.

* The search is only done once - on the first import of the package.

* It's possible to have a defining package dir and add-on package dirs.

* The search does not depend on the order of directories in sys.path. There's no requirement for the defining package to appear first on sys.path.

* Namespace packages are easy to recognize by testing for a single resource.

* There's no conflict with existing files using the .pkg extension, such as Mac OS X installer files or Solaris packages.

* Namespace __pkg__.py modules can provide extra meta-information, logging, etc. to simplify debugging namespace package setups.

* It's possible to freeze such setups, to put them into ZIP files, or to have only parts of them in a ZIP file and the other parts in the file system.

* There's no need for a package directory scan, allowing the mechanism to also work with resources that do not permit one to (easily and efficiently) scan the contents of a package "directory", e.g. frozen packages or imports from web resources.

Caveats:

* Changes to sys.path will not result in an automatic rescan for additional namespace packages, if the package was already loaded. However, we could have a function to make such a rescan explicit.
"""

-- 
Marc-Andre Lemburg
eGenix.com
On Tue, Apr 7, 2009 at 11:58 PM, M.-A. Lemburg <mal@egenix.com> wrote:
This means your proposal actually doesn't add any benefit over the status quo, where you can have an __init__.py that does nothing but declare the package a namespace. We already have that now, and it doesn't need a new filename. Why would we expect OS vendors to start supporting it, just because we name it __pkg__.py instead of __init__.py?
I lost you there.
Since when do we support namespace packages in core Python without the need to add some form of magic support code to __init__.py ?
I think P. Eby refers to the problem that most packaging systems don't like several packages to have the same file - be it empty or not. That's my main personal gripe against namespace packages, and from this POV, I think it is fair to say the proposal does not solve anything. Not that I have a solution, of course :)

cheers,

David
On Tue, Apr 7, 2009 at 5:25 AM, M.-A. Lemburg <mal@egenix.com> wrote:
On 2009-04-06 15:21, Jesse Noller wrote:
On Thu, Apr 2, 2009 at 4:33 PM, M.-A. Lemburg <mal@egenix.com> wrote:
On 2009-04-02 17:32, Martin v. Löwis wrote:
I propose the following PEP for inclusion to Python 3.1.
Thanks for picking this up.
I'd like to extend the proposal to Python 2.7 and later.
-1 to adding it to the 2.x series. There was much discussion around adding features to 2.x *and* 3.0, and the consensus seemed to *not* add new features to 2.x and use those new features as carrots to help lead people into 3.0.
I must have missed that discussion :-)
Where's the PEP pinning this down?
The Python 2.x user base is huge and the number of installed applications even larger.
Cutting these users and application developers off of important new features added to Python 3 is only going to work as "carrot" for those developers who:
* have enough resources (time, money, manpower) to port their existing application to Python 3
* can persuade their users to switch to Python 3
* don't rely much on 3rd party libraries (the bread and butter of Python applications)
Realistically, such a porting effort is not likely to happen for any decent-sized application, except perhaps a few open source ones.
Such a policy would then translate to a dead end for Python 2.x based applications.
Think of the advantages though! Python 2 will finally become *stable*. The group of users you are talking to usually balk at the thought of upgrading from 2.x to 2.(x+1) just as much as they might balk at the thought of Py3k. We're finally giving them what they really want.

Regarding calling this a dead end, we're committed to supporting 2.x for at least five years. If that's not enough, well, it's open source, so there's no reason why some group of rogue 2.x fans can't maintain it indefinitely after that.

-- --Guido van Rossum (home page: http://www.python.org/~guido/)
At 04:58 PM 4/7/2009 +0200, M.-A. Lemburg wrote:
On 2009-04-07 16:05, P.J. Eby wrote:
At 02:30 PM 4/7/2009 +0200, M.-A. Lemburg wrote:
Wouldn't it be better to stick with a simpler approach and look for "__pkg__.py" files to detect namespace packages using that O(1) check?
Again - this wouldn't be O(1). More importantly, it breaks system packages, which now again have to deal with the conflicting file names if they want to install all portions into a single location.
True, but since that means changing the package infrastructure, I think it's fair to ask distributors who want to use that approach to also take care of looking into the __pkg__.py files and merging them if necessary.
Most of the time the __pkg__.py files will be empty, so that's not really much to ask for.
This means your proposal actually doesn't add any benefit over the status quo, where you can have an __init__.py that does nothing but declare the package a namespace. We already have that now, and it doesn't need a new filename. Why would we expect OS vendors to start supporting it, just because we name it __pkg__.py instead of __init__.py?
I lost you there.
Since when do we support namespace packages in core Python without the need to add some form of magic support code to __init__.py?
My suggestion basically builds on the same idea as Martin's PEP, but uses a single __pkg__.py file as opposed to some non-Python file yaddayadda.pkg.
Right... which completely obliterates the primary benefit of the original proposal compared to the status quo. That is, that the PEP 382 way is more compatible with system packaging tools.

Without that benefit, there's zero gain in your proposal over having __init__.py files just call pkgutil.extend_path() (in the stdlib since 2.3, btw) or pkg_resources.declare_namespace() (similar functionality, but with zipfile support and some other niceties).

IOW, your proposal doesn't actually improve the status quo in any way that I am able to determine, except that it calls for loading all the __pkg__.py modules, rather than just the first one. (And the setuptools implementation of namespace packages actually *does* load multiple __init__.py's, so that's still no change over the status quo for setuptools-using packages.)
Martin v. Löwis wrote:
Such a policy would then translate to a dead end for Python 2.x based applications.
2.x based applications *are* in a dead end, with the only exit being porting to 3.x.
The actual end of the dead end just happens to be in 2013 or so :)

Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Thu, Apr 09, 2009, Nick Coghlan wrote:
Martin v. Löwis wrote:
Such a policy would then translate to a dead end for Python 2.x based applications.
2.x based applications *are* in a dead end, with the only exit being porting to 3.x.
The actual end of the dead end just happens to be in 2013 or so :)
More like 2016 or 2020 -- as of January, my former employer was still using Python 2.3, and I wouldn't be surprised if 1.5.2 was still out in the wilds. The transition to 3.x is more extreme, and lots of people will continue making do for years after any formal support is dropped.

Whether this warrants including PEP 382 in 2.x, I don't know; I still don't really understand this proposal.

-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/
Why is this newsgroup different from all other newsgroups?
Aahz wrote:
On Thu, Apr 09, 2009, Nick Coghlan wrote:
Martin v. Löwis wrote:
Such a policy would then translate to a dead end for Python 2.x based applications.
2.x based applications *are* in a dead end, with the only exit being porting to 3.x.
The actual end of the dead end just happens to be in 2013 or so :)
More like 2016 or 2020 -- as of January, my former employer was still using Python 2.3, and I wouldn't be surprised if 1.5.2 was still out in the wilds.
Indeed - I know of a system that will finally be migrating from Python 2.2 to Python *2.4* later this year :)
The transition to 3.x is more extreme, and lots of people will continue making do for years after any formal support is dropped.
Yeah, I was only referring to the likely minimum time frame that python-dev would continue providing security releases. As you say, the actual 2.x version of the language will live on long after the day we close all remaining 2.x only bug reports and patches as "out of date".
Whether this warrants including PEP 382 in 2.x, I don't know; I still don't really understand this proposal.
I'd personally still prefer to keep the guideline that new features that are easy to backport *should* be backported, but that's really a decision for the authors of each new feature.

Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Thu, Apr 9, 2009 at 5:53 AM, Aahz <aahz@pythoncraft.com> wrote:
On Thu, Apr 09, 2009, Nick Coghlan wrote:
Martin v. Löwis wrote:
Such a policy would then translate to a dead end for Python 2.x based applications.
2.x based applications *are* in a dead end, with the only exit being porting to 3.x.
The actual end of the dead end just happens to be in 2013 or so :)
More like 2016 or 2020 -- as of January, my former employer was still using Python 2.3, and I wouldn't be surprised if 1.5.2 was still out in the wilds. The transition to 3.x is more extreme, and lots of people will continue making do for years after any formal support is dropped.
There's nothing wrong with that. People using 1.5.2 today certainly aren't asking for support, and people using 2.3 probably aren't expecting much either. That's fine, those Python versions are as stable as the rest of their environment. (I betcha they're still using GCC 2.96 too, though they probably don't have any reason to build a new Python binary from source. :-)

People *will* be using 2.6 well past 2013. But will they care about the Python community actively supporting it? Of course not! Anything we did would probably break something for them.

-- --Guido van Rossum (home page: http://www.python.org/~guido/)
On 2009-04-07 19:46, P.J. Eby wrote:
At 04:58 PM 4/7/2009 +0200, M.-A. Lemburg wrote:
At 02:30 PM 4/7/2009 +0200, M.-A. Lemburg wrote:
Wouldn't it be better to stick with a simpler approach and look for "__pkg__.py" files to detect namespace packages using that O(1) check?
Again - this wouldn't be O(1). More importantly, it breaks system packages, which now again have to deal with the conflicting file names if they want to install all portions into a single location.
True, but since that means changing the package infrastructure, I think it's fair to ask distributors who want to use that approach to also take care of looking into the __pkg__.py files and merging them if necessary.
Most of the time the __pkg__.py files will be empty, so that's not really much to ask for.
This means your proposal actually doesn't add any benefit over the status quo, where you can have an __init__.py that does nothing but declare the package a namespace. We already have that now, and it doesn't need a new filename. Why would we expect OS vendors to start supporting it, just because we name it __pkg__.py instead of __init__.py?
I lost you there.
Since when do we support namespace packages in core Python without the need to add some form of magic support code to __init__.py?
My suggestion basically builds on the same idea as Martin's PEP, but uses a single __pkg__.py file as opposed to some non-Python file yaddayadda.pkg.
Right... which completely obliterates the primary benefit of the original proposal compared to the status quo. That is, that the PEP 382 way is more compatible with system packaging tools.
Without that benefit, there's zero gain in your proposal over having __init__.py files just call pkgutil.extend_path() (in the stdlib since 2.3, btw) or pkg_resources.declare_namespace() (similar functionality, but with zipfile support and some other niceties).
IOW, your proposal doesn't actually improve the status quo in any way that I am able to determine, except that it calls for loading all the __pkg__.py modules, rather than just the first one. (And the setuptools implementation of namespace packages actually *does* load multiple __init__.py's, so that's still no change over the status quo for setuptools-using packages.)
The purpose of the PEP is to create a standard for namespace packages. That's orthogonal to trying to enhance or change some existing techniques.

I don't see the emphasis in the PEP on Linux distribution support and the remote possibility of them wanting to combine separate packages back into one package as a good argument for adding yet another separate hierarchy of special files which Python scans during imports. That said, note that most distributions actually take the other route: they try to split up larger packages into smaller ones, so the argument becomes even weaker.

It is much more important to standardize the approach than to try to extend some existing trickery and make it even more opaque than it already is by introducing yet another level of complexity. My alternative approach builds on existing methods and fits nicely with the __init__.py approach Python has already been using for more than a decade now. It's transparent, easy to understand and provides enough functionality to build upon - much like the original __init__.py idea.

I've already laid out the arguments for and against it in my previous reply, so won't repeat them here.

-- Marc-Andre Lemburg, eGenix.com
On 2009-04-07 18:19, Guido van Rossum wrote:
On Tue, Apr 7, 2009 at 5:25 AM, M.-A. Lemburg <mal@egenix.com> wrote:
On 2009-04-06 15:21, Jesse Noller wrote:
On Thu, Apr 2, 2009 at 4:33 PM, M.-A. Lemburg <mal@egenix.com> wrote:
On 2009-04-02 17:32, Martin v. Löwis wrote:
I propose the following PEP for inclusion to Python 3.1.
Thanks for picking this up.
I'd like to extend the proposal to Python 2.7 and later.
-1 to adding it to the 2.x series. There was much discussion around adding features to 2.x *and* 3.0, and the consensus seemed to *not* add new features to 2.x and use those new features as carrots to help lead people into 3.0.
I must have missed that discussion :-)
Where's the PEP pinning this down?
The Python 2.x user base is huge and the number of installed applications even larger.
Cutting these users and application developers off of important new features added to Python 3 is only going to work as "carrot" for those developers who:
* have enough resources (time, money, manpower) to port their existing application to Python 3
* can persuade their users to switch to Python 3
* don't rely much on 3rd party libraries (the bread and butter of Python applications)
Realistically, such a porting effort is not likely to happen for any decent-sized application, except perhaps a few open source ones.
Such a policy would then translate to a dead end for Python 2.x based applications.
Think of the advantages though! Python 2 will finally become *stable*. The group of users you are talking to are usually balking at the thought of upgrading from 2.x to 2.(x+1) just as much as they might balk at the thought of Py3k. We're finally giving them what they really want.
Python 2.x is stable - much more so than 3.x is today. However, stable does not mean zero development, which a "no new features in Python 2.x" policy would translate to. If there are core developers who care about 2.x, then it should be possible for them to add the necessary patches to future 2.x releases.
Regarding calling this a dead end, we're committed to supporting 2.x for at least five years. If that's not enough, well, it's open source, so there's no reason why some group of rogue 2.x fans can't maintain it indefinitely after that.
Sure, but why can't this be done within the existing Python developer community?

-- Marc-Andre Lemburg, eGenix.com
At 05:02 PM 4/14/2009 +0200, M.-A. Lemburg wrote:
I don't see the emphasis in the PEP on Linux distribution support and the remote possibility of them wanting to combine separate packages back into one package as good argument for adding yet another separate hierarchy of special files which Python scans during imports.
That said, note that most distributions actually take the other route: they try to split up larger packages into smaller ones, so the argument becomes even weaker.
I think you've misunderstood something about the use case. System packaging tools don't like separate packages to contain the *same file*. That means that they *can't* split a larger package up with your proposal, because every one of those packages would have to contain a __pkg__.py -- and thus be in conflict with each other. Either that, or they would have to make a separate system package containing *only* the __pkg__.py, and then make all packages using the namespace depend on it -- which is more work and requires greater co-ordination among packagers.

Allowing each system package to contain its own .pkg or .nsp or whatever files, on the other hand, allows each system package to be built independently, without conflict between contents (i.e., having the same file), and without requiring a special pseudo-package to contain the additional file.

Also, as for executing multiple __pkg__.py files: when multiple system packages are installed to the same site-packages directory, only one of the __pkg__.py files could possibly be present and executed. (Note that, even though the system packages themselves are not "combined", in practice they will all be installed to the same directory, i.e., site-packages or the platform equivalent thereof.)
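To make the contrast concrete, here is roughly how the PEP 382 layout avoids the conflict (package and file names are illustrative):

    system package A installs:
        zope/A.pkg          # contains a single '*'
        zope/a_module.py

    system package B installs:
        zope/B.pkg
        zope/b_module.py

Each portion ships a distinctly named *.pkg file, so both system packages can install into the same zope/ directory without ever claiming the same file.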
On 2009-04-14 18:27, P.J. Eby wrote:
At 05:02 PM 4/14/2009 +0200, M.-A. Lemburg wrote:
I don't see the emphasis in the PEP on Linux distribution support and the remote possibility of them wanting to combine separate packages back into one package as good argument for adding yet another separate hierarchy of special files which Python scans during imports.
That said, note that most distributions actually take the other route: they try to split up larger packages into smaller ones, so the argument becomes even weaker.
I think you've misunderstood something about the use case. System packaging tools don't like separate packages to contain the *same file*. That means that they *can't* split a larger package up with your proposal, because every one of those packages would have to contain a __pkg__.py -- and thus be in conflict with each other. Either that, or they would have to make a separate system package containing *only* the __pkg__.py, and then make all packages using the namespace depend on it -- which is more work and requires greater co-ordination among packagers.
You are missing the point: When breaking up a large package that lives in site-packages into smaller distribution bundles, you don't need namespace packages at all, so the PEP doesn't apply.

The way this works is by having a base distribution bundle that includes the needed __init__.py file and a set of extension bundles that add other files to the same directory (without including another copy of __init__.py). The extension bundles include a dependency on the base package to make sure that it always gets installed first.

Debian has been using that approach for egenix-mx-base for years. Works great: http://packages.debian.org/source/lenny/egenix-mx-base

eGenix has been using that approach for mx package add-ons as well - long before "namespace" packages were given that name :-)

Please note that the PEP is about providing ways to have package parts live on sys.path that reintegrate themselves into a single package at import time. As such, it's targeting Python developers that want to ship add-ons to existing packages, not Linux distributions (they usually have their own ideas about what goes where - something that's completely out-of-scope for the PEP).

-- Marc-Andre Lemburg, eGenix.com
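For bystanders, the bundle layout described above looks roughly like this on disk (names are illustrative, not the actual Debian packaging):

    base bundle (installs first, owns the package):
        mx/__init__.py
        mx/BeeBase/...

    add-on bundle (declares a dependency on the base bundle):
        mx/ODBC/...         # no second copy of mx/__init__.py

Since only the base bundle ships __init__.py, the system packager never sees two bundles claiming the same file.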
At 10:59 PM 4/14/2009 +0200, M.-A. Lemburg wrote:
You are missing the point: When breaking up a large package that lives in site-packages into smaller distribution bundles, you don't need namespace packages at all, so the PEP doesn't apply.
The way this works is by having a base distribution bundle that includes the needed __init__.py file and a set of extension bundles that add other files to the same directory (without including another copy of __init__.py). The extension bundles include a dependency on the base package to make sure that it always gets installed first.

If we're going to keep that practice, there's no point to having the PEP: all three methods (base+extensions, pkgutil, setuptools) work just fine as they are, with no changes to importing or the stdlib. In particular, without the feature of being able to drop that practice, there would be no reason for setuptools to adopt the PEP. That's why I'm -1 on your proposal: it's actually inferior to the methods we already have today.
On 2009-04-15 02:32, P.J. Eby wrote:
At 10:59 PM 4/14/2009 +0200, M.-A. Lemburg wrote:
You are missing the point: When breaking up a large package that lives in site-packages into smaller distribution bundles, you don't need namespace packages at all, so the PEP doesn't apply.
The way this works is by having a base distribution bundle that includes the needed __init__.py file and a set of extension bundles that add other files to the same directory (without including another copy of __init__.py). The extension bundles include a dependency on the base package to make sure that it always gets installed first.
If we're going to keep that practice, there's no point to having the PEP: all three methods (base+extensions, pkgutil, setuptools) work just fine as they are, with no changes to importing or the stdlib.
Again: the PEP is about creating a standard for namespace packages. It's not about making namespace packages easy to use for Linux distribution maintainers. Instead, it's targeting *developers* that want to enable shipping a single package in multiple, separate pieces, giving the user the freedom to select the ones she needs. Of course, this is possible today using various other techniques. The point is that there is no standard for namespace packages, and that's what the PEP is trying to solve.
In particular, without the feature of being able to drop that practice, there would be no reason for setuptools to adopt the PEP. That's why I'm -1 on your proposal: it's actually inferior to the methods we already have today.
It's simpler and more in line with the Python Zen, not inferior.

You are free not to support it in setuptools - the methods implemented in setuptools will continue to work as they are, but continue to require support code and, over time, no longer be compatible with other tools building upon the standard defined in the PEP.

In the end, it's the user who decides whether to go with a standard or not.

-- Marc-Andre Lemburg, eGenix.com
At 09:51 AM 4/15/2009 +0200, M.-A. Lemburg wrote:
On 2009-04-15 02:32, P.J. Eby wrote:
At 10:59 PM 4/14/2009 +0200, M.-A. Lemburg wrote:
You are missing the point: When breaking up a large package that lives in site-packages into smaller distribution bundles, you don't need namespace packages at all, so the PEP doesn't apply.
The way this works is by having a base distribution bundle that includes the needed __init__.py file and a set of extension bundles that add other files to the same directory (without including another copy of __init__.py). The extension bundles include a dependency on the base package to make sure that it always gets installed first.
If we're going to keep that practice, there's no point to having the PEP: all three methods (base+extensions, pkgutil, setuptools) work just fine as they are, with no changes to importing or the stdlib.
Again: the PEP is about creating a standard for namespace packages. It's not about making namespace packages easy to use for Linux distribution maintainers. Instead, it's targeting *developers* that want to enable shipping a single package in multiple, separate pieces, giving the user the freedom to select the ones she needs.
Of course, this is possible today using various other techniques. The point is that there is no standard for namespace packages and that's what the PEP is trying to solve.
In particular, without the feature of being able to drop that practice, there would be no reason for setuptools to adopt the PEP. That's why I'm -1 on your proposal: it's actually inferior to the methods we already have today.
It's simpler and more in line with the Python Zen, not inferior.
You are free not to support it in setuptools - the methods implemented in setuptools will continue to work as they are, but continue to require support code and, over time, no longer be compatible with other tools building upon the standard defined in the PEP.
In the end, it's the user who decides whether to go with a standard or not.
Up until this point, I've been trying to help you understand the use cases, but it's clear now that you already understand them; you just don't care. That wouldn't be a problem if you just stayed on the sidelines, instead of actively working to make those use cases more difficult for everyone else than they already are.

Anyway, since you clearly understand precisely what you're doing, I'm now going to stop trying to explain things, as my responses are apparently just encouraging you, and possibly convincing bystanders that there's some genuine controversy here as well.
[much quote-trimming, the following is intended to just give the gist, but the bits quoted below are not in direct response to each other]
On Wed, Apr 15, 2009, P.J. Eby wrote:
At 09:51 AM 4/15/2009 +0200, M.-A. Lemburg wrote:
[...] Again: the PEP is about creating a standard for namespace packages. It's not about making namespace packages easy to use for Linux distribution maintainers. Instead, it's targeting *developers* that want to enable shipping a single package in multiple, separate pieces, giving the user the freedom to select the ones she needs. [...]
[...] Anyway, since you clearly understand precisely what you're doing, I'm now going to stop trying to explain things, as my responses are apparently just encouraging you, and possibly convincing bystanders that there's some genuine controversy here as well.
For the benefit of us bystanders, could you summarize your vote at this point? Given the PEP's intended goals, if you do not oppose the PEP, are there any changes you think should be made?

-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/
Why is this newsgroup different from all other newsgroups?
On 2009-04-15 16:44, P.J. Eby wrote:
At 09:51 AM 4/15/2009 +0200, M.-A. Lemburg wrote:
On 2009-04-15 02:32, P.J. Eby wrote:
At 10:59 PM 4/14/2009 +0200, M.-A. Lemburg wrote:
You are missing the point: When breaking up a large package that lives in site-packages into smaller distribution bundles, you don't need namespace packages at all, so the PEP doesn't apply.
The way this works is by having a base distribution bundle that includes the needed __init__.py file and a set of extension bundles that add other files to the same directory (without including another copy of __init__.py). The extension bundles include a dependency on the base package to make sure that it always gets installed first.
If we're going to keep that practice, there's no point to having the PEP: all three methods (base+extensions, pkgutil, setuptools) work just fine as they are, with no changes to importing or the stdlib.
Again: the PEP is about creating a standard for namespace packages. It's not about making namespace packages easy to use for Linux distribution maintainers. Instead, it's targeting *developers* that want to enable shipping a single package in multiple, separate pieces, giving the user the freedom to select the ones she needs.
Of course, this is possible today using various other techniques. The point is that there is no standard for namespace packages and that's what the PEP is trying to solve.
In particular, without the feature of being able to drop that practice, there would be no reason for setuptools to adopt the PEP. That's why I'm -1 on your proposal: it's actually inferior to the methods we already have today.
It's simpler and more in line with the Python Zen, not inferior.
You are free not to support it in setuptools - the methods implemented in setuptools will continue to work as they are, but continue to require support code and, over time, no longer be compatible with other tools building upon the standard defined in the PEP.
In the end, it's the user who decides whether to go with a standard or not.
Up until this point, I've been trying to help you understand the use cases, but it's clear now that you already understand them; you just don't care.
That wouldn't be a problem if you just stayed on the sidelines, instead of actively working to make those use cases more difficult for everyone else than they already are.
Anyway, since you clearly understand precisely what you're doing, I'm now going to stop trying to explain things, as my responses are apparently just encouraging you, and possibly convincing bystanders that there's some genuine controversy here as well.
Hopefully, bystanders will understand that the one single use case you are always emphasizing, namely that of Linux distribution maintainers trying to change the package installation layout, is really a rather uncommon and rare use case.

It is true that I do understand what the namespace package idea is all about. I've been active in Python package development since packages were first added to Python as a new built-in import feature in Python 1.5, and have been distributing packages with package add-ons for more than a decade... For some history, have a look at: http://www.python.org/doc/essays/packages.html

Also note how that essay discourages the use of .pth files:

""" If the package really requires adding one or more directories on sys.path (e.g. because it has not yet been structured to support dotted-name import), a "path configuration file" named package.pth can be placed in either the site-python or site-packages directory. ... A typical installation should have no or very few .pth files or something is wrong, and if you need to play with the search order, something is very wrong. """

Back to the PEP: The much more common use case is that of wanting to have a base package installation with optional add-ons that live in the same logical package namespace. The PEP provides a way to solve this use case by giving both developers and users a standard they can follow, without having to rely on some non-standard helpers, across Python implementations.

My proposal tries to solve this without adding yet another .pth-file-like mechanism - hopefully in the spirit of the original Python package idea.

-- Marc-Andre Lemburg, eGenix.com
On Apr 15, 2009, at 12:15 PM, M.-A. Lemburg wrote:
The much more common use case is that of wanting to have a base package installation with optional add-ons that live in the same logical package namespace.
The PEP provides a way to solve this use case by giving both developers and users a standard they can follow, without having to rely on some non-standard helpers, across Python implementations.
I'm not sure I understand what advantage your proposal gives over the current mechanism for doing this. That is, add to your __init__.py file:

    from pkgutil import extend_path
    __path__ = extend_path(__path__, __name__)

Can you describe the intended advantages over the status-quo a bit more clearly?

James
At 09:10 AM 4/15/2009 -0700, Aahz wrote:
For the benefit of us bystanders, could you summarize your vote at this point? Given the PEP's intended goals, if you do not oppose the PEP, are there any changes you think should be made?
I'm +1 on Martin's original version of the PEP, subject to the point brought up by someone that .pkg should be changed to a different extension.

I'm -1 on all of MAL's proposed revisions, as IMO they are a step backwards: they "standardize" an approach that will create problems that don't need to exist, and don't exist now. Martin's proposal is an improvement on the status quo; Marc's proposal is a dis-improvement.
At 06:15 PM 4/15/2009 +0200, M.-A. Lemburg wrote:
The much more common use case is that of wanting to have a base package installation with optional add-ons that live in the same logical package namespace.
Please see the large number of Zope and PEAK distributions on PyPI as minimal examples that disprove this being the common use case. I expect you will find a fair number of others, as well.

In these cases, there is NO "base package"... the entire point of using namespace packages for these distributions is that a "base package" is neither necessary nor desirable.

In other words, the "base package" scenario is the exception these days, not the rule. I actually know specifically of only one other such package besides your mx.* case, the logilab ll.* package.
On 2009-04-15 19:38, James Y Knight wrote:
On Apr 15, 2009, at 12:15 PM, M.-A. Lemburg wrote:
The much more common use case is that of wanting to have a base package installation with optional add-ons that live in the same logical package namespace.
The PEP provides a way to solve this use case by giving both developers and users a standard they can follow, without having to rely on some non-standard helpers, across Python implementations.
I'm not sure I understand what advantage your proposal gives over the current mechanism for doing this.
That is, add to your __init__.py file:
    from pkgutil import extend_path
    __path__ = extend_path(__path__, __name__)
Can you describe the intended advantages over the status-quo a bit more clearly?
Simple: you don't need the above lines in your __init__.py file anymore and can rely on a Python standard for namespace packages instead of some helper implementation.

The fact that you have a __pkg__.py file in your package dir will signal the namespace package character to Python's importer, which will then take care of the lookup process for you. Namespace packages will be just as easy to write, install and maintain as regular Python packages.

-- Marc-Andre Lemburg, eGenix.com
On 2009-04-15 19:59, P.J. Eby wrote:
At 06:15 PM 4/15/2009 +0200, M.-A. Lemburg wrote:
The much more common use case is that of wanting to have a base package installation with optional add-ons that live in the same logical package namespace.
Please see the large number of Zope and PEAK distributions on PyPI as minimal examples that disprove this being the common use case. I expect you will find a fair number of others, as well.
In these cases, there is NO "base package"... the entire point of using namespace packages for these distributions is that a "base package" is neither necessary nor desirable.
In other words, the "base package" scenario is the exception these days, not the rule. I actually know specifically of only one other such package besides your mx.* case, the logilab ll.* package.
So now you're arguing against having base packages... at least you've dropped the strange idea of using Linux distribution maintainers as the central use case ;-)

Think of base namespace packages (the ones providing the __init__.py file) as defining the namespace. They set up ownership and the basic infrastructure needed by add-ons.

If you take Zope as an example, the Products/ package dir is a good one: the __init__.py file in that directory is provided by the Zope installation (generated during Zope instance creation), so Zope "owns" the package. With the proposal, Zope could declare this package dir a namespace base package by adding a __pkg__.py file to it. Zope add-ons could then be installed somewhere else on sys.path and include a Products/ dir as well, only this time it doesn't have the __init__.py file, but only a __pkg__.py file. Python would then take care of integrating the add-on Products/ dir's module and package contents with the base package.

-- Marc-Andre Lemburg, eGenix.com
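A sketch of the layout being described (directory and add-on names are illustrative):

    <Zope instance>/Products/           # base package, owned by Zope
        __init__.py
        __pkg__.py
    <elsewhere on sys.path>/Products/   # add-on portion
        __pkg__.py
        MyAddon/...                     # hypothetical add-on contents

Under the __pkg__.py proposal, importing Products would merge both directories into a single package __path__, with the base portion supplying the only __init__.py.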
On Wed, Apr 15, 2009 at 01:59:34PM -0400, P.J. Eby wrote:
Please see the large number of Zope and PEAK distributions on PyPI as minimal examples that disprove this being the common use case. I expect you will find a fair number of others, as well. ... In other words, the "base package" scenario is the exception these days, not the rule. I actually know specifically of only one other such package besides your mx.* case, the logilab ll.* package.
Isn't that pretty even, then? zope.* and PEAK are two examples of one approach; and mx.* and ll.* are two examples that use the base package approach. Neither approach seems to be the more common one, and both are pretty rare. --amk
At 02:52 PM 4/15/2009 -0400, A.M. Kuchling wrote:
On Wed, Apr 15, 2009 at 01:59:34PM -0400, P.J. Eby wrote:
Please see the large number of Zope and PEAK distributions on PyPI as minimal examples that disprove this being the common use case. I expect you will find a fair number of others, as well. ... In other words, the "base package" scenario is the exception these days, not the rule. I actually know specifically of only one other such package besides your mx.* case, the logilab ll.* package.
Isn't that pretty even, then? zope.* and PEAK are two examples of one approach; and mx.* and ll.* are two examples that use the base package approach. Neither approach seems to be the more common one, and both are pretty rare.
If you view the package listings on PyPI, you'll see that the "pure" namespaces currently in use include:

    alchemist.* amplecode.* atomisator.* bda.* benri.* beyondskins.* bliptv.* bopen.* borg.* bud.* ...

This is just going down to the 'b's, looking only at packages whose PyPI project name reflects a nested package name, and only including those with entries that:

1. use setuptools,
2. declare one or more namespace packages, and
3. do not depend on some sort of "base" or "core" package.

Technically, setuptools doesn't support base packages anyway, but if the organization appeared to be based on a "core+plugins/addons" model (as opposed to "collection of packages grouped in a namespace") I didn't include it in the list above -- i.e., I'm bending over backwards to be fair in the count.

If somebody wants to do a formal count of base vs. pure, it might provide interesting stats. I initially only mentioned Zope and PEAK because I have direct knowledge of the developers' intent regarding their namespace packages. However, now that I've actually looked at a tiny sample of PyPI, it's clear that the actual field use of pure namespace packages has positively exploded since setuptools made it practical to use them.

It's unclear, however, who is using base packages besides mx.* and ll.*, although I'd guess from the PyPI listings that perhaps Django is. (It seems that "base" packages are more likely to use a 'base-extension' naming pattern, vs. the 'namespace.project' pattern used by "pure" packages.)

Of course, I am certainly not opposed to supporting base packages, and Martin's version of PEP 382 is a plus for setuptools because it would allow setuptools to better support the "base" scenario. But pure packages are definitely not a minority; in fact, a superficial observation of the full PyPI list suggests that there may be almost as many projects using pure-namespace packages as there are non-namespaced projects!
On Wed, Apr 15, 2009 at 9:22 PM, P.J. Eby <pje@telecommunity.com> wrote:
At 02:52 PM 4/15/2009 -0400, A.M. Kuchling wrote:
On Wed, Apr 15, 2009 at 01:59:34PM -0400, P.J. Eby wrote:
Please see the large number of Zope and PEAK distributions on PyPI as minimal examples that disprove this being the common use case. I expect you will find a fair number of others, as well. ... In other words, the "base package" scenario is the exception these days, not the rule. I actually know specifically of only one other such package besides your mx.* case, the logilab ll.* package.
Isn't that pretty even, then? zope.* and PEAK are two examples of one approach; and mx.* and ll.* are two examples that use the base package approach. Neither approach seems to be the more common one, and both are pretty rare.
If you view the package listings on PyPI, you'll see that the "pure" namespaces currently in use include:
alchemist.* amplecode.* atomisator.* bda.* benri.* beyondskins.* bliptv.* bopen.* borg.* bud.* ...
This is just going down to the 'b's, looking only at packages whose PyPI project name reflects a nested package name, and only including those with entries that:
1. use setuptools,
2. declare one or more namespace packages, and
3. do not depend on some sort of "base" or "core" package.
Technically, setuptools doesn't support base packages anyway, but if the organization appeared to be based on a "core+plugins/addons" model (as opposed to "collection of packages grouped in a namespace") I didn't include it in the list above -- i.e., I'm bending over backwards to be fair in the count.
If somebody wants to do a formal count of base vs. pure, it might provide interesting stats. I initially only mentioned Zope and PEAK because I have direct knowledge of the developers' intent regarding their namespace packages.
However, now that I've actually looked at a tiny sample of PyPI, it's clear that the actual field use of pure namespace packages has positively exploded since setuptools made it practical to use them.
It's unclear, however, who is using base packages besides mx.* and ll.*, although I'd guess from the PyPI listings that perhaps Django is. (It seems that "base" packages are more likely to use a 'base-extension' naming pattern, vs. the 'namespace.project' pattern used by "pure" packages.)
Of course, I am certainly not opposed to supporting base packages, and Martin's version of PEP 382 is a plus for setuptools because it would allow setuptools to better support the "base" scenario.
But pure packages are definitely not a minority; in fact, a superficial observation of the full PyPI list suggests that there may be almost as many projects using pure-namespace packages, as there are non-namespaced projects!
In the survey I have done on packaging, 34% of the people who answered are using the setuptools namespace feature, which currently makes it impossible to use the namespace for the base package.

Now, for the "base" or "core" package, here's what people that use setuptools do most of the time:

1. They use zc.buildout, so they don't need a base package: they list in a configuration file all the packages needed to build the application, and one of these packages happens to have the scripts to launch the application.
2. They have a "main" package that doesn't use the same namespace, but uses the setuptools install_requires metadata to include namespaced packages. It acts like zc.buildout in some ways. For example, you mentioned atomisator.* in your example; this app has a main package called "Atomisator" (notice the upper A) that uses strategy #2.

But frankly, the "base package" scenario is not widespread these days simply because it's not obvious to do it without depending on an OS that has its own strategy to install packages. For example, if you are not under Debian, it's a pain to use logilab packages, because they use this common namespace for several packages and a plain Python installation of the various packages won't work out of the box under other systems like Windows. (And for pylint, I ended up creating my own distribution for Windows...)

So:
- having namespaces natively in Python is a big win (Namespaces are one honking great idea -- let's do more of those!)
- being able to still write some code under the primary namespace is something I (and lots of people) wish we could do with setuptools, so it's a big win too.

Regards, Tarek
-- Tarek Ziadé | http://ziade.org
On 2009-04-15 21:22, P.J. Eby wrote:
At 02:52 PM 4/15/2009 -0400, A.M. Kuchling wrote:
On Wed, Apr 15, 2009 at 01:59:34PM -0400, P.J. Eby wrote:
Please see the large number of Zope and PEAK distributions on PyPI as minimal examples that disprove this being the common use case. I expect you will find a fair number of others, as well. ... In other words, the "base package" scenario is the exception these days, not the rule. I actually know specifically of only one other such package besides your mx.* case, the logilab ll.* package.
Isn't that pretty even, then? zope.* and PEAK are two examples of one approach; and mx.* and ll.* are two examples that use the base package approach. Neither approach seems to be the more common one, and both are pretty rare.
If you view the package listings on PyPI, you'll see that the "pure" namespaces currently in use include:
alchemist.* amplecode.* atomisator.* bda.* benri.* beyondskins.* bliptv.* bopen.* borg.* bud.* ...
This is just going down to the 'b's, looking only at packages whose PyPI project name reflects a nested package name, and only including those with entries that:
1. use setuptools,
2. declare one or more namespace packages, and
3. do not depend on some sort of "base" or "core" package.
Technically, setuptools doesn't support base packages anyway, but if the organization appeared to be based on a "core+plugins/addons" model (as opposed to "collection of packages grouped in a namespace") I didn't include it in the list above -- i.e., I'm bending over backwards to be fair in the count.
Hmm, setuptools doesn't support the notion of base packages, i.e. packages that provide their own __init__.py module, so I fail to see how your list or any other list of setuptools-dependent packages can be taken as an indicator for anything related to base packages. Since setuptools probably introduced the idea of namespace-sharing packages to many authors in the first place, such a list is even less appropriate to use as a sample base.

That said, I don't think such statistics provide any useful information to decide on the namespace import strategy standard for Python, which is the subject of the PEP. They just show that one helper-based mechanism is used more than others, and that's simply a consequence of there not being a standard built-in way of using namespace packages in Python.

Whether base packages are useful or not is really a side aspect of the PEP and my proposal. I'm more after a method that doesn't add more .pkg file cruft to Python's import mechanism. Those .pth files were originally meant to help older Python "packages" (think the early PIL or Numeric extensions) to integrate nicely into the new scheme without having to fully support dotted package names right from the start.

-- Marc-Andre Lemburg, eGenix.com
At 10:20 PM 4/15/2009 +0200, M.-A. Lemburg wrote:
Whether base packages are useful or not is really a side aspect of the PEP and my proposal.
It's not whether they're useful, it's whether they're required. Your proposal *requires* base packages, and for people who intend to use pure packages, this is NOT a feature: it's a bug. Specifically, it introduces a large number of unnecessary, boilerplate dependencies to their package distribution strategy.
M.-A. Lemburg writes:
Hmm, setuptools doesn't support the notion of base packages, ie. packages that provide their own __init__.py module, so I fail to see how your list or any other list of setuptools-depend packages can be taken as indicator for anything related to base packages.
AFAICS the only things PJE has said about base packages are that (a) they aren't a universal use case for namespace packages, and (b) he'd like to be able to support them in setuptools, but admits that at present they aren't supported.

Your arguments against the PEP supporting namespace packages as currently supported by setuptools seem purely theoretical to me, while he's defending an actual and common use case. "Although practicality beats purity."

I think that for this PEP it's more important to unify the various use cases for namespace packages than it is to get rid of the .pth files.
At 09:59 AM 4/16/2009 +0900, Stephen J. Turnbull wrote:
I think that for this PEP it's more important to unify the various use cases for namespace packages than it is to get rid of the .pth files.
Actually, Martin's proposal *does* get rid of the .pth files in site-packages, and replaces them with other files inside the individual packages. (Thereby speeding startup times when many namespace packages are present but only a few are used.)

So Martin's proposal is a win for performance and even for decreasing clutter. (The same number of special files will be present, but they will be moved inside the namespace package directories instead of being in the parent directory.)
AFAICS the only things PJE has said about base packages are that
(a) they aren't a universal use case for namespace packages, and (b) he'd like to be able to support them in setuptools, but admits that at present they aren't supported.
...and that Martin's proposal would actually permit me to do so, whereas MAL's proposal would not. Replacing __init__.py with a __pkg__.py wouldn't change any of the tradeoffs for how setuptools handles namespace packages, except to add an extra variable to consider (i.e., two filenames to keep track of).
M.-A. Lemburg wrote:
""" If the package really requires adding one or more directories on sys.path (e.g. because it has not yet been structured to support dotted-name import), a "path configuration file" named package.pth can be placed in either the site-python or site-packages directory. ... A typical installation should have no or very few .pth files or something is wrong, and if you need to play with the search order, something is very wrong. """
I'll say! I think .pth files are absolute evil and I wish they could just be banned. +1 on anything that makes them closer to going away or reduces the possibility of yet another similar feature hurting the comprehensibility of a Python setup.

Chris
-- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
M.-A. Lemburg wrote:
The much more common use case is that of wanting to have a base package installation with optional add-ons that live in the same logical package namespace.
The PEP provides a way to solve this use case by giving both developers and users a standard they can follow, without having to rely on some non-standard helpers, across Python implementations.
My proposal tries to solve this without adding yet another .pth-file-like mechanism - hopefully in the spirit of the original Python package idea.
Okay, I need to issue a plea for a little help. I think I kinda get what this PEP is about now, and as someone who wants to ship a base package with several add-ons that live in the same logical package namespace, I'm very interested.

However, despite trying to follow this thread *and* having tried to read the PEP a couple of times, I still don't know how I'd go about doing this. I did give some examples from what I'd be looking to do much earlier. I'll ask again in the vague hope of you or someone else explaining things to me like I'm a 5 year old - something I'm mentally equipped to be well ;-)

In either of the proposals on the table, what code would I write and where to have a base package with a set of add-on packages? Simple examples would be greatly appreciated, and might bring things into focus for some of the less mentally able bystanders - like myself!

cheers,
Chris
-- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
P.J. Eby wrote:
At 06:15 PM 4/15/2009 +0200, M.-A. Lemburg wrote:
The much more common use case is that of wanting to have a base package installation with optional add-ons that live in the same logical package namespace.
Please see the large number of Zope and PEAK distributions on PyPI as minimal examples that disprove this being the common use case.
If you mean "the common use case as opposed to having code in the __init__.py of the namespace package", I think you'll find that's because people (especially me!) don't know how to do this, not because we don't want to!

Chris - who would actually like to know how to do this, with or without the PEP, and how to indicate interdependencies in situations like this to setuptools...
-- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
P.J. Eby wrote:
It's unclear, however, who is using base packages besides mx.* and ll.*, although I'd guess from the PyPI listings that perhaps Django is. (It seems that "base" packages are more likely to use a 'base-extension' naming pattern, vs. the 'namespace.project' pattern used by "pure" packages.)
I'll stress it again in case you missed it the first time: I think the main reason people use "pure namespace" versus "base namespace" packages is because hardly anyone knows how to do the latter, not because there is no desire to do so!

I, for one, have been trying to figure out how to do "base namespace" packages for years...

Chris
-- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
In either of the proposals on the table, what code would I write and where to have a base package with a set of add-on packages?
I don't quite understand the question. Why would you want to write code (except for the code that actually is in the packages)? PEP 382 is completely declarative - no need to write code.

Regards, Martin
Martin v. Löwis wrote:
In either of the proposals on the table, what code would I write and where to have a base package with a set of add-on packages?
I don't quite understand the question. Why would you want to write code (except for the code that actually is in the packages)?
PEP 382 is completely declarative - no need to write code.
"code" is anything I need to write to make this work... So, what do I need to do? Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
In either of the proposals on the table, what code would I write and where to have a base package with a set of add-on packages?
I don't quite understand the question. Why would you want to write code (except for the code that actually is in the packages)?
PEP 382 is completely declarative - no need to write code.
"code" is anything I need to write to make this work...
So, what do I need to do?
Ok, so create three tar files:

1. base.tar, containing

   simplistix/
   simplistix/__init__.py

2. addon1.tar, containing

   simplistix/addon1.pth (containing a single "*")
   simplistix/feature1.py

3. addon2.tar, containing

   simplistix/addon2.pth
   simplistix/feature2.py

Unpack each of them anywhere on sys.path, in any order.
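Assuming PEP 382 semantics (a hypothetical illustration, since the PEP is not implemented yet), the three portions then behave as a single logical package:

   import simplistix           # runs the base portion's __init__.py
   import simplistix.feature1  # found in addon1's directory
   import simplistix.feature2  # found in addon2's directory

Regards, Martin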
It's unclear, however, who is using base packages besides mx.* and ll.*, although I'd guess from the PyPI listings that perhaps Django is. (It seems that "base" packages are more likely to use a 'base-extension' naming pattern, vs. the 'namespace.project' pattern used by "pure" packages.)
I'll stress it again in case you missed it the first time: I think the main reason people use "pure namespace" versus "base namespace" packages is because hardly anyone knows how to do the latter, not because there is no desire to do so!
I, for one, have been trying to figure out how to do "base namespace" packages for years...
You mean, without PEP 382? That won't be possible, unless you can coordinate all addon packages. Base packages are a feature solely of PEP 382. Regards, Martin
At 05:35 PM 5/1/2009 +0100, Chris Withers wrote:
P.J. Eby wrote:
It's unclear, however, who is using base packages besides mx.* and ll.*, although I'd guess from the PyPI listings that perhaps Django is. (It seems that "base" packages are more likely to use a 'base-extension' naming pattern, vs. the 'namespace.project' pattern used by "pure" packages.)
I'll stress it again in case you missed it the first time: I think the main reason people use "pure namespace" versus "base namespace" packages is because hardly anyone knows how to do the latter, not because there is no desire to do so!
I didn't say there's *no* desire, however IIRC the only person who *ever* asked on distutils-sig how to do a base package with setuptools was the author of the ll.* packages. And in the case of at least the zope.*, peak.* and osaf.* namespace packages, it was specifically *not* the intention to have a base __init__.
At 07:41 PM 5/1/2009 +0200, Martin v. Löwis wrote:
It's unclear, however, who is using base packages besides mx.* and ll.*, although I'd guess from the PyPI listings that perhaps Django is. (It seems that "base" packages are more likely to use a 'base-extension' naming pattern, vs. the 'namespace.project' pattern used by "pure" packages.)
I'll stress it again in case you missed it the first time: I think the main reason people use "pure namespace" versus "base namespace" packages is because hardly anyone knows how to do the latter, not because there is no desire to do so!
I, for one, have been trying to figure out how to do "base namespace" packages for years...
You mean, without PEP 382?
That won't be possible, unless you can coordinate all addon packages. Base packages are a feature solely of PEP 382.
Actually, if you are using only the distutils, you can do this by listing only modules in the addon projects; this is how the ll.* tools are doing it. That only works if the packages are all being installed in the same directory, though, not as eggs.
Actually, if you are using only the distutils, you can do this by listing only modules in the addon projects; this is how the ll.* tools are doing it. That only works if the packages are all being installed in the same directory, though, not as eggs.
Right: if all portions install into the same directory, you can have base packages already. Regards, Martin
P.J. Eby wrote:
I didn't say there's *no* desire, however IIRC the only person who *ever* asked on distutils-sig how to do a base package with setuptools was the author of the ll.* packages.
I've asked before ;-) Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
Martin v. Löwis wrote:
I, for one, have been trying to figure out how to do "base namespace" packages for years...
You mean, without PEP 382?
That won't be possible, unless you can coordinate all addon packages. Base packages are a feature solely of PEP 382.
Marc-Andre has achieved this, I think, without the PEP, but I never really understood how :-S Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
Martin v. Löwis wrote:
Ok, so create three tar files:
1. base.tar, containing
simplistix/
simplistix/__init__.py
So this __init__.py can have code in it? And base.tar can have other modules and subpackages in it? What happens if the base and an addon both define a package called simplistix.somepackage?
2. addon1.tar, containing
simplistix/addon1.pth (containing a single "*")
What does that * mean? I thought .pth files just had python in them?
Unpack each of them anywhere on sys.path, in any order.
How would this work if base, addon1 and addon2 were eggs managed by buildout or setuptools? cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
Ok, so create three tar files:
1. base.tar, containing
simplistix/
simplistix/__init__.py
So this __init__.py can have code in it?
That's the point, yes.
And base.tar can have other modules and subpackages in it?
Certainly, yes.
What happens if the base and an addon both define a package called simplistix.somepackage?
Depends on whether simplistix.somepackage is a namespace package (it should be). If so, they get merged just like any other namespace package.
2. addon1.tar, containing
simplistix/addon1.pth (containing a single "*")
What does that * mean?
See PEP 382 (search for "*").
I thought .pth files just had python in them?
Not at all - they never did. They have paths in them.
Unpack each of them anywhere on sys.path, in any order.
How would this work if base, addon1 and addon2 were eggs managed by buildout or setuptools?
What is a managed egg (i.e. what kind of management does buildout or setuptools apply to it)? Regards, Martin
-On [20090501 20:59], "Martin v. Löwis" (martin@v.loewis.de) wrote:
Right: if all portions install into the same directory, you can have base packages already.
Speaking as a user of packages, this use case is one I hardly ever encounter with the Python software/modules/packages I use. The only ones that spring to mind are the mx.* and ll.* packages. The rest simply create their own namespace as <package>.*, but there's nothing that uses that same namespace and installs separately from the base package that I know of. -- Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai イェルーン ラウフロック ヴァン デル ウェルヴェン http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B Knowledge was inherent in all things. The world was a library...
Right: if all portions install into the same directory, you can have base packages already.
Speaking as a user of packages, this use case is one I hardly ever encounter with the Python software/modules/packages I use. The only ones that spring to mind are the mx.* and ll.* packages. The rest simply create their own namespace as <package>.*, but there's nothing that uses that same namespace and installs separately from the base package that I know of.
There are a few others, though: zope.*, repoze.*, redturtle.*, iw.*, plone.*, pycopia.*, p4a.*, plonehrm.*, plonetheme.*, pbp.*, lovely.*, xm.*, paste.*, Products.*, buildout.*, five.*, silva.*, tl.*, tw.*, themerubber.*, themetweaker.*, zc.*, z3c.*, zgeo.*, z3ext.*, etc. Regards, Martin
-On [20090509 13:40], "Martin v. Löwis" (martin@v.loewis.de) wrote:
There are a few others, though: zope.*, repoze.*, redturtle.*, iw.*, plone.*, pycopia.*, p4a.*, plonehrm.*, plonetheme.*, pbp.*, lovely.*, xm.*, paste.*, Products.*, buildout.*, five.*, silva.*, tl.*, tw.*, themerubber.*, themetweaker.*, zc.*, z3c.*, zgeo.*, z3ext.*, etc.
Can it be fairly said, though, that the majority of those you just named are related to Zope? That would explain why I wouldn't know of them, as I avoid Zope like the plague. -- Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai イェルーン ラウフロック ヴァン デル ウェルヴェン http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B Hope is a letter that never arrives, delivered by the postman of my fear...
.pth files are why I can't easily use GNU stow with easy_install. If installing a Python package involved writing new files into the filesystem, but did not require reading, updating, and re-writing any extant files such as .pth files, then GNU stow would Just Work with easy_install the way it Just Works with most things. Regards, Zooko
Jeroen Ruigrok van der Werven wrote:
-On [20090509 13:40], "Martin v. Löwis" (martin@v.loewis.de) wrote:
There are a few others, though: zope.*, repoze.*, redturtle.*, iw.*, plone.*, pycopia.*, p4a.*, plonehrm.*, plonetheme.*, pbp.*, lovely.*, xm.*, paste.*, Products.*, buildout.*, five.*, silva.*, tl.*, tw.*, themerubber.*, themetweaker.*, zc.*, z3c.*, zgeo.*, z3ext.*, etc.
Can it be fairly said, though, that the majority of those you just named are related to Zope?
They're also all pure namespace packages rather than base + addons, which is what we've been discussing...
That would explain why I wouldn't know of them, as I avoid Zope like the plague.
More fool you... Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
Martin v. Löwis wrote:
So this __init__.py can have code in it?
That's the point, yes.
And base.tar can have other modules and subpackages in it?
Certainly, yes.
Great, when is the PEP due to land in 2.x? ;-)
What happens if the base and an addon both define a package called simplistix.somepackage?
Depends on whether simplistix.somepackage is a namespace package (it should be). If so, they get merged just like any other namespace package.
Sorry, I was looking at potential bug cases here. What happens if it's not a namespace package?
See PEP 382 (search for "*").
I thought .pth files just had python in them?
Not at all - they never did. They have paths in them.
I've certainly seen them with python in, and that's what I hate about them...
Unpack each of them anywhere on sys.path, in any order. How would this work if base, addon1 and addon2 were eggs managed by buildout or setuptools?
What is a managed egg (i.e. what kind of management does buildout or setuptools apply to it)?
Sorry, bad wording on my part... I guess I meant more how would buildout/setuptools go about installing/uninstalling/etc packages that conform to PEP 382? Would setuptools/buildout need modification or would the changes take effect lower down in the stack? cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
-On [20090509 16:07], Chris Withers (chris@simplistix.co.uk) wrote:
They're also all pure namespace packages rather than base + addons, which is what we've been discussing...
But from Martin's email I understood it more as being base packages. Unless I misunderstood, of course. If correct, which is it?
More fool you...
Maybe. I've used and worked with it and don't care for it one iota. But that's a whole different discussion. -- Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai イェルーン ラウフロック ヴァン デル ウェルヴェン http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B Naritai jibun wo surikaetemo egao wa itsudemo suteki desuka...
Zooko O'Whielacronx wrote:
.pth files are why I can't easily use GNU stow with easy_install. If installing a Python package involved writing new files into the filesystem, but did not require reading, updating, and re-writing any extant files such as .pth files, then GNU stow would Just Work with easy_install the way it Just Works with most things.
Please understand that this is the fault of easy_install, not of .pth files. There is no technical need for easy_install to rewrite .pth files on installation. It could just as well have created new .pth files, rather than modifying existing ones. If you always use --single-version-externally-managed with easy_install, it will stop editing .pth files on installation. Regards, Martin
Chris Withers wrote:
Martin v. Löwis wrote:
So this __init__.py can have code in it?
That's the point, yes.
And base.tar can have other modules and subpackages in it?
Certainly, yes.
Great, when is the PEP due to land in 2.x? ;-)
Most likely, never - it probably will be implemented only after the last feature release of 2.x has been made.
What happens if the base and an addon both define a package called simplistix.somepackage?
Depends on whether simplistix.somepackage is a namespace package (it should be). If so, they get merged just like any other namespace package.
Sorry, I was looking at potential bug cases here. What happens if it's not a namespace package?
Then it will be imported as a regular child package.
Unpack each of them anywhere on sys.path, in any order. How would this work if base, addon1 and addon2 were eggs managed by buildout or setuptools?
What is a managed egg (i.e. what kind of management does buildout or setuptools apply to it)?
Sorry, bad wording on my part... I guess I meant more how would buildout/setuptools go about installing/uninstalling/etc packages that conform to PEP 382? Would setuptools/buildout need modification or would the changes take effect lower down in the stack?
Unfortunately, I don't know precisely what they do, so I don't know whether any of it needs modification. All I can say is that if they want to install namespace packages using the mechanism of PEP 382, they will have to produce the file layout specified in the PEP. For distutils (which is the only library in that area that I do know), I think just installing any .pth files inside a package would be sufficient. Regards, Martin
Jeroen Ruigrok van der Werven wrote:
-On [20090509 16:07], Chris Withers (chris@simplistix.co.uk) wrote:
They're also all pure namespace packages rather than base + addons, which is what we've been discussing...
But from Martin's email I understood it more as being base packages. Unless I misunderstood, of course.
If correct, which is it?
The list I gave you was a list of distributions that include namespace packages (using the setuptools mechanism). I don't think that any of them has the notion of a base package, as the setuptools mechanism doesn't support base packages. Regards, Martin
At 04:18 PM 5/9/2009 +0200, Martin v. Löwis wrote:
Zooko O'Whielacronx wrote:
.pth files are why I can't easily use GNU stow with easy_install. If installing a Python package involved writing new files into the filesystem, but did not require reading, updating, and re-writing any extant files such as .pth files, then GNU stow would Just Work with easy_install the way it Just Works with most things.
Please understand that this is the fault of easy_install, not of .pth files. There is no technical need for easy_install to rewrite .pth files on installation. It could just as well have created new .pth files, rather than modifying existing ones.
If you always use --single-version-externally-managed with easy_install, it will stop editing .pth files on installation.
It's --multi-version (-m) that does that. --single-version-externally-managed is a "setup.py install" option. Both have the effect of not editing .pth files, but they do so in different ways. The "setup.py install" option causes it to install in a distutils-compatible layout, whereas --multi-version simply drops .egg files or directories in the target location and leaves it to the user (or the generated script wrappers) to add them to sys.path.
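To make the distinction concrete, the two invocations look roughly like this (the project and file names here are hypothetical):

   easy_install -m SomeProject
       # drops a SomeProject-X.Y-pyA.B.egg into the target directory; no .pth edits

   python setup.py install --single-version-externally-managed --record=files.txt
       # installs a distutils-compatible layout; also leaves .pth files alone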
If you always use --single-version-externally-managed with easy_install, it will stop editing .pth files on installation.
It's --multi-version (-m) that does that. --single-version-externally-managed is a "setup.py install" option.
Both have the effect of not editing .pth files, but they do so in different ways. The "setup.py install" option causes it to install in a distutils-compatible layout, whereas --multi-version simply drops .egg files or directories in the target location and leaves it to the user (or the generated script wrappers) to add them to sys.path.
Ah, ok. Is there also an easy_install invocation that unpacks the zip file into some location of sys.path (which then wouldn't require editing sys.path)? Regards, Martin
At 04:42 PM 5/9/2009 +0200, Martin v. Löwis wrote:
If you always use --single-version-externally-managed with easy_install, it will stop editing .pth files on installation.
It's --multi-version (-m) that does that. --single-version-externally-managed is a "setup.py install" option.
Both have the effect of not editing .pth files, but they do so in different ways. The "setup.py install" option causes it to install in a distutils-compatible layout, whereas --multi-version simply drops .egg files or directories in the target location and leaves it to the user (or the generated script wrappers) to add them to sys.path.
Ah, ok. Is there also an easy_install invocation that unpacks the zip file into some location of sys.path (which then wouldn't require editing sys.path)?
Not as yet. I'm sort of waiting to see what comes out of PEP 376 discussions re: an installation manifest... but then, if I actually had time to work on it right now, I'd probably just implement something. Currently, you can use pip to do that, though, as long as the packages you want are in source form. pip doesn't unzip eggs as yet. It would be really straightforward, though, for someone to implement an easy_install variant that does this. Just invoke "easy_install -Zmaxd /some/tmpdir packagelist" to get a full set of unpacked .egg directories in /some/tmpdir, and then move the contents of the resulting .egg subdirs to the target location, renaming EGG-INFO subdirs to projectname-version.egg-info subdirs. (Of course, this ignores the issue of uninstalling previous versions, or overwriting of conflicting files in the target -- does pip handle these?)
2009/5/9 Chris Withers <chris@simplistix.co.uk>:
Martin v. Löwis wrote:
I thought .pth files just had python in them?
Not at all - they never did. They have paths in them.
I've certainly seen them with python in, and that's what I hate about them...
AIUI, there is a small special case: lines starting with "import" are executed (see the source of site.py for details). This exception has been exploited (some would say "abused", but I'm trying to be unbiased here) by setuptools, at least, to do path manipulations and such. PEP 382 does not provide the import exception: "Unlike .pth files on the top level, lines starting with "import" are not supported in per-package .pth files". It's not clear to me what impact this would have on setuptools (probably none, as top-level .pth files aren't changed).
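For illustration, a small .pth file showing both kinds of lines site.py understands (the paths are made up; "#" lines are skipped, plain path lines are appended to sys.path, and "import" lines are executed):

   # example.pth
   /opt/mylibs/foo
   import sys; sys.path.insert(0, '/opt/mylibs/override')

Paul.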
On May 9, 2009, at 9:39 AM, P.J. Eby wrote:
It would be really straightforward, though, for someone to implement an easy_install variant that does this. Just invoke "easy_install -Zmaxd /some/tmpdir packagelist" to get a full set of unpacked .egg directories in /some/tmpdir, and then move the contents of the resulting .egg subdirs to the target location, renaming EGG-INFO subdirs to projectname-version.egg-info subdirs.
Except for the renaming part, this is exactly what GNU stow does.
(Of course, this ignores the issue of uninstalling previous versions, or overwriting of conflicting files in the target -- does pip handle these?)
GNU stow does handle these issues. Regards, Zooko
On May 10, 2009, at 11:18 AM, Martin v. Löwis wrote:
If GNU stow solves all your problems, why do you want to use easy_install in the first place?
That's a good question. The answer is that there are two separate jobs: building executables and putting them in a directory structure of the appropriate shape for your system is one job, and installing or uninstalling that tree into your system is another. GNU stow does only the latter.

The input to GNU stow is a set of executables, library files, etc., in a directory tree that is of the right shape for your system. For example, if you are on a Linux system, then your scripts all need to be in $prefix/bin/, your shared libs should be in $prefix/lib, your Python packages ought to be in $prefix/lib/python$x.$y/site-packages/, etc. GNU stow is blissfully ignorant about all issues of building binaries, choosing where to place files, etc. -- that's the job of the build system of the package, e.g. the "./configure --prefix=foo && make && make install" for most C packages, or the "python ./setup.py install --prefix=foo" for Python packages using distutils (footnote 1).

Once GNU stow has the well-shaped directory which is the output of the build process, it follows a very dumb, completely reversible (uninstallable) process of symlinking those files into the system directory structure. It is a beautiful, elegant hack because it is sooo dumb. It is also very nice to use the same tool to manage packages written in any programming language, provided only that they can build a directory tree of the right shape and content.

However, there are lots of things that it doesn't do, such as automatically acquiring and building dependencies, or producing executables for the target platform for each of your console scripts. Not to mention creating a directory named "$prefix/lib/python$x.$y/site-packages" and cp'ing your Python files into it. That's why you still need a build system even if you use GNU stow for an install-and-uninstall system.

The thing that prevents this from working with setuptools is that setuptools creates a file named easy_install.pth during the "python ./setup.py install --prefix=foo". If you build two different Python packages this way, they will each create an easy_install.pth file, and then when you ask GNU stow to link the two resulting packages into your system, it will say "You are asking me to install two different packages which both claim that they need to write a file named '/usr/local/lib/python2.5/site-packages/easy_install.pth'. I'm too dumb to deal with this conflict, so I give up.".

If I understand correctly, your (MvL's) suggestion that easy_install create a .pth file named "easy_install-$PACKAGE-$VERSION.pth" instead of "easy_install.pth" would indeed make it work with GNU stow.
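As a concrete sketch of that build-then-stow workflow (the package name and prefix are made up):

   python ./setup.py install --prefix=/usr/local/stow/mypkg-1.0
   cd /usr/local/stow
   stow mypkg-1.0      # symlinks the tree into /usr/local
   stow -D mypkg-1.0   # reverses it, i.e. uninstalls

Regards, Zooko

footnote 1: Aside from the .pth file issue, the other reason that setuptools doesn't work for this use while distutils does is that setuptools tries too hard to save you from making a mistake: maybe you don't know what you are doing if you ask it to install into a previously non-existent prefix dir "foo". This one is easier to fix: http://bugs.python.org/setuptools/issue54 # "be more like distutils with regard to --prefix=" .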
Zooko Wilcox-O'Hearn wrote:
On May 10, 2009, at 11:18 AM, Martin v. Löwis wrote:
If GNU stow solves all your problems, why do you want to use easy_install in the first place?
That's a good question. The answer is that there are two separate jobs: building executables and putting them in a directory structure of the appropriate shape for your system is one job, and installing or uninstalling that tree into your system is another. GNU stow does only the latter.
And so does easy_install - its job is *not* to build the executables and to put them in a directory structure. Instead, it's distutils/setuptools which has this job. The primary purpose of easy_install is to download the files from PyPI (IIUC).
The thing that prevents this from working with setuptools is that setuptools creates a file named easy_install.pth
It will stop doing that if you ask nicely. That's why I recommended earlier that you do ask it not to edit .pth files.
If I understand correctly, your (MvL's) suggestion that easy_install create a .pth file named "easy_install-$PACKAGE-$VERSION.pth" instead of "easy_install.pth" would indeed make it work with GNU stow.
My recommendation is that you use the already existing flag to setup.py install that stops it from editing .pth files. Regards, Martin
following-up to my own post to mention one very important reason why anyone cares: On Sun, May 10, 2009 at 12:04 PM, Zooko Wilcox-O'Hearn <zooko@zooko.com> wrote:
It is a beautiful, elegant hack because it is sooo dumb. It is also very nice to use the same tool to manage packages written in any programming language, provided only that they can build a directory tree of the right shape and content.
And, you are not relying on the author of the package that you are installing to avoid accidentally or maliciously screwing up your system. You're not even relying on the authors of the *build system* (e.g. the authors of distutils or easy_install). You are relying *only* on GNU stow to avoid accidentally or maliciously screwing up your system, and GNU stow is very dumb, so it is easy to understand what it is going to do and why that isn't going to irreversibly screw up your system. That is: you don't run the "build yourself and install into $prefix" step as root. This is an important consideration for a lot of people, who absolutely refuse on principle to ever run "sudo python ./setup.py" on a system that they care about unless they wrote the "setup.py" script themselves. (Likewise they refuse to run "sudo make install" on packages written in C.) Regards, Zooko
At 12:04 PM 5/10/2009 -0600, Zooko Wilcox-O'Hearn wrote:
The thing that prevents this from working with setuptools is that setuptools creates a file named easy_install.pth during the "python ./setup.py install --prefix=foo". If you build two different Python packages this way, they will each create an easy_install.pth file, and then when you ask GNU stow to link the two resulting packages into your system, it will say "You are asking me to install two different packages which both claim that they need to write a file named '/usr/local/lib/python2.5/site-packages/easy_install.pth'.
Adding --record and --single-version-externally-managed to that command line will prevent the .pth file from being used or needed, although I believe you already know this. (What that mode won't do is install dependencies automatically.)
On Sun, 10 May 2009 09:41:33 -0600, Zooko Wilcox-O'Hearn <zooko@zooko.com> wrote:
(Of course, this ignores the issue of uninstalling previous versions, or overwriting of conflicting files in the target -- does pip handle these?)
GNU stow does handle these issues.
I'm not sure GNU stow will handle the .pth files when uninstalling packages. In easy_install.pth there is a list of all the packages installed; this list really needs to be edited once a package is removed. The .pth files are a really good part of Python. Definitely nothing evil about them. David
Talking of stow, I take advantage of this thread to do some shameless advertising :) Recently I uploaded to PyPI a software of mine, BPT [1], which does the same symlinking trick as stow, but it is written in Python (and with a simple API) and, more importantly, with another trick it allows the relocation of the installation directory (it creates a semi-isolated environment, similar to virtualenv). I find it very convenient when I have to switch between several versions of the same packages (for example during development), or I have to deploy on the same machine software that needs different versions of the dependencies. I am planning to write an integration layer with buildout and easy_install. It should be very easy, since BPT can directly handle tarballs (and directories, in trunk) which contain a setup.py. HTH, Giuseppe [1] http://pypi.python.org/pypi/bpt P.S. I was not aware of stow, I'll add it to the references and see if there are any features that I can steal
At 04:42 PM 5/9/2009 +0200, Martin v. Löwis wrote:
If you always use --single-version-externally-managed with easy_install, it will stop editing .pth files on installation.
It's --multi-version (-m) that does that. --single-version-externally-managed is a "setup.py install" option.
Both have the effect of not editing .pth files, but they do so in different ways. The "setup.py install" option causes it to install in a distutils-compatible layout, whereas --multi-version simply drops .egg files or directories in the target location and leaves it to the user (or the generated script wrappers) to add them to sys.path.
Ah, ok. Is there also an easy_install invocation that unpacks the zip file into some location of sys.path (which then wouldn't require editing sys.path)?
No; you'd have to use the -e option to easy_install to download and extract a source version of the package, then run that package's setup.py, e.g.:

   easy_install -eb /some/tmpdir SomeProject
   cd /some/tmpdir/someproject  # subdir is always lowercased/normalized
   setup.py install --single-version-externally-managed --record=...

I suspect that this is basically what pip is doing under the hood, as that would explain why it doesn't support .egg files.

I previously posted code to the distutils-sig that was an .egg unpacker with appropriate renaming, though. It was untested, and assumes you already checked for collisions in the target directory, and that you're handling any uninstall manifest yourself. It could probably be modified to take a filter function, though, something like:

   import os
   from setuptools.archive_util import unpack_archive

   def flatten_egg(egg_filename, extract_dir, filter=lambda s, d: d):
       # e.g. "Foo-1.0-py2.6.egg" -> "Foo-1.0-py2.6.egg-info"
       eggbase = os.path.basename(egg_filename) + '-info'
       def file_filter(src, dst):
           if src.startswith('EGG-INFO/'):
               # relocate EGG-INFO/* to <eggbase>/*
               src = eggbase + src[8:]
           dst = os.path.join(extract_dir, *src.split('/'))
           return filter(src, dst)
       return unpack_archive(egg_filename, extract_dir, file_filter)

Then you could pass in a None-returning filter function to check and accumulate collisions and generate a manifest. A second run with the default filter would do the unpacking. (This function should work with either .egg files or .egg directories as input, btw, since unpack_archive treats a directory input as if it were an archive.)

Anyway, if you used "easy_install -mxd /some/tmpdir [specs]" to get your target eggs found/built, you could then run this flattening function (with appropriate filter functions) over the *.egg contents of /some/tmpdir to do the actual installation. (The reason for using -mxd instead of -Zmaxd or -zmaxd is that we don't care whether the eggs are zipped or not, and we leave out the -a so that dependencies already present on sys.path aren't copied or re-downloaded to the target; only dependencies we don't already have will get dropped in /some/tmpdir.)

Of course, the devil of this is in the details; to handle conflicts and uninstalls properly you would need to know what namespace packages were in the eggs you are installing. But if you don't care about blindly overwriting things (as the distutils does not), then it's actually pretty easy to make such an unpacker. I mainly haven't made one myself because I *do* care about things being blindly overwritten.
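A sketch of the two-pass use described above (the egg name and target directory are hypothetical, and this assumes the flatten_egg above):

   import os

   collisions = []
   manifest = []

   def check_filter(src, dst):
       # first pass: record what would be written, but skip actual extraction
       if os.path.exists(dst):
           collisions.append(dst)
       manifest.append(dst)
       return None  # returning None tells unpack_archive to skip the entry

   egg = 'SomeProject-1.0-py2.6.egg'
   target = '/usr/lib/python2.6/site-packages'
   flatten_egg(egg, target, check_filter)
   if not collisions:
       flatten_egg(egg, target)  # second pass: actually unpack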
At the last PyCon3 in Italy I presented a new Python implementation, which you'll find at http://code.google.com/p/wpython/

WPython is a re-implementation of (some parts of) Python, which drops support for bytecode in favour of a wordcode-based model (where a word is 16 bits wide). It also implements a hybrid stack-register virtual machine, and adds a lot of other optimizations.

The slides are available in the download area; they explain the concept of wordcode and also show how some optimizations work, comparing them with the current Python (2.6.1).

Unfortunately I had no time to make extensive benchmarks with real code, so I've included some that I made with PyStone, PyBench, and a couple of simple recursive function calls (Fibonacci and Factorial).

This is the first release, and another two are scheduled. The first will make it possible to select (almost) any optimization to be compiled, so fine-grained tests will be possible. The second will be a rewrite of the constant folding code (specifically for tuples, lists and dicts), removing a current "hack" to the Python type system that makes them "hashable" for the constants dictionary used by compile.c.

Then I'll start writing some documentation that will explain which parts of the code relate to each specific optimization, so that it'll be easier to create patches for other Python implementations, if needed.

You'll find a bit more information in the "README FIRST!" file in the project's repository. I've made many changes to the source of Python 2.6.1, so feel free to ask me for any information about them.
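To give an idea of the encoding, here is a minimal sketch of the general wordcode scheme (an illustration only, not necessarily wpython's exact layout):

   # A 16-bit word packs an 8-bit opcode with an 8-bit argument, so most
   # instructions fit in a single word instead of the one or three bytes
   # used by CPython's bytecode.
   def pack_word(opcode, arg=0):
       assert 0 <= opcode < 256 and 0 <= arg < 256
       return opcode | (arg << 8)

   def unpack_word(word):
       return word & 0xFF, word >> 8

Cheers Cesare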
Hi,
WPython is a re-implementation of (some parts of) Python, which drops support for bytecode in favour of a wordcode-based model (where a word is 16 bits wide).
This is great! Have you planned to port it to the py3k branch? Or, at least, to trunk? Some opcode and VM optimizations have gone in after 2.6 was released, although nothing as invasive as what you did. About the CISC-y instructions, have you tried merging the fast and const arrays in frame objects? That way, you need less opcode space (since e.g. BINARY_ADD_FAST_FAST would cater for constants as well as local variables). Regards Antoine.
Hi Cesare, On Mon, May 11, 2009 at 11:00 AM, Cesare Di Mauro <cesare.dimauro@a-tono.com> wrote:
At the last PyCon3 at Italy I've presented a new Python implementation, which you'll find at http://code.google.com/p/wpython/
Good to see some more attention on Python performance! There's quite a bit going on in your changes; do you have an optimization-by-optimization breakdown, to give an idea about how much performance each optimization gives? Looking over the slides, I see that you still need to implement functionality to make test_trace pass, for example; do you have a notion of how much performance it will cost to implement the rest of Python's semantics in these areas? Also, I checked out wpython at head to run Unladen Swallow's benchmarks against it, but it refuses to compile with either gcc 4.0.1 or 4.3.1 on Linux (fails in Python/ast.c). I can send you the build failures off-list, if you're interested. Thanks, Collin Winter
On Mon, May 11, 2009 10:27PM, Antoine Pitrou wrote: Hi Antoine
Hi,
WPython is a re-implementation of (some parts of) Python, which drops support for bytecode in favour of a wordcode-based model (where a word is 16 bits wide).
This is great! Have you planned to port it to the py3k branch? Or, at least, to trunk?
It was my idea too, but first I need to take a deep look at what parts of the code changed from 2.6 to 3.0. That's because I don't know how much work is required for this "forward" port.
Some opcode and VM optimizations have gone in after 2.6 was released, although nothing as invasive as you did.
:-D Interesting.
About the CISC-y instructions, have you tried merging the fast and const arrays in frame objects? That way, you need less opcode space (since e.g. BINARY_ADD_FAST_FAST will cater with constants as well as local variables).
Regards
Antoine.
It's an excellent idea that needs exploration. Running my stats tools against all the .py files found in the Lib and Tools folders, I discovered that the maximum index used for fast locals is 79, and 1853 for constants. So if I find a way to map locals first and constants after them in the same array, your idea can be implemented, saving A LOT of opcodes and reducing the ceval.c source code. I'll work on that after the two releases I have planned. Thanks for your precious suggestions!
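As a minimal sketch of the merged addressing (illustrative Python, not the real frame layout):

   # Locals occupy indices 0..nlocals-1 and constants follow from nlocals on,
   # so a single operand field can address either kind of value.
   def load_value(locals_array, consts, nlocals, index):
       if index < nlocals:
           return locals_array[index]     # a local variable
       return consts[index - nlocals]     # a constant

Cesare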
Hi Collin On Mon, May 11, 2009 11:14PM, Collin Winter wrote:
Hi Cesare,
On Mon, May 11, 2009 at 11:00 AM, Cesare Di Mauro <cesare.dimauro@a-tono.com> wrote:
At the last PyCon3 at Italy I've presented a new Python implementation, which you'll find at http://code.google.com/p/wpython/
Good to see some more attention on Python performance! There's quite a bit going on in your changes; do you have an optimization-by-optimization breakdown, to give an idea about how much performance each optimization gives?
I've planned it for the next release, which may come next week. I'll introduce some #DEFINEs and #IFs in the code, so that only specific optimizations will be enabled.
Looking over the slides, I see that you still need to implement functionality to make test_trace pass, for example; do you have a notion of how much performance it will cost to implement the rest of Python's semantics in these areas?
Very little. That's because there are only two tests in test_trace that don't pass. I think the reason lies in the changes that I made to the loops. With my code, SETUP_LOOP and POP_BREAK are completely removed, so the code in settrace will fail to recognize the loop and the virtual machine crashes. I'll fix it in the second release that I have planned.
Also, I checked out wpython at head to run Unladen Swallow's benchmarks against it, but it refuses to compile with either gcc 4.0.1 or 4.3.1 on Linux (fails in Python/ast.c). I can send you the build failures off-list, if you're interested.
Thanks, Collin Winter
I'm very interested, thanks. That's because I worked only on Windows machines, so I definitely need to test and fix it to let it run on any other platform. Cesare
Hi Cesare, Cesare Di Mauro <cesare.dimauro <at> a-tono.com> writes:
It was my idea too, but first I need to take a deep look at what parts of the code changed from 2.6 to 3.0. That's because I don't know how much work is required for this "forward" port.
If you have some questions or need some help, send me a message. Regards Antoine.
On Thu, May 12, 2009 01:40PM, Antoine Pitrou wrote:
Hi Cesare,
Cesare Di Mauro <cesare.dimauro <at> a-tono.com> writes:
It was my idea too, but first I need to take a deep look at what parts of the code changed from 2.6 to 3.0. That's because I don't know how much work is required for this "forward" port.
If you have some questions or need some help, send me a message.
Regards
Antoine.
OK, thanks. :) Another note. Fredrik Johansson pointed out to me just a few minutes ago that I compiled my sources without PGO optimizations enabled. That's because I used Visual Studio Express Edition. So another gain in performance can be obtained. :) cheers Cesare
On Tue, May 12, 2009 at 4:45 AM, Cesare Di Mauro <cesare.dimauro@a-tono.com> wrote:
Another note. Fredrik Johansson pointed out to me just a few minutes ago that I compiled my sources without PGO optimizations enabled.
That's because I used Visual Studio Express Edition.
So another gain in performances can be obtained. :)
FWIW, Unladen Swallow experimented with gcc 4.4's FDO and got an additional 10-30% (depending on the benchmark). The training load is important, though: some training sets offered better performance than others. I'd be interested in how MSVC's PGO compares to gcc's FDO in terms of overall effectiveness. The results for gcc FDO with our 2009Q1 release are at the bottom of http://code.google.com/p/unladen-swallow/wiki/Releases. Collin Winter
On Tue, May 12, 2009 05:27 PM, Collin Winter wrote:
On Tue, May 12, 2009 at 4:45 AM, Cesare Di Mauro <cesare.dimauro@a-tono.com> wrote:
Another note. Fredrik Johansson pointed out to me just a few minutes ago that I compiled my sources without PGO optimizations enabled.
That's because I used Visual Studio Express Edition.
So another gain in performances can be obtained. :)
FWIW, Unladen Swallow experimented with gcc 4.4's FDO and got an additional 10-30% (depending on the benchmark). The training load is important, though: some training sets offered better performance than others. I'd be interested in how MSVC's PGO compares to gcc's FDO in terms of overall effectiveness. The results for gcc FDO with our 2009Q1 release are at the bottom of http://code.google.com/p/unladen-swallow/wiki/Releases.
Collin Winter
Unfortunately I can't test PGO, since I use the Express Editions of VS. Maybe Martin or other maintainers of the Windows versions can help here. However, it'll be difficult to find a good enough training profile for the binaries distributed for the official Python. FDO yields quite different results depending on the profile selected.
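For reference, gcc's FDO workflow boils down to a three-step cycle (a generic sketch with a made-up program, not Unladen Swallow's or CPython's actual build setup):

   gcc -fprofile-generate -o bench bench.c   # instrumented build
   ./bench < training-input                  # run the training workload; writes *.gcda profile data
   gcc -fprofile-use -o bench bench.c        # rebuild, optimizing with the recorded profile

cheers, Cesare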
Paul Moore wrote:
2009/5/9 Chris Withers <chris@simplistix.co.uk>:
Martin v. Löwis wrote:
I thought .pth files just had python in them?
Not at all - they never did. They have paths in them.
I've certainly seen them with python in, and that's what I hate about them...
AIUI, there is a small special case: lines starting with "import" are executed (see the source of site.py for details). This exception has been exploited (some would say "abused", but I'm trying to be unbiased here) by setuptools, at least, to do path manipulations and such.
Abused is definitely the right word. I suppose it's too late to correct this bug? How about for Python 3? cheers, Chris
On Tue, May 12, 2009 at 8:54 AM, Cesare Di Mauro <cesare.dimauro@a-tono.com> wrote:
Also, I checked out wpython at head to run Unladen Swallow's benchmarks against it, but it refuses to compile with either gcc 4.0.1 or 4.3.1 on Linux (fails in Python/ast.c). I can send you the build failures off-list, if you're interested.
Thanks, Collin Winter
I'm very interested, thanks. That's because I worked only on Windows machines, so I definitely need to test and fix it to let it run on any other platform.
Cesare
Re-animating an old discussion -- Cesare, any news on the wpython front?

I did a checkout from http://wpython.googlecode.com/svn/trunk and was able to ./configure and make successfully on my 64-bit Linux box as well as to run the Unladen benchmarks.

Given svn co http://svn.python.org/projects/python/tags/r261 in py261 and svn co http://wpython.googlecode.com/svn/trunk in wpy,

$ python unladen-tests/perf.py -rm --benchmarks=-2to3,all py261/python wpy/python

gives the following results:

Report on Linux foo 2.6.31-14-generic #48-Ubuntu SMP Fri Oct 16 14:05:01 UTC 2009 x86_64
Total CPU cores: 2

ai: Min: 0.640516 -> 0.586532: 9.20% faster Avg: 0.677346 -> 0.632785: 7.04% faster Significant (t=4.336740, a=0.95) Stddev: 0.05839 -> 0.08455: 30.94% larger
Mem max: 7412.000 -> 6768.000: 9.52% smaller Usage over time: http://tinyurl.com/ykwhmcc

call_simple: Min: 1.880816 -> 1.701622: 10.53% faster Avg: 1.944320 -> 1.778701: 9.31% faster Significant (t=14.323045, a=0.95) Stddev: 0.09885 -> 0.06000: 64.74% smaller
Mem max: 8100.000 -> 6636.000: 22.06% smaller Usage over time: http://tinyurl.com/yzsswgp

django: Min: 1.287158 -> 1.315700: 2.17% slower Avg: 1.330423 -> 1.366978: 2.67% slower Significant (t=-4.475769, a=0.95) Stddev: 0.05663 -> 0.05885: 3.78% larger
Mem max: 15508.000 -> 16228.000: 4.44% larger Usage over time: http://tinyurl.com/yfpbmjn

iterative_count: Min: 0.211620 -> 0.124646: 69.78% faster Avg: 0.222778 -> 0.159868: 39.35% faster Significant (t=9.291635, a=0.95) Stddev: 0.04239 -> 0.05279: 19.69% larger
Mem max: 7388.000 -> 6680.000: 10.60% smaller Usage over time: http://tinyurl.com/yj7s8h4

normal_startup: Min: 1.060017 -> 0.991366: 6.92% faster Avg: 1.189612 -> 1.170067: 1.67% faster Significant (t=2.002086, a=0.95) Stddev: 0.06942 -> 0.06864: 1.13% smaller
Mem max: 3252.000 -> 4648.000: 30.03% larger Usage over time: http://tinyurl.com/ygo3bwt

pickle: Min: 2.027566 -> 1.948784: 4.04% faster Avg: 2.051633 -> 2.043656: 0.39% faster Not significant Stddev: 0.03095 -> 0.07348: 57.88% larger
Mem max: 8544.000 -> 7340.000: 16.40% smaller Usage over time: http://tinyurl.com/ykg9dn2

pickle_dict: Min: 1.658693 -> 1.656844: 0.11% faster Avg: 1.689483 -> 1.698176: 0.51% slower Not significant Stddev: 0.16945 -> 0.09403: 80.20% smaller
Mem max: 6716.000 -> 7636.000: 12.05% larger Usage over time: http://tinyurl.com/yjhyame

pickle_list: Min: 0.919083 -> 0.894758: 2.72% faster Avg: 0.956513 -> 0.921314: 3.82% faster Significant (t=2.131237, a=0.95) Stddev: 0.12744 -> 0.10506: 21.31% smaller
Mem max: 6804.000 -> 8792.000: 22.61% larger Usage over time: http://tinyurl.com/ylc3ezf

pybench: Min: 58781 -> 50836: 15.63% faster Avg: 60009 -> 51788: 15.87% faster

regex_compile: Min: 0.934131 -> 0.862323: 8.33% faster Avg: 0.962159 -> 0.884848: 8.74% faster Significant (t=13.587168, a=0.95) Stddev: 0.04685 -> 0.03229: 45.11% smaller
Mem max: 12584.000 -> 12740.000: 1.22% larger Usage over time: http://tinyurl.com/yjngu8j

regex_effbot: Min: 0.130686 -> 0.122483: 6.70% faster Avg: 0.143453 -> 0.138078: 3.89% faster Not significant Stddev: 0.01864 -> 0.03177: 41.32% larger
Mem max: 7652.000 -> 6660.000: 14.89% smaller Usage over time: http://tinyurl.com/ykcgntf

regex_v8: Min: 0.135130 -> 0.150092: 9.97% slower Avg: 0.138027 -> 0.177309: 22.15% slower Significant (t=-8.197595, a=0.95) Stddev: 0.00258 -> 0.04785: 94.60% larger
Mem max: 11124.000 -> 12236.000: 9.09% larger Usage over time: http://tinyurl.com/ykb5vzu

rietveld: Min: 0.848245 -> 0.816473: 3.89% faster Avg: 1.033925 -> 1.019889: 1.38% faster Not significant Stddev: 0.11242 -> 0.13006: 13.56% larger
Mem max: 23792.000 -> 24548.000: 3.08% larger Usage over time: http://tinyurl.com/yhdvz5v

slowpickle: Min: 0.876736 -> 0.800203: 9.56% faster Avg: 0.932808 -> 0.870577: 7.15% faster Significant (t=5.020426, a=0.95) Stddev: 0.05600 -> 0.11059: 49.36% larger
Mem max: 7200.000 -> 7276.000: 1.04% larger Usage over time: http://tinyurl.com/ykt2brq

slowspitfire: Min: 1.029100 -> 0.948458: 8.50% faster Avg: 1.062486 -> 1.020777: 4.09% faster Significant (t=4.581669, a=0.95) Stddev: 0.05441 -> 0.07298: 25.44% larger
Mem max: 139792.000 -> 129264.000: 8.14% smaller Usage over time: http://tinyurl.com/yh7vmlh

slowunpickle: Min: 0.411744 -> 0.356784: 15.40% faster Avg: 0.444638 -> 0.393261: 13.06% faster Significant (t=7.009269, a=0.95) Stddev: 0.04147 -> 0.06044: 31.38% larger
Mem max: 7132.000 -> 7848.000: 9.12% larger Usage over time: http://tinyurl.com/yfwvz3g

startup_nosite: Min: 0.664456 -> 0.598770: 10.97% faster Avg: 0.933034 -> 0.761228: 22.57% faster Significant (t=20.660776, a=0.95) Stddev: 0.09645 -> 0.06728: 43.37% smaller
Mem max: 1940.000 -> 1940.000: -0.00% smaller Usage over time: http://tinyurl.com/yzzxcmd

threaded_count: Min: 0.220059 -> 0.138708: 58.65% faster Avg: 0.232347 -> 0.156120: 48.83% faster Significant (t=23.804797, a=0.95) Stddev: 0.01889 -> 0.02586: 26.96% larger
Mem max: 6460.000 -> 7664.000: 15.71% larger Usage over time: http://tinyurl.com/yzm3awu

unpack_sequence: Min: 0.000129 -> 0.000120: 7.57% faster Avg: 0.000218 -> 0.000194: 12.14% faster Significant (t=3.946194, a=0.95) Stddev: 0.00139 -> 0.00128: 8.13% smaller
Mem max: 18948.000 -> 19056.000: 0.57% larger Usage over time: http://tinyurl.com/yf8es3f

unpickle: Min: 1.191468 -> 1.206198: 1.22% slower Avg: 1.248471 -> 1.281957: 2.61% slower Significant (t=-2.658526, a=0.95) Stddev: 0.05513 -> 0.11325: 51.32% larger
Mem max: 7776.000 -> 8676.000: 10.37% larger Usage over time: http://tinyurl.com/yz96gw2

unpickle_list: Min: 0.922200 -> 0.861167: 7.09% faster Avg: 0.955964 -> 0.976829: 2.14% slower Not significant Stddev: 0.04374 -> 0.21061: 79.23% larger
Mem max: 6820.000 -> 8324.000: 18.07% larger Usage over time: http://tinyurl.com/yjbraxg

---

The diff between the two trees is at http://dpaste.org/RpIv/

Best,
Mart Sõmermaa
Hi Mart

I had some problems and little time to dedicate to wpython lately, but I started working on it again in the last month. Currently I'm working on changing and documenting the code so that almost every optimization can be selected, so you'll be able to compile it with only the ones you are interested in.

I've also investigated some ideas Antoine suggested about grouping FASTs and CONSTs together in order to reduce the number of opcodes, but I've found that the suggested solution causes some problems with the current function call implementation that can hurt performance in some situations (mostly with recursive calls, because they usually need to create new frames, and constant references must be copied and INCREFed). Since it would require huge changes to the current code base, I don't know if it's worth the effort just to verify the idea. I'll think about it when the project is "finalized".

My plan is to finish the current work in a few days, and then remove the (maybe ugly) hacks that I made to the Python object model that were needed to let tuples, lists and dictionaries be loaded as CONSTs. Maybe at the end of the month it'll be done (and the diffs against CPython will be reduced a lot, since only a few files will remain changed).

Next, I need to change the trace code (in frameobject.c) to make test_trace.py pass (at this time two tests are disabled because the VM crashes). Finally, I plan to update the code base to 2.6.4.

I expect to release everything at the end of the year, but if someone is interested I can do a partial release at the end of November.

Regarding your tests, they are very interesting, particularly regex_v8, which showed an unexpected result for me. I'll investigate after I release wpython.

If you have any questions, I'm at your disposal (thanks for your tests!)

Cesare

2009/11/4 Mart Sõmermaa <mrts.pydev@gmail.com>
On Tue, May 12, 2009 at 8:54 AM, Cesare Di Mauro <cesare.dimauro@a-tono.com> wrote:
Also, I checked out wpython at head to run Unladen Swallow's benchmarks against it, but it refuses to compile with either gcc 4.0.1 or 4.3.1 on Linux (fails in Python/ast.c). I can send you the build failures off-list, if you're interested.
Thanks, Collin Winter
I'm very interested, thanks. That's because I worked only on Windows machines, so I definitely need to test and fix it to let it run on any other platform.
Cesare
Re-animating an old discussion -- Cesare, any news on the wpython front?
I did a checkout from http://wpython.googlecode.com/svn/trunk and was able to ./configure and make successfully on my 64-bit Linux box as well as to run the Unladen benchmarks.
Given svn co http://svn.python.org/projects/python/tags/r261 in py261 and svn co http://wpython.googlecode.com/svn/trunk in wpy,
$ python unladen-tests/perf.py -rm --benchmarks=-2to3,all py261/python wpy/python
gives the following results:
Report on Linux foo 2.6.31-14-generic #48-Ubuntu SMP Fri Oct 16 14:05:01 UTC 2009 x86_64 Total CPU cores: 2
ai: Min: 0.640516 -> 0.586532: 9.20% faster Avg: 0.677346 -> 0.632785: 7.04% faster Significant (t=4.336740, a=0.95) Stddev: 0.05839 -> 0.08455: 30.94% larger
Mem max: 7412.000 -> 6768.000: 9.52% smaller Usage over time: http://tinyurl.com/ykwhmcc
call_simple: Min: 1.880816 -> 1.701622: 10.53% faster Avg: 1.944320 -> 1.778701: 9.31% faster Significant (t=14.323045, a=0.95) Stddev: 0.09885 -> 0.06000: 64.74% smaller
Mem max: 8100.000 -> 6636.000: 22.06% smaller Usage over time: http://tinyurl.com/yzsswgp
django: Min: 1.287158 -> 1.315700: 2.17% slower Avg: 1.330423 -> 1.366978: 2.67% slower Significant (t=-4.475769, a=0.95) Stddev: 0.05663 -> 0.05885: 3.78% larger
Mem max: 15508.000 -> 16228.000: 4.44% larger Usage over time: http://tinyurl.com/yfpbmjn
iterative_count: Min: 0.211620 -> 0.124646: 69.78% faster Avg: 0.222778 -> 0.159868: 39.35% faster Significant (t=9.291635, a=0.95) Stddev: 0.04239 -> 0.05279: 19.69% larger
Mem max: 7388.000 -> 6680.000: 10.60% smaller Usage over time: http://tinyurl.com/yj7s8h4
normal_startup: Min: 1.060017 -> 0.991366: 6.92% faster Avg: 1.189612 -> 1.170067: 1.67% faster Significant (t=2.002086, a=0.95) Stddev: 0.06942 -> 0.06864: 1.13% smaller
Mem max: 3252.000 -> 4648.000: 30.03% larger Usage over time: http://tinyurl.com/ygo3bwt
pickle: Min: 2.027566 -> 1.948784: 4.04% faster Avg: 2.051633 -> 2.043656: 0.39% faster Not significant Stddev: 0.03095 -> 0.07348: 57.88% larger
Mem max: 8544.000 -> 7340.000: 16.40% smaller Usage over time: http://tinyurl.com/ykg9dn2
pickle_dict: Min: 1.658693 -> 1.656844: 0.11% faster Avg: 1.689483 -> 1.698176: 0.51% slower Not significant Stddev: 0.16945 -> 0.09403: 80.20% smaller
Mem max: 6716.000 -> 7636.000: 12.05% larger Usage over time: http://tinyurl.com/yjhyame
pickle_list: Min: 0.919083 -> 0.894758: 2.72% faster Avg: 0.956513 -> 0.921314: 3.82% faster Significant (t=2.131237, a=0.95) Stddev: 0.12744 -> 0.10506: 21.31% smaller
Mem max: 6804.000 -> 8792.000: 22.61% larger Usage over time: http://tinyurl.com/ylc3ezf
pybench: Min: 58781 -> 50836: 15.63% faster Avg: 60009 -> 51788: 15.87% faster
regex_compile: Min: 0.934131 -> 0.862323: 8.33% faster Avg: 0.962159 -> 0.884848: 8.74% faster Significant (t=13.587168, a=0.95) Stddev: 0.04685 -> 0.03229: 45.11% smaller
Mem max: 12584.000 -> 12740.000: 1.22% larger Usage over time: http://tinyurl.com/yjngu8j
regex_effbot: Min: 0.130686 -> 0.122483: 6.70% faster Avg: 0.143453 -> 0.138078: 3.89% faster Not significant Stddev: 0.01864 -> 0.03177: 41.32% larger
Mem max: 7652.000 -> 6660.000: 14.89% smaller Usage over time: http://tinyurl.com/ykcgntf
regex_v8: Min: 0.135130 -> 0.150092: 9.97% slower Avg: 0.138027 -> 0.177309: 22.15% slower Significant (t=-8.197595, a=0.95) Stddev: 0.00258 -> 0.04785: 94.60% larger
Mem max: 11124.000 -> 12236.000: 9.09% larger Usage over time: http://tinyurl.com/ykb5vzu
rietveld: Min: 0.848245 -> 0.816473: 3.89% faster Avg: 1.033925 -> 1.019889: 1.38% faster Not significant Stddev: 0.11242 -> 0.13006: 13.56% larger
Mem max: 23792.000 -> 24548.000: 3.08% larger Usage over time: http://tinyurl.com/yhdvz5v
slowpickle: Min: 0.876736 -> 0.800203: 9.56% faster Avg: 0.932808 -> 0.870577: 7.15% faster Significant (t=5.020426, a=0.95) Stddev: 0.05600 -> 0.11059: 49.36% larger
Mem max: 7200.000 -> 7276.000: 1.04% larger Usage over time: http://tinyurl.com/ykt2brq
slowspitfire: Min: 1.029100 -> 0.948458: 8.50% faster Avg: 1.062486 -> 1.020777: 4.09% faster Significant (t=4.581669, a=0.95) Stddev: 0.05441 -> 0.07298: 25.44% larger
Mem max: 139792.000 -> 129264.000: 8.14% smaller Usage over time: http://tinyurl.com/yh7vmlh
slowunpickle: Min: 0.411744 -> 0.356784: 15.40% faster Avg: 0.444638 -> 0.393261: 13.06% faster Significant (t=7.009269, a=0.95) Stddev: 0.04147 -> 0.06044: 31.38% larger
Mem max: 7132.000 -> 7848.000: 9.12% larger Usage over time: http://tinyurl.com/yfwvz3g
startup_nosite: Min: 0.664456 -> 0.598770: 10.97% faster Avg: 0.933034 -> 0.761228: 22.57% faster Significant (t=20.660776, a=0.95) Stddev: 0.09645 -> 0.06728: 43.37% smaller
Mem max: 1940.000 -> 1940.000: -0.00% smaller Usage over time: http://tinyurl.com/yzzxcmd
threaded_count: Min: 0.220059 -> 0.138708: 58.65% faster Avg: 0.232347 -> 0.156120: 48.83% faster Significant (t=23.804797, a=0.95) Stddev: 0.01889 -> 0.02586: 26.96% larger
Mem max: 6460.000 -> 7664.000: 15.71% larger Usage over time: http://tinyurl.com/yzm3awu
unpack_sequence: Min: 0.000129 -> 0.000120: 7.57% faster Avg: 0.000218 -> 0.000194: 12.14% faster Significant (t=3.946194, a=0.95) Stddev: 0.00139 -> 0.00128: 8.13% smaller
Mem max: 18948.000 -> 19056.000: 0.57% larger Usage over time: http://tinyurl.com/yf8es3f
unpickle: Min: 1.191468 -> 1.206198: 1.22% slower Avg: 1.248471 -> 1.281957: 2.61% slower Significant (t=-2.658526, a=0.95) Stddev: 0.05513 -> 0.11325: 51.32% larger
Mem max: 7776.000 -> 8676.000: 10.37% larger Usage over time: http://tinyurl.com/yz96gw2
unpickle_list: Min: 0.922200 -> 0.861167: 7.09% faster Avg: 0.955964 -> 0.976829: 2.14% slower Not significant Stddev: 0.04374 -> 0.21061: 79.23% larger
Mem max: 6820.000 -> 8324.000: 18.07% larger Usage over time: http://tinyurl.com/yjbraxg
---
The diff between the two trees is at http://dpaste.org/RpIv/
Best, Mart Sõmermaa
On Wed, Nov 4, 2009 at 4:20 AM, Mart Sõmermaa <mrts.pydev@gmail.com> wrote:
On Tue, May 12, 2009 at 8:54 AM, Cesare Di Mauro <cesare.dimauro@a-tono.com> wrote:
Also, I checked out wpython at head to run Unladen Swallow's benchmarks against it, but it refuses to compile with either gcc 4.0.1 or 4.3.1 on Linux (fails in Python/ast.c). I can send you the build failures off-list, if you're interested.
Thanks, Collin Winter
I'm very interested, thanks. That's because I worked only on Windows machines, so I definitely need to test and fix it to let it run on any other platform.
Cesare
Re-animating an old discussion -- Cesare, any news on the wpython front?
I did a checkout from http://wpython.googlecode.com/svn/trunk and was able to ./configure and make successfully on my 64-bit Linux box as well as to run the Unladen benchmarks.
Given svn co http://svn.python.org/projects/python/tags/r261 in py261 and svn co http://wpython.googlecode.com/svn/trunk in wpy,
$ python unladen-tests/perf.py -rm --benchmarks=-2to3,all py261/python wpy/python
Do note that the --track_memory option to perf.py imposes some overhead that interferes with the performance figures. I'd recommend running the benchmarks again without --track_memory. That extra overhead is almost certainly what's causing some of the variability in the results. Collin Winter
On Wed, Nov 4, 2009 at 5:54 PM, Collin Winter <collinw@gmail.com> wrote:
Do note that the --track_memory option to perf.py imposes some overhead that interferes with the performance figures.
Thanks for the notice, without -m/--track_memory the deviation in results is indeed much smaller.
I'd recommend running the benchmarks again without --track_memory.
Done:

$ python unladen-tests/perf.py -r --benchmarks=-2to3,all py261/python wpy/python

Report on Linux zeus 2.6.31-14-generic #48-Ubuntu SMP Fri Oct 16 14:05:01 UTC 2009 x86_64
Total CPU cores: 2

ai:
Min: 0.629343 -> 0.576259: 9.21% faster
Avg: 0.634689 -> 0.581551: 9.14% faster
Significant (t=39.404870, a=0.95)
Stddev: 0.01259 -> 0.00484: 160.04% smaller

call_simple:
Min: 1.796710 -> 1.700046: 5.69% faster
Avg: 1.801533 -> 1.716367: 4.96% faster
Significant (t=137.452069, a=0.95)
Stddev: 0.00522 -> 0.00333: 56.64% smaller

django:
Min: 1.280840 -> 1.275350: 0.43% faster
Avg: 1.287179 -> 1.287233: 0.00% slower
Not significant
Stddev: 0.01055 -> 0.00581: 81.60% smaller

iterative_count:
Min: 0.211744 -> 0.123271: 71.77% faster
Avg: 0.213148 -> 0.128596: 65.75% faster
Significant (t=88.510311, a=0.95)
Stddev: 0.00233 -> 0.00926: 74.80% larger

normal_startup:
Min: 0.520829 -> 0.516412: 0.86% faster
Avg: 0.559170 -> 0.554678: 0.81% faster
Not significant
Stddev: 0.02031 -> 0.02093: 2.98% larger

pickle:
Min: 1.988127 -> 1.926643: 3.19% faster
Avg: 2.000676 -> 1.936185: 3.33% faster
Significant (t=36.712505, a=0.95)
Stddev: 0.01650 -> 0.00603: 173.67% smaller

pickle_dict:
Min: 1.681116 -> 1.619192: 3.82% faster
Avg: 1.701952 -> 1.629548: 4.44% faster
Significant (t=34.513963, a=0.95)
Stddev: 0.01721 -> 0.01200: 43.46% smaller

pickle_list:
Min: 0.918128 -> 0.884967: 3.75% faster
Avg: 0.925534 -> 0.891200: 3.85% faster
Significant (t=60.451407, a=0.95)
Stddev: 0.00496 -> 0.00276: 80.00% smaller

pybench:
Min: 58692 -> 51128: 14.79% faster
Avg: 59914 -> 52316: 14.52% faster

regex_compile:
Min: 0.894190 -> 0.816447: 9.52% faster
Avg: 0.900353 -> 0.826003: 9.00% faster
Significant (t=24.974080, a=0.95)
Stddev: 0.00448 -> 0.02943: 84.78% larger

regex_effbot:
Min: 0.124442 -> 0.123750: 0.56% faster
Avg: 0.134908 -> 0.126137: 6.95% faster
Significant (t=5.496357, a=0.95)
Stddev: 0.01581 -> 0.00218: 625.68% smaller

regex_v8:
Min: 0.132730 -> 0.143494: 7.50% slower
Avg: 0.134287 -> 0.147387: 8.89% slower
Significant (t=-40.654627, a=0.95)
Stddev: 0.00108 -> 0.00304: 64.34% larger

rietveld:
Min: 0.754050 -> 0.737335: 2.27% faster
Avg: 0.770227 -> 0.754642: 2.07% faster
Significant (t=7.547765, a=0.95)
Stddev: 0.01434 -> 0.01486: 3.49% larger

slowpickle:
Min: 0.858494 -> 0.795162: 7.96% faster
Avg: 0.862350 -> 0.799479: 7.86% faster
Significant (t=133.690989, a=0.95)
Stddev: 0.00394 -> 0.00257: 52.92% smaller

slowspitfire:
Min: 0.955587 -> 0.909843: 5.03% faster
Avg: 0.965960 -> 0.925845: 4.33% faster
Significant (t=16.351067, a=0.95)
Stddev: 0.01237 -> 0.02119: 41.63% larger

slowunpickle:
Min: 0.409312 -> 0.346982: 17.96% faster
Avg: 0.412381 -> 0.349148: 18.11% faster
Significant (t=242.889869, a=0.95)
Stddev: 0.00198 -> 0.00169: 17.61% smaller

startup_nosite:
Min: 0.195620 -> 0.194328: 0.66% faster
Avg: 0.230811 -> 0.238523: 3.23% slower
Significant (t=-3.869944, a=0.95)
Stddev: 0.01932 -> 0.02052: 5.87% larger

threaded_count:
Min: 0.222133 -> 0.133764: 66.06% faster
Avg: 0.236670 -> 0.147750: 60.18% faster
Significant (t=57.472693, a=0.95)
Stddev: 0.01317 -> 0.00813: 61.98% smaller

unpack_sequence:
Min: 0.000129 -> 0.000119: 8.43% faster
Avg: 0.000132 -> 0.000123: 7.22% faster
Significant (t=24.614061, a=0.95)
Stddev: 0.00003 -> 0.00011: 77.02% larger

unpickle:
Min: 1.191255 -> 1.149132: 3.67% faster
Avg: 1.218023 -> 1.162351: 4.79% faster
Significant (t=21.222711, a=0.95)
Stddev: 0.02242 -> 0.01362: 64.54% smaller

unpickle_list:
Min: 0.880991 -> 0.965611: 8.76% slower
Avg: 0.898949 -> 0.985231: 8.76% slower
Significant (t=-17.387537, a=0.95)
Stddev: 0.04838 -> 0.01103: 338.79% smaller
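The "Significant (t=..., a=0.95)" annotations come from a two-sample t-test at the 0.95 confidence level: the difference between the two runs' mean times is scaled by their combined spread, and the result is declared significant when |t| exceeds the critical value for nx + ny - 2 degrees of freedom. A minimal sketch of that statistic, assuming a pooled-variance test (perf.py's actual implementation may differ, e.g. by using Welch's variant or a lookup table of critical values)::

    import math

    def tscore(baseline, modified):
        # Pooled-variance two-sample t statistic over two lists of
        # per-run timings. Positive t means the baseline's mean time
        # is higher, i.e. the modified interpreter is faster on average.
        nx, ny = len(baseline), len(modified)
        mx = sum(baseline) / nx
        my = sum(modified) / ny
        vx = sum((x - mx) ** 2 for x in baseline) / (nx - 1)
        vy = sum((y - my) ** 2 for y in modified) / (ny - 1)
        pooled = ((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2)
        return (mx - my) / math.sqrt(pooled * (1.0 / nx + 1.0 / ny))

This sign convention matches the report: large positive t on most lines means wpy is reliably faster, while negative t, as for regex_v8 and unpickle_list, means it is slower.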
participants (31)

- "Martin v. Löwis"
- A.M. Kuchling
- Aahz
- Antoine Pitrou
- Barry Warsaw
- Brett Cannon
- Cesare Di Mauro
- Cesare Di Mauro
- Chris Withers
- Collin Winter
- David Cournapeau
- David Lyon
- Eric Smith
- Giuseppe Ottaviano
- glyph@divmod.com
- Guido van Rossum
- James Y Knight
- Jeroen Ruigrok van der Werven
- Jesse Noller
- M.-A. Lemburg
- Mart Sõmermaa
- Matthias Klose
- Nick Coghlan
- P.J. Eby
- Paul Moore
- R. David Murray
- Stephen J. Turnbull
- Tarek Ziadé
- Tres Seaver
- Zooko O'Whielacronx
- Zooko Wilcox-O'Hearn