Today pypy and CPython's "setup.py bdist" generate the same filename but incompatible bdists. This makes it difficult to share both bdists in the same folder or index. Instead, they should generate different bdist filenames because one won't work with the other implementation. This PEP specifies a tagging system that includes enough information to decide whether a particular bdist is expected to work on a particular Python.

Also at https://bitbucket.org/dholth/python-peps/raw/98cd36228c2e/pep-CTAG.txt

Thanks for your feedback,

Daniel Holth
Daniel Holth wrote:
Today pypy and CPython's "setup.py bdist" generate the same filename but incompatible bdists.
The distutils "bdist" command is just a generic command which then runs one of the more specific bdist_* commands (via the --formats option; defaulting to bdist_dumb). Since each of these produces different output files (installers, packages, eggs, etc), you should be more specific about which command you are referring to. Reading the PEP, I assume you'd like to change the bdist_dumb output file name only.
_______________________________________________ Python-ideas mailing list Python-ideas@python.org http://mail.python.org/mailman/listinfo/python-ideas
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 08 2012)
Python/Zope Consulting and Support ... http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
2012-08-25: FrOSCon, St. Augustin, Germany ... 17 days to go ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/
On Wed, Aug 8, 2012 at 7:56 AM, M.-A. Lemburg <mal@egenix.com> wrote:
Daniel Holth wrote:
Today pypy and CPython's "setup.py bdist" generate the same filename but incompatible bdists.
The distutils "bdist" command is just a generic command which then runs one of the more specific bdist_* commands (via the --formats option; defaulting to bdist_dumb).
Reading the PEP, I assume you'd like to change the bdist_dumb output file name only.
Yes, I do mean bdist_dumb in the Rationale. The PEP doesn't propose changing any file names. It is just a naming scheme. There is a new format "wheel" that needs this, but the naming scheme should be useful elsewhere, and I need feedback from the implementation communities to get it right.
On Aug 8, 2012 5:15 AM, "Daniel Holth" <dholth@gmail.com> wrote:
Today pypy and CPython's "setup.py bdist" generate the same filename but incompatible bdists. This makes it difficult to share both bdists in the same folder or index. Instead, they should generate different bdist filenames because one won't work with the other implementation. This PEP specifies a tagging system that includes enough information to decide whether a particular bdist is expected to work on a particular Python
Consider using sys.implementation to get name/version. The cache_tag should be particularly helpful. The 2-character approach for implementation names requires unnecessary curating. -eric
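A minimal sketch of what Eric's suggestion might look like, assuming Python 3.3 or later, where `sys.implementation` exists. The helper name `implementation_info` is hypothetical, and how the PEP would map these values onto tags is not settled in this thread.

```python
import sys


def implementation_info():
    """Return (name, 'XY' version string, cache_tag) for the running interpreter.

    Hypothetical helper: it only reads attributes that sys.implementation
    provides on Python 3.3+.
    """
    impl = sys.implementation
    version = "%d%d" % (impl.version.major, impl.version.minor)
    # cache_tag may be None on implementations that do not cache bytecode.
    return impl.name, version, impl.cache_tag


print(implementation_info())
```

Older interpreters have no `sys.implementation`, so a tool that also targets Python 2.x would still need a fallback table of implementation names.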
On Wed, Aug 8, 2012 at 9:31 AM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Consider using sys.implementation to get name/version. The cache_tag should be particularly helpful. The 2-character approach for implementation names requires unnecessary curating.
It will use that for the implementations not mentioned in the initial PEP.
I want to implement this all the way back to Python 2.5...

On Wed, Aug 8, 2012 at 9:42 AM, Daniel Holth <dholth@gmail.com> wrote:
On Wed, Aug 8, 2012 at 9:31 AM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Consider using sys.implementation to get name/version. The cache_tag should be particularly helpful. The 2-character approach for implementation names requires unnecessary curating.
It will use that for the implementations not mentioned in the initial PEP.
A bit of background: Daniel's wheel project aims to provide the oft-requested feature of a cross-platform binary distribution format that can be cleanly and automatically mapped to the platform specific formats.

In reviewing the draft format spec for wheels, I noted that it was worth getting broader agreement on the basic binary compatibility identification scheme early, since it doesn't need to be specific to the wheel format and will play a critical role in letting installers find the right binaries efficiently regardless of any other format details.

For wheel in particular, aspects of this PEP will show up in various places in metadata, filenames and installer configuration settings.

The PEP could probably use a "Background" section with some of the above info.

Cheers, Nick.

-- Sent from my phone, thus the relative brevity :)
Wheel also has a story at http://wheel.readthedocs.org/en/latest/story.html
Daniel Holth wrote:
Platform Tag ------------
The platform tag is simply `distutils.util.get_platform()` with all hyphens `-` and periods `.` replaced with underscore `_`.
This part is going to cause problems. distutils is good at identifying Linux and Windows and giving them sensible platform names, but it doesn't do a good job for other OSes. For e.g. FreeBSD it adds too much detail, for Mac OS X it doesn't have enough detail and it also has a tendency to change even for Python dot releases (esp. for Mac OS X which constantly causes problems).

I think your naming scheme ought to focus more on the platform part, as the other parts (Python version and implementation) are well understood.

For the platform, the installer would have to detect whether a package is compatible with the platform. This often requires intimate knowledge about the platform.

Things to consider:

* OS name
* OS version, if that matters for compatibility
* C lib version, if that matters for compatibility
* ABI version, if that matters for compatibility
* architecture (Intel, PowerPC, Sparc, ARM, etc)
* bits (32, 64, 128, etc.)
* fat builds which include multiple variants in a single archive

and probably some more depending on OS.

In some cases, a package will also have external requirements such as specific versions of a library (e.g. 0.9.8 vs. 1.0.0 OpenSSL library, or 2.2 vs. 2.3 unixODBC). These quickly get complicated up to the point where you need to run a script in order to determine whether a platform is compatible with the package or not.

Putting all that information into a tag is going to be difficult, so an installer will either have to access more meta information about the package from some other resource than the file name (which is what PyPI is heading at), or download all variants that fit the target platform and then look inside the files for more meta information.

So the tag name format will have to provide a set of basic "dimensions" for the platform (e.g. OS name, architecture, bits), but also needs to provide user defined additions that can be used to differentiate between all the extra variants which may be needed, and which can easily be parsed by a human with more background knowledge about the target system and his/her needs to select the right file.

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 09 2012)
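For reference, the platform-tag rule quoted at the top of this message can be sketched as below. This is an illustration only: it uses `sysconfig.get_platform()` as a stand-in for `distutils.util.get_platform()` (an assumption made so the snippet also runs on interpreters without distutils), and the function name `platform_tag` is hypothetical.

```python
import sysconfig


def platform_tag(platform=None):
    """Apply the draft rule: replace '-' and '.' with '_'.

    `platform` defaults to sysconfig.get_platform(), used here as a
    stand-in for distutils.util.get_platform().
    """
    if platform is None:
        platform = sysconfig.get_platform()
    return platform.replace("-", "_").replace(".", "_")


# Platform strings of the kind seen in egg filenames:
print(platform_tag("linux-x86_64"))     # linux_x86_64
print(platform_tag("macosx-10.6-fat"))  # macosx_10_6_fat
```

The sketch only normalises whatever string the interpreter reports; it does nothing about the too-much/too-little detail problem described above.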
On Aug 9, 2012 4:07 AM, "M.-A. Lemburg" <mal@egenix.com> wrote:
Daniel Holth wrote:
Platform Tag ------------
The platform tag is simply `distutils.util.get_platform()` with all hyphens `-` and periods `.` replaced with underscore `_`.
This part is going to cause problems. distutils is good at identifying Linux and Windows and giving them sensible platform names, but it doesn't do a good job for other OSes.
For e.g. FreeBSD it adds too much detail, for Mac OS X it doesn't have enough detail and it also has a tendency to change even for Python dot releases (esp. for Mac OS X which constantly causes problems).
egg does something a little more specific for OS X. I should probably copy that. egg is obviously a big influence on this work. I downloaded 862 eggs in May, 72 of which were platform-specific:

Counter({'win32': 48, 'linux-x86_64': 9, 'macosx-10.6-fat': 8, 'linux-i686': 2, 'macosx-10.7-intel': 2, 'macosx-10.4-x86_64': 1, 'macosx-10.5-intel': 1, 'macosx-10.5-i386': 1})

I think your naming scheme ought to focus more on the platform part, as the other parts (Python version and implementation) are well understood.
OK
For the platform, the installer would have to detect whether a package is compatible with the platform. This often requires intimate knowledge about the platform.
Things to consider:
* OS name * OS version, if that matters for compatibility * C lib version, if that matters for compatibility * ABI version, if that matters for compatibility * architecture (Intel, PowerPC, Sparc, ARM, etc) * bits (32, 64, 128, etc.) * fat builds which include multiple variants in a single archive
and probably some more depending on OS.
In some cases, a package will also have external requirements such as specific versions of a library (e.g. 0.9.8 vs. 1.0.0 OpenSSL library, or 2.2 vs. 2.3 unixODBC). These quickly get complicated up to the point where you need to run a script in order to determine
whether a platform is compatible with the package or not.
The external library requirements are out of scope for these tags. There is a suitable Metadata 1.2 tag for external requirements.

Putting all that information into a tag is going to be difficult, so an installer will either have to access more meta information about the package from some other resource than the file name (which is what PyPI is heading at), or download all variants that fit the target platform and then look inside the files for more meta information.
So the tag name format will have to provide a set of basic "dimensions" for the platform (e.g. OS name, architecture, bits), but also needs to provide user defined additions that can be used to differentiate between all the extra variants which may be needed, and which can easily be parsed by a human with more background knowledge about the target system and his/her needs to select the right file.
I don't want anyone to manually download packages. It just doesn't work when you have a lot of dependencies.

I am interested in an 80% solution to this problem. Like the people who have uploaded eggs to pypi, I use Windows, Mac, and Linux. If someone can provide a good get_platform() for other platforms, great. I don't have that knowledge.

Why don't I add the platform tag "local"? Pre-built binary packages on pypi are most necessary for Windows, where it is hard to install the compiler, then Mac, and then Linux, where you usually do have a compiler. If you are on a less common platform that always compiles everything from source anyway, then you might compile a local cache of -local tagged binary packages. The tools will know not to upload these to pypi.
Re: http://www.python.org/dev/peps/pep-0425/ "Compatibility tags for built distributions"

Progress towards a proper set of rules for generating the tags a Python implementation is likely to support.

This system of being willing to install older built distributions is intended to solve the frustrating problem with eggs that you would have to build a new egg for each Python release, even for a pure-Python egg that probably runs fine on a newer Python.

In order of preference the tags are:

- built for the current implementation and its preferred ABI and architecture
- for the current implementation and tagged with just the major version number (explicitly tagged as cross-version compatible)
- for the current implementation, and any of the lesser minor revisions (cp26..cp20)
- for the current language version (py27)
- for the current language major version (py2)
- for any of the current language minor versions (py26..py20)

Importantly "py2" means "expected to work across minor releases" and is not shorthand for "py20". Practically it means the packager overrode the default tag.

For PyPy, I think "pp19" for the current version makes more sense than "pp27" since they add important runtime features without changing the version of the Python language they support (like stackless emulation). I don't know how their versions will work when PyPy for Python 3 is released. Other Python implementations seem to follow the CPython version numbers more closely. For PyPy it may be appropriate to cross major versions when going back to generate the list of older packages one is willing to install.

For CPython it is a bit overkill to go all the way back to Python 2.0; the "all the way back to the last major revision" rule is really for the 2 - 3 split.

List of supported or "willing to install" tags for CPython 3.2 (an mu build):

    [('cp32', 'cp32mu', 'linux_x86_64'), ('cp3', 'none', 'any'),
     ('cp31', 'none', 'any'), ('cp30', 'none', 'any'),
     ('py32', 'none', 'any'), ('py3', 'none', 'any'),
     ('py31', 'none', 'any'), ('py30', 'none', 'any')]

For CPython 2.7:

    [('cp27', 'none', 'linux_x86_64'), ('cp2', 'none', 'any'),
     ('cp26', 'none', 'any'), ('cp25', 'none', 'any'),
     ('cp24', 'none', 'any'), ('cp23', 'none', 'any'),
     ('cp22', 'none', 'any'), ('cp21', 'none', 'any'),
     ('cp20', 'none', 'any'), ('py27', 'none', 'any'),
     ('py2', 'none', 'any'), ('py26', 'none', 'any'),
     ('py25', 'none', 'any'), ('py24', 'none', 'any'),
     ('py23', 'none', 'any'), ('py22', 'none', 'any'),
     ('py21', 'none', 'any'), ('py20', 'none', 'any')]
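The preference order described in this message can be reproduced with a few lines of Python. This is a sketch written to match the two lists shown, not the wheel implementation itself; the function name `supported_tags` and its parameters are hypothetical, and the ABI and platform strings are passed in rather than detected.

```python
def supported_tags(major, minor, abi, platform, impl="cp"):
    """Return (python, abi, platform) tuples, most preferred first."""
    older = range(minor - 1, -1, -1)  # lesser minor revisions, newest first

    # 1. Current implementation with its preferred ABI and architecture.
    tags = [("%s%d%d" % (impl, major, minor), abi, platform)]
    # 2. Current implementation, major version only (cross-version).
    tags.append(("%s%d" % (impl, major), "none", "any"))
    # 3. Current implementation, lesser minor revisions.
    tags.extend(("%s%d%d" % (impl, major, m), "none", "any") for m in older)
    # 4. Current language version, 5. language major version,
    # 6. lesser language minor versions.
    tags.append(("py%d%d" % (major, minor), "none", "any"))
    tags.append(("py%d" % major, "none", "any"))
    tags.extend(("py%d%d" % (major, m), "none", "any") for m in older)
    return tags


for tag in supported_tags(3, 2, "cp32mu", "linux_x86_64"):
    print(tag)
```

Calling `supported_tags(2, 7, "none", "linux_x86_64")` yields the eighteen-entry CPython 2.7 list in the same order.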
On Sun, Sep 9, 2012 at 1:41 PM, Daniel Holth <dholth@gmail.com> wrote:
Re: http://www.python.org/dev/peps/pep-0425/ "Compatibility tags for built distributions"
Progress towards a proper set of rules for generating the tags a Python implementation is likely to support.
This system of being willing to install older built distributions is intended to solve the frustrating problem with eggs that you would have to build a new egg for each Python release, even for a pure-Python egg that probably runs fine on a newer Python.
Yep, those rules look sensible to me (and thanks for clarifying the intended semantics of the "py2" and "py3" version markers) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 9 September 2012 13:16, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Sun, Sep 9, 2012 at 1:41 PM, Daniel Holth <dholth@gmail.com> wrote:
Re: http://www.python.org/dev/peps/pep-0425/ "Compatibility tags for built distributions"
Progress towards a proper set of rules for generating the tags a Python implementation is likely to support.
This system of being willing to install older built distributions is intended to solve the frustrating problem with eggs that you would have to build a new egg for each Python release, even for a pure-Python egg that probably runs fine on a newer Python.
Yep, those rules look sensible to me (and thanks for clarifying the intended semantics of the "py2" and "py3" version markers)
It's worth noting that there are two somewhat independent cases:

Binary built distributions (containing C extensions, typically). This is architecture/ABI dependent, and would generally be tagged as cpXX-abi-arch.

Pure Python built distributions. This is architecture/ABI independent, and would be tagged as pyXX-none-any or cpXX-none-any (or maybe cpX-none-any or pyX-none-any).

(I'm ignoring other implementations from lack of knowledge, but I suspect a similar distinction will be relevant).

Implementations will therefore *only* match built distributions which either:

1. Exactly match implversion-abi-arch (binary built distributions). There's a slight complication for implementations that support multiple ABIs, e.g. the stable ABI, but it's minor.
2. Match implversion in a "fuzzy" manner if abi-arch is none-any (pure python built distributions). The "fuzzy" matching is clearly defined, as an example for CPython 3.4, try (in this order of preference) cp34 cp3 cp33 cp32 cp31 cp30 py34 py3 py33 py32 py31 py30. [I wonder - should py34 be preferred over cp32? That's not what the wheel implementation does]

On this basis, implementations should *not* declare none-any combinations, as they can be automatically deduced.

One minor question on this, though, is the statement in the PEP "A user could instruct their installer to fall back to building from an sdist more or less often by configuring this list of tags". I don't see what this means - it should probably be either clarified or omitted.

This isn't particularly a criticism of the PEP, it's just that the wording tends to obfuscate the essentials by hinting at complexities that don't really exist in practice. For example, given the above, the only really meaningful compressed tagset I can imagine is py2.py3-none-any. Apart from this one use case, which admittedly is important, the whole compressed tagset capability is unlikely to ever be needed.

Paul.
It's worth noting that there are two somewhat independent cases:
Binary built distributions (containing C extensions, typically). This is architecture/ABI dependent, and would generally be tagged as cpXX-abi-arch. Pure Python built distributions. This is architecture/ABI independent, and would be tagged as pyXX-none-any or cpXX-none-any (or maybe cpX-none-any or pyX-none-any).
(I'm ignoring other implementations from lack of knowledge, but I suspect a similar distinction will be relevant).
Implementations will therefore *only* match built distributions which either:
1. Exactly match implversion-abi-arch (binary built distributions). There's a slight complication for implementations that support multiple ABIs, e.g. the stable ABI, but it's minor. 2. Match implversion in a "fuzzy" manner if abi-arch is none-any (pure python built distributions). The "fuzzy" matching is clearly defined, as an example for CPython 3.4, try (in this order of preference) cp34 cp3 cp33 cp32 cp31 cp30 py34 py3 py33 py32 py31 py30. [I wonder - should py34 be preferred over cp32? That's not what the wheel implementation does]
I don't think the exact order of the less-preferred options is critical, as long as you can make up your mind about whether you prefer packages with or without the C extension. Your Python is not likely to be compatible with competing py34 and cp32 wheels for the same version of a distribution. Most distributions will use either the cpXX style or the pyXX style tags, but not both.
On this basis, implementations should *not* declare none-any combinations, as they can be automatically deduced.
+0. algorithmically at least. It would not be wrong to dial down the "previous versions" logic quite a bit, too, as far as only doing cp33, cp3, py33, py3 which would mean "only use packages that are for our Python or explicitly cross-version".
One minor question on this, though, is the statement in the PEP "A user could instruct their installer to fall back to building from an sdist more or less often by configuring this list of tags". I don't see what this means - it should probably be either clarified or omitted.
In the above "fewer old tags by default" case, if you are on Python 3.3 and don't install cp32 by default, you could say "also install cp32 for this one package that I know works" by adding the cp32 tag to the list. This is to be compatible with lazy human packagers. Similarly when you are a version behind you will sometimes need to install packages built for the next version of Python. Or you could remove all binary package tags of the form *-abi3-linux_x86_64 from the list that your installer uses to consider whether to download a built package or an sdist from pypi. It would still download built pure-Python packages.
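As a rough illustration of "configuring this list of tags", the sketch below treats the installer's list as plain strings and edits it in the two ways just described. The starting list and the helper name `without` are hypothetical; no real installer option is implied.

```python
from fnmatch import fnmatchcase


def without(tags, pattern):
    """Drop every tag that matches a shell-style pattern."""
    return [t for t in tags if not fnmatchcase(t, pattern)]


# Hypothetical default list for a CPython 3.3 installer on Linux.
tags = [
    "cp33-cp33m-linux_x86_64",
    "cp33-abi3-linux_x86_64",
    "cp33-none-any",
    "py33-none-any",
    "py3-none-any",
]

# "Also install cp32 for this one package that I know works."
tags_with_cp32 = tags + ["cp32-none-any"]

# Stop considering built packages tagged *-abi3-linux_x86_64.
no_abi3 = without(tags, "*-abi3-linux_x86_64")
print(no_abi3)
```

Appending the extra tag at the end keeps it the least preferred option, so an exact match would still win when one exists.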
This isn't particularly a criticism of the PEP, it's just that the wording tends to obfuscate the essentials by hinting at complexities that don't really exist in practice. For example, given the above, the only really meaningful compressed tagset I can imagine is py2.py3-none-any. Apart from this one use case, which admittedly is important, the whole compressed tagset capability is unlikely to ever be needed.
Who knows. I imagined bundling a windows and a Linux dll in a single built package, or doing something with OS X fat binaries. If the shared library has the same __pycache__/name.tag.so style naming then this feature gets more interesting, if not difficult to package. It doesn't currently exist in practice.
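For readers unfamiliar with the compressed tagset notation discussed here: each of the three fields may hold several values joined with ".", and the set stands for every combination. A small sketch follows; the function name `expand_tagset` is hypothetical, and the Windows/Linux bundle in the second call is only the imagined case from this message, not something that exists.

```python
from itertools import product


def expand_tagset(compressed):
    """Expand 'py2.py3-none-any' into its individual tags.

    Splitting on '-' is safe because hyphens inside platform names
    are replaced with underscores before tagging.
    """
    pythons, abis, platforms = compressed.split("-")
    combos = product(pythons.split("."), abis.split("."), platforms.split("."))
    return ["-".join(combo) for combo in combos]


print(expand_tagset("py2.py3-none-any"))
print(expand_tagset("cp33-cp33m-win32.linux_x86_64"))
```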
On 25 September 2012 22:08, Daniel Holth <dholth@gmail.com> wrote:
I don't think the exact order of the less-preferred options is critical, as long as you can make up your mind about whether you prefer packages with or without the C extension. Your Python is not likely to be compatible with competing py34 and cp32 wheels for the same version of a distribution. Most distributions will use either the cpXX style or the pyXX style tags, but not both.
I think that this is fine, but the PEP needs to be explicit. If it's a user option, the PEP should say "installers should allow the user to specify the list of compatibility tags, and the default should be XXX". If it's static, the PEP should say what it is. Having different installers make different, incompatible assumptions, is unpleasant. At present, of course, the only 2 real contenders are the reference wheel implementation and pip. Others like distutils2/packaging may follow. [...]
This isn't particularly a criticism of the PEP, it's just that the wording tends to obfuscate the essentials by hinting at complexities that don't really exist in practice. For example, given the above, the only really meaningful compressed tagset I can imagine is py2.py3-none-any. Apart from this one use case, which admittedly is important, the whole compressed tagset capability is unlikely to ever be needed.
Who knows. I imagined bundling a windows and a Linux dll in a single built package, or doing something with OS X fat binaries. If the shared library has the same __pycache__/name.tag.so style naming then this feature gets more interesting, if not difficult to package.
It doesn't currently exist in practice.
The PEP should stick to defining behaviour for things that do exist. Let those who build clever new options like that work out how to integrate with this PEP.

(On which note, is the "stable ABI" real yet? On Windows, at least, it talks about a python3.dll, and yet there is no such thing distributed with Python 3.3, so based on that (what's the situation on Linux?) I'd be inclined to say that as of this point, even the stable ABI can be ignored.)

Paul.
I think that this is fine, but the PEP needs to be explicit. If it's a user option, the PEP should say "installers should allow the user to specify the list of compatibility tags, and the default should be XXX". If it's static, the PEP should say what it is.
Having different installers make different, incompatible assumptions, is unpleasant. At present, of course, the only 2 real contenders are the reference wheel implementation and pip. Others like distutils2/packaging may follow.
It might be easier to explain by defining a static list for each version of Python and then say "and you can add previous versions to the ordered set". Then for CPython 3.3, ignoring abi3, with pymalloc giving the cp33m suffix, you could have only

cp33-cp33m-win32
cp33-none-win32
cp33-none-any
py33-none-any
py3-none-any

implementation - preferred abi - plat
implementation - none - plat
implementation - none - any
python major minor - none - any
python major - none - any

The rule for generating the last version's tags ignoring abi3 is that you only keep the none-any tags:

cp32-none-any
py32-none-any
py3-none-any

appending the lists without duplicates you get

cp33-cp33m-win32
cp33-none-win32
cp33-none-any
py33-none-any
py3-none-any
cp32-none-any
py32-none-any

I'm not sure what to do with abi3 or whether to use the cp3 (major only) implementation tag.
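The static-list idea can be sketched as follows. It assumes CPython ("cp") and takes the ABI and platform as arguments; the function names `static_tags`, `previous_version_tags` and `merge_unique` are hypothetical.

```python
def static_tags(major, minor, abi, plat):
    """The five-entry static list for one CPython version."""
    impl = "cp%d%d" % (major, minor)
    return [
        "%s-%s-%s" % (impl, abi, plat),      # implementation - preferred abi - plat
        "%s-none-%s" % (impl, plat),         # implementation - none - plat
        "%s-none-any" % impl,                # implementation - none - any
        "py%d%d-none-any" % (major, minor),  # python major minor - none - any
        "py%d-none-any" % major,             # python major - none - any
    ]


def previous_version_tags(major, minor, abi, plat):
    """For an earlier version, keep only the none-any tags."""
    return [t for t in static_tags(major, minor, abi, plat)
            if t.endswith("-none-any")]


def merge_unique(*tag_lists):
    """Append lists in order, skipping tags already seen."""
    seen, merged = set(), []
    for tag_list in tag_lists:
        for tag in tag_list:
            if tag not in seen:
                seen.add(tag)
                merged.append(tag)
    return merged


current = static_tags(3, 3, "cp33m", "win32")
previous = previous_version_tags(3, 2, "cp32m", "win32")
print(merge_unique(current, previous))
```

The duplicate py3-none-any from the 3.2 list is dropped, so the merged result has seven entries in the order given above.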
On 26 September 2012 13:31, Daniel Holth <dholth@gmail.com> wrote:
I think that this is fine, but the PEP needs to be explicit. If it's a user option, the PEP should say "installers should allow the user to specify the list of compatibility tags, and the default should be XXX". If it's static, the PEP should say what it is.
Having different installers make different, incompatible assumptions, is unpleasant. At present, of course, the only 2 real contenders are the reference wheel implementation and pip. Others like distutils2/packaging may follow.
It might be easier to explain by defining a static list for each version of Python and then say "and you can add previous versions to the ordered set". Then for CPython 3.3, ignoring abi3, with pymalloc giving the cp33m suffix, you could have only
cp33-cp33m-win32
cp33-none-win32
cp33-none-any
py33-none-any
py3-none-any

implementation - preferred abi - plat
implementation - none - plat
implementation - none - any
python major minor - none - any
python major - none - any
The rule for generating the last version's tags ignoring abi3 is that you only keep the none-any tags:
cp32-none-any py32-none-any py3-none-any
appending the lists without duplicates you get
cp33-cp33m-win32 cp33-none-win32 cp33-none-any py33-none-any py3-none-any cp32-none-any py32-none-any
I'm not sure what to do with abi3 or whether to use the cp3 (major only) implementation tag.
Win32 is not a good example here. As far as I know (I've experimented and read docs, but haven't analyzed the code), there is never a declared ABI on Win32. In fact, Windows is pretty much trivially simple:
cpXY-none-win32 (for distributions with C extensions)
pyXY-none-any (for pure-Python distributions)
In fact, those two are the only values the bdist_wheel format can generate. Actually, for non-Windows, it's just as simple - bdist_wheel can only generate
cpXY-ABI-PLAT (for distributions with C extensions)
pyXY-none-any (for pure-Python distributions)
ABI is the preferred ABI (the part of SOABI after the '-' from the Python used to build) and PLAT is the platform. (So essentially, Windows follows the standard pattern, but with an ABI of "none").
Eggs and wininst installers, if they used this convention, would be the same. As would bdist_msi, as far as I know.
So the question is, what use case is there for anything more complicated than this? The only possibilities I can see are:
1. The stable ABI. At the moment, I don't know how well that's supported - I don't think the build tools detect whether code only uses the stable ABI, so they assume the full ABI. Users could claim to use the stable ABI by manual renaming. But without an agreed and documented convention for the stable ABI, they can't do that, so I think it's premature to worry about that case. It's easy enough to add if needed (see below - it's just another ABI for installers to allow)
2. UCS2 vs UCS4. This is dead for Python 3.3+, so not worth complicating the model for.
3. In theory, if a tool could create "fat" archives containing code for multiple platforms/ABIs, then that might need something more complex. But in the absence of any such tool, I'd call YAGNI on this.
4. Pure-python code which works on multiple versions of Python. This is a real case, and needs to be considered. Code that is (presumed) valid on all Python versions within the current major version can be manually retagged as pyX. And code that supports Python 2 and 3 can be retagged as py2.py3. To allow forward compatibility, installers should allow the user to install pyXZ code on Python version X.Y when Z<Y. But this should be a user option (possibly off by default) and an exact match should always be preferred.
I'm not aware of any other cases that might matter here. The other implementations may well add further use cases - for example, PyPy can load (some) cPython code, I believe. But without details, let's wait to hear from them rather than speculating.
The cases above where I suggest manual retagging may benefit from a UI in the build tools to automatically change the tags, but that's a quality of implementation issue. At a pinch, renaming a wheel file would work fine (as long as you didn't lie by doing so!)
That covers the side of the proposal relating to how binary distributions declare what they were built for. As regards how installers should check whether packages are compatible, it seems to me that the rules can be reasonably simple.
1. The installer maintains a spec of what tagsets the current Python will support - that would be the exact implementation/version ("cpXY" or similar), a list of supported ABIs in preference order, and the current platform. The PEP should document how to get the list of supported ABIs, for completeness.
2. An exact match wins every time. Where there are multiple ABIs, the best match is based on the preference order supplied.
3. Non-exact matches can only occur for pure-Python packages (as platform-specific ones declare an exact version/abi/platform as noted above). Here, we ignore ABI and platform (they will always be none-any) and work down the list pyXY, pyX, pyXZ (Z<Y, this only if user allows it) in that order. Where a package declares multi-tags (py2.py3 is the only likely case) break ties by taking the package that specifies the fewest tags.
That should be it. For me, the above summary (or something similar) needs to be in the PEP, to provide a proper background if nothing else. What do people think?
Paul.
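The three installer rules in this message can be read as a ranking function. The following is only a sketch of that reading, not code from bdist_wheel or any installer; the function names and the (python, abi, platform) tuple layout are hypothetical and chosen for illustration.

```python
def supported_python_tags(major, minor, allow_older=False):
    """Pure-Python tags in preference order: pyXY, pyX, then pyXZ for Z < Y.

    The pyXZ entries are only added when the user opts in (allow_older).
    """
    tags = ["py%d%d" % (major, minor), "py%d" % major]
    if allow_older:
        tags.extend("py%d%d" % (major, z) for z in range(minor - 1, -1, -1))
    return tags


def rank(candidate, impl_tag, abis, platform, pure_tags):
    """Return a sort key for a (python, abi, platform) triple.

    Lower keys are better; None means the candidate is not installable.
    The python field may be a dotted multi-tag such as "py2.py3".
    """
    python, abi, plat = candidate
    pythons = python.split(".")
    # Rule 2: an exact match wins; several ABIs are ordered by preference.
    if impl_tag in pythons and abi in abis and plat == platform:
        return (0, abis.index(abi), len(pythons))
    # Rule 3: pure-Python fallback, which is always tagged none-any.
    if abi == "none" and plat == "any":
        positions = [pure_tags.index(p) for p in pythons if p in pure_tags]
        if positions:
            # Ties are broken in favour of the package declaring fewer tags.
            return (1, min(positions), len(pythons))
    return None


def best_match(candidates, impl_tag, abis, platform, pure_tags):
    """Pick the best installable candidate, or None if nothing fits."""
    ranked = [(rank(c, impl_tag, abis, platform, pure_tags), c) for c in candidates]
    ranked = [(key, c) for key, c in ranked if key is not None]
    return min(ranked)[1] if ranked else None
```

On Windows the ABI list would presumably contain just "none", so that cpXY-none-win32 counts as an exact match.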
Win32 is not a good example here. As far as I know (I've experimented and read docs, but haven't analyzed the code), there is never a declared ABI on Win32. In fact, Windows is pretty much trivially simple:
cpXY-none-win32 (for distributions with C extensions)
pyXY-none-any (for pure-Python distributions)
In fact, those two are the only values the bdist_wheel format can generate. Actually, for non-Windows, it's just as simple - bdist_wheel can only generate
bdist_wheel is incomplete too. It should read from setup.cfg for advice on the tags.
Does win32 have debug / pymalloc builds? That is why there is a cp33dm ABI.
On linux imp.get_suffixes() includes ('.cpython-33m.so', 'rb', 3), ('.abi3.so', 'rb', 3) and the abi tag is just an abbreviation cp33m or abi3.
cpXY-ABI-PLAT (for distributions with C extensions)
pyXY-none-any (for pure-Python distributions)
These are the most important, and the ones bdist_wheel can (should) generate without configuration.
Eggs and wininst installers, if they used this convention, would be the same. As would bdist_msi, as far as I know. So the question is, what use case is there for anything more complicated than this? The only possibilities I can see are:
1. The stable ABI. At the moment, I don't know how well that's supported - I don't think the build tools detect whether code only uses the stable ABI, so they assume the full ABI. Users could claim to use the stable ABI by manual renaming. But without an agreed and documented convention for the stable ABI, they can't do that, so I think it's premature to worry about that case. It's easy enough to add if needed (see below - it's just another ABI for installers to allow)
2. UCS2 vs UCS4. This is dead for Python 3.3+, so not worth complicating the model for.
Python 2 continues to matter. I do not and can not use Python 3 commercially.
4. Pure-python code which works on multiple versions of Python. This is a real case, and needs to be considered. Code that is (presumed) valid on all Python versions within the current major version can be manually retagged as pyX. And code that supports Python 2 and 3 can be retagged as py2.py3. To allow forward compatibility, installers should allow the user to install pyXZ code on Python version X.Y when Z<Y. But this should be a user option (possibly off by default) and an exact match should always be preferred.
I'm not aware of any other cases that might matter here. The other implementations may well add further use cases - for example, PyPy can load (some) cPython code, I believe. But without details, let's wait to hear from them rather than speculating.
PyPy has source compatibility for some CPython extensions, so it counts as a different ABI. Sometimes code uses ctypes or cffi instead of the CPython ABI (or even includes an .exe that it calls with subprocess); there was some discussion about using the 'none' ABI in that case.
2. An exact match wins every time. Where there are multiple ABIs, the best match is based on the preference order supplied.
Just let an exact match be the only kind of match. Then there is no parsing.
The implementation tag is there because packages may have different requirements based on the implementation and version based on if: statements in setup.py. Maybe you use cp3 or py3 when you have added conditional requirements a la Requires-Dist: argparse; python_version < 2.6 in PKG-INFO?
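The "exact match only" approach could look roughly like this: the installer enumerates every tag triple it accepts, best first, and compatibility becomes a plain membership test. This is a sketch under that assumption; the function names are hypothetical and the ABI list is supplied by the caller rather than detected.

```python
def supported_tags(impl, major, minor, abis, platform):
    """Yield every (python, abi, platform) triple accepted, best first."""
    version = "%d%d" % (major, minor)
    # Platform-specific builds for this exact implementation and version.
    for abi in abis:
        yield (impl + version, abi, platform)
    # Pure-Python distributions, most specific tag first.
    for python in ("py" + version, "py%d" % major):
        yield (python, "none", "any")


def pick(available, supported):
    """Return the first supported tag that is also available, else None."""
    available = set(available)
    for tag in supported:
        if tag in available:
            return tag
    return None
```

With this design no tag is ever parsed by the installer; the cost is a longer list of supported tags.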
On 26 September 2012 15:53, Daniel Holth <dholth@gmail.com> wrote:
bdist_wheel is incomplete too. It should read from setup.cfg for advice on the tags.
I wasn't trying to imply that bdist_wheel was the reference, just that it was the best example "in the wild" that exists at the moment. Using setup.cfg to allow user configuration of the tags sounds reasonable.
Does win32 have debug / pymalloc builds? That is why there is a cp33dm ABI.
Debug, yes. That's represented in the DLL names (a _d suffix). I'm not sure about pymalloc. I don't know where the string "cp33dm" comes from, which is why I think the valid values should be documented in the PEP.
On linux imp.get_suffixes() includes
('.cpython-33m.so', 'rb', 3), ('.abi3.so', 'rb', 3)
and the abi tag is just an abbreviation cp33m or abi3.
On Windows, imp.get_suffixes shows:
>>> imp.get_suffixes()
[('.pyd', 'rb', 3), ('.py', 'U', 1), ('.pyw', 'U', 1), ('.pyc', 'rb', 2)]
I don't have a debug build to hand to check that, but Google tells me that Martin von Loewis said:
if sys.executable.endswith("_d.exe"):
    print "Debug version"
If relying on the executable name is too unsafe, you can also look at imp.get_suffixes(), which includes "_d.pyd" in a debug build on Windows.
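The two checks quoted above can be combined in a small helper. This is a sketch for current Python 3, where the imp module no longer exists (it was removed in 3.12); it assumes importlib.machinery.EXTENSION_SUFFIXES carries the same "_d.pyd" marker that imp.get_suffixes() is said to show. The function name and its optional arguments are hypothetical and exist only so the check can be exercised without a debug build.

```python
import importlib.machinery
import sys


def looks_like_windows_debug_build(executable=None, suffixes=None):
    """Guess whether this is a Windows debug build of Python.

    Checks the executable name first, then the extension-module suffixes.
    """
    if executable is None:
        executable = sys.executable or ""
    if suffixes is None:
        suffixes = importlib.machinery.EXTENSION_SUFFIXES
    if executable.lower().endswith("_d.exe"):
        return True
    return any(suffix.endswith("_d.pyd") for suffix in suffixes)
```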
2. UCS2 vs UCS4. This is dead for Python 3.3+, so not worth complicating the model for.
Python 2 continues to matter. I do not and can not use Python 3 commercially.
I don't dispute this, but I'm not sure how the PEP should reflect this. Regardless, if distinguishing UCS2 vs UCS4 matters, the PEP should clarify how to do so.
2. An exact match wins every time. Where there are multiple ABIs, the best match is based on the preference order supplied.
Just let an exact match be the only kind of match. Then there is no parsing.
I can see that argument, but to me it makes documenting (and understanding!) what an implementation/installer is saying when it lists the tags it will accept quite difficult. Maybe I'm just being dense :-)
The implementation tag is there because packages may have different requirements based on the implementation and version based on if: statements in setup.py. Maybe you use cp3 or py3 when you have added conditional requirements a la Requires-Dist: argparse; python_version < 2.6 in PKG-INFO?
I'm sorry, that doesn't make any sense to me. Paul
On Wed, Sep 26, 2012 at 11:16 AM, Paul Moore <p.f.moore@gmail.com> wrote:
On 26 September 2012 15:53, Daniel Holth <dholth@gmail.com> wrote:
I don't dispute this, but I'm not sure how the PEP should reflect this. Regardless, if distinguishing UCS2 vs UCS4 matters, the PEP should clarify how to do so.
ABIs ending with u use UCS4, and the d, m and u suffixes always appear in that order. This should go into the PEP.
py27dmu py27du py27mu cp33dm cp33d cp33m
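Because the flags always appear in the order d, m, u, an ABI tag can be split with one anchored regular expression. The sketch below assumes d means a debug build and m a pymalloc build, as the earlier question about "debug / pymalloc builds" suggests; the function name and the dictionary keys are hypothetical.

```python
import re

# Lowercase prefix, version digits, then the optional flags in fixed order.
_ABI_RE = re.compile(r"^([a-z]+)(\d+)(d?)(m?)(u?)$")


def parse_abi_tag(tag):
    """Split an ABI tag such as "cp33dm" into its parts."""
    match = _ABI_RE.match(tag)
    if match is None:
        raise ValueError("unrecognised ABI tag: %r" % (tag,))
    prefix, version, debug, pymalloc, wide = match.groups()
    return {
        "prefix": prefix,
        "version": version,
        "debug": bool(debug),
        "pymalloc": bool(pymalloc),
        "ucs4": bool(wide),
    }
```

A tag whose flags are out of order, such as "cp33md", is rejected rather than silently accepted.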
Just let an exact match be the only kind of match. Then there is no parsing.
I can see that argument, but to me it makes documenting (and understanding!) what an implementation/installer is saying when it lists the tags it will accept quite difficult. Maybe I'm just being dense :-)
Maybe we just need to attach a reference implementation to the PEP.
The implementation tag is there because packages may have different requirements based on the implementation and version based on if: statements in setup.py. Maybe you use cp3 or py3 when you have added conditional requirements a la Requires-Dist: argparse; python_version < 2.6 in PKG-INFO?
I'm sorry, that doesn't make any sense to me.
When you use the py2 or py3 tags, it would ideally also communicate a promise "this code does not produce a different list of requirements based on the build Python".
On 9 September 2012 13:16, Nick Coghlan <ncoghlan@gmail.com> wrote:
Yep, those rules look sensible to me (and thanks for clarifying the intended semantics of the "py2" and "py3" version markers)
One (relatively minor) point: the Python tag isn't easily parseable. To split the implementation and version bits, you can do tag[:2], tag[2:] except for "Other Python implementations should use sys.implementation.name". Or you could use tag[:-2], tag[-2:] except for "py2". So you need to use a regex match to split off the trailing digits, which is a bit excessive. My current approach is the [:2], [2:] one, calling YAGNI on implementations not covered by the 2-letter codes... Paul.
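For comparison, the regex option mentioned here is short. This sketch assumes the implementation part contains no digits, which holds for the two-letter codes but might not for every sys.implementation.name; the function name is hypothetical.

```python
import re

# Implementation name (letters or underscores) followed by trailing digits.
_PYTHON_TAG_RE = re.compile(r"^([A-Za-z_]+)(\d+)$")


def split_python_tag(tag):
    """Split a Python tag into (implementation, version digits)."""
    match = _PYTHON_TAG_RE.match(tag)
    if match is None:
        raise ValueError("cannot split Python tag: %r" % (tag,))
    return match.group(1), match.group(2)
```

Unlike the two slicing approaches, this handles both "py2" and a longer implementation name with the same code path.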
Platform Tag
------------
The platform tag is simply `distutils.util.get_platform()` with all hyphens `-` and periods `.` replaced with underscore `_`.
This part is going to cause problems. distutils is good at identifying Linux and Windows and giving them sensible platform names, but it doesn't do a good job for other OSes.
http://www.python.org/dev/peps/pep-0425/
I've updated this part of the PEP with some examples of get_platform(), and have simplified the "list of supported tags" section by removing the "all older versions of Python with the same major version" logic from the PEP. It is still allowed, but it is not needed in the PEP.
I would love to expound on the correct implementation of get_platform() for all major platforms. I do not know anything about the other platforms. A BSD and OSX expert will necessarily have to write that part of the specification.
Daniel Holth
(If you think the list of supported tags is long, go read about the Google spell-correct algorithm that pre-computes every spelling mistake for every word so it can tell you which correctly spelled word is the closest to your typo)
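The platform-tag rule under discussion is a two-step string substitution. The sketch below uses sysconfig.get_platform() in place of distutils.util.get_platform(), since distutils is no longer in the standard library as of Python 3.12; that swap is an assumption of this example, and the function names are hypothetical.

```python
import sysconfig


def normalize_platform(platform_string):
    """Replace hyphens and periods with underscores."""
    return platform_string.replace("-", "_").replace(".", "_")


def platform_tag():
    """Platform tag for the running interpreter."""
    return normalize_platform(sysconfig.get_platform())
```

For instance, a get_platform() value of "linux-x86_64" would become "linux_x86_64", while "win32" is left unchanged.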
Daniel Holth wrote:
Platform Tag
------------
The platform tag is simply `distutils.util.get_platform()` with all hyphens `-` and periods `.` replaced with underscore `_`.
This part is going to cause problems. distutils is good at identifying Linux and Windows and giving them sensible platform names, but it doesn't do a good job for other OSes.
I still don't think that referencing the distutils function in the PEP is a good idea :-) It would be better to create a new helper.
I've updated this part of the PEP with some examples of get_platform(),
The string for x86 Linux platforms usually reads "linux-i686".
Some other get_platform() examples:
macosx-10.4-fat - Tiger, fat PPC/i386 build of Python
macosx-10.6-x86_64 - Snow Leopard, x64-only build of Python
freebsd-8.3-RELEASE-p3-i386 - FreeBSD 8.3, SP3, x86
freebsd-8.3-RELEASE-p3-amd64 - FreeBSD 8.3, SP3, x64
cygwin-1.7.9-i686 - Cygwin, x86
For Macs and other platforms that support fat builds it would be good to have some form which allows defining which architectures are included in the fat build, e.g. i386, ppc, x86_64.
For FreeBSD, the string could be reduced to remove the "RELEASE-p3-" bit.
It would probably be a good idea to develop a binary compatibility checker package on PyPI first before hard coding these things into the PEP. Such a package should offer two functions (sketching here):
get_binary_platform_string() -> return a binary platform compatibility string for the current platform
binary_package_compatible(platform_string) -> return True/False depending on whether the current platform is compatible with the given platform_string
The package could then contain all the domain information needed for the various platforms.
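A first cut of the two proposed functions might look like the following. It is only a sketch: the optional raw and current parameters are additions for testability and are not part of the signatures proposed above, the FreeBSD reduction is the only platform-specific rule implemented, and fat builds are not handled at all.

```python
import re
import sysconfig


def get_binary_platform_string(raw=None):
    """Return a platform string with volatile parts removed.

    `raw` defaults to sysconfig.get_platform(). Only the FreeBSD
    "-RELEASE-pN" release/patch-level part is stripped in this sketch.
    """
    if raw is None:
        raw = sysconfig.get_platform()
    return re.sub(r"-RELEASE(?:-p\d+)?", "", raw)


def binary_package_compatible(platform_string, current=None):
    """True if a package built for platform_string should run here.

    Both sides are reduced first, so patch levels do not affect the result.
    """
    if current is None:
        current = get_binary_platform_string()
    return get_binary_platform_string(platform_string) == current
```

Real compatibility rules would need per-platform knowledge well beyond string equality, which is the argument for keeping them in a separately maintained package.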
and have simplified the "list of supported tags" section by removing the "all older versions of Python with the same major version" logic from the PEP. It is still allowed, but it is not needed in the PEP.
One note regarding adding more than one such tag to a file: adding those extra tags using dots (".") will make parsing the file name harder. It's probably better to separate them using a separator such as "_or_".
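To make the parsing concern concrete, here is a sketch of expanding a dotted multi-tag. It assumes the three hyphen-separated fields have already been isolated from the rest of the file name, and that a platform tag never contains a period because periods were replaced with underscores; the function name is hypothetical.

```python
import itertools


def expand_tag_set(compressed):
    """Expand e.g. "py2.py3-none-any" into the individual tags it stands for."""
    python, abi, platform = compressed.split("-")
    combos = itertools.product(
        python.split("."), abi.split("."), platform.split(".")
    )
    return ["-".join(parts) for parts in combos]
```

The harder part in practice is locating these three fields inside a full file name, where the version and the extension also contain dots.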
I would love to expound on the correct implementation of get_platform() for all major platforms. I do not know anything about the other platforms. A BSD and OSX expert will necessarily have to write that part of the specification.
See above :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 17 2012)
On Mon, Sep 17, 2012 at 8:39 AM, M.-A. Lemburg <mal@egenix.com> wrote:
Daniel Holth wrote:
Platform Tag
------------
The platform tag is simply `distutils.util.get_platform()` with all hyphens `-` and periods `.` replaced with underscore `_`.
This part is going to cause problems. distutils is good at identifying Linux and Windows and giving them sensible platform names, but it doesn't do a good job for other OSes.
I still don't think that referencing the distutils function in the PEP is a good idea :-) It would be better to create a new helper.
How about just sysconfig.get_platform()?
Daniel Holth wrote:
On Mon, Sep 17, 2012 at 8:39 AM, M.-A. Lemburg <mal@egenix.com> wrote:
Daniel Holth wrote:
Platform Tag
------------
The platform tag is simply `distutils.util.get_platform()` with all hyphens `-` and periods `.` replaced with underscore `_`.
This part is going to cause problems. distutils is good at identifying Linux and Windows and giving them sensible platform names, but it doesn't do a good job for other OSes.
I still don't think that referencing the distutils function in the PEP is a good idea :-) It would be better to create a new helper.
How about just sysconfig.get_platform()?
That's essentially the same function :-) For some reason it's a copy of the one in distutils.util. I guess an oversight when sysconfig was created from various parts of distutils.
Both functions are not suitable for the intended purpose, namely providing enough information to detect binary compatibility.
Given that such information changes more often than we do Python releases and that this information is domain specific, I think it's better to maintain a pair of functions for creating such a platform string and detecting binary compatibility in a separate PyPI module which can then be pulled in by packaging and installer tools.
The right place for the logic would be the platform module which was created in much the same way. Like with the above module, it was crowd-sourced to integrate domain specific knowledge. We added it to the stdlib after it stabilized. This is both good and bad. The good part is that it comes with Python automatically, the bad part that 3rd party code relying on it now has to deal with several different versions (for each Python release) and that platform changes are difficult to get into the module.
This is why I think the PEP should just reference such a new module and leave the string format and binary compatibility check details to the module, rather than spell it out in the PEP.
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 19 2012)
On Wed, Sep 19, 2012 at 2:40 AM, M.-A. Lemburg <mal@egenix.com> wrote:
Daniel Holth wrote:
On Mon, Sep 17, 2012 at 8:39 AM, M.-A. Lemburg <mal@egenix.com> wrote:
Daniel Holth wrote:
Platform Tag
------------
The platform tag is simply `distutils.util.get_platform()` with all hyphens `-` and periods `.` replaced with underscore `_`.
This part is going to cause problems. distutils is good at identifying Linux and Windows and giving them sensible platform names, but it doesn't do a good job for other OSes.
I still don't think that referencing the distutils function in the PEP is a good idea :-) It would be better to create a new helper.
How about just sysconfig.get_platform()?
That's essentially the same function :-) For some reason it's a copy of the one in distutils.util. I guess an oversight when sysconfig was created from various parts of distutils.
Both functions are not suitable for the intended purpose, namely providing enough information to detect binary compatibility.
Given that such information changes more often than we do Python releases and that this information is domain specific, I think it's better to maintain a pair of functions for creating such a platform string and detecting binary compatibility in a separate PyPI module which can then be pulled in by packaging and installer tools.
The right place for the logic would be the platform module which was created in much the same way. Like with the above module, it was crowd-sourced to integrate domain specific knowledge. We added it to the stdlib after it stabilized. This is both good and bad. The good part is that it comes with Python automatically, the bad part that 3rd party code relying on it now has to deal with several different versions (for each Python release) and that platform changes are difficult to get into the module.
This is why I think the PEP should just reference such a new module and leave the string format and binary compatibility check details to the module, rather than spell it out in the PEP.
The current implementation is at https://bitbucket.org/dholth/wheel, in wheel/bdist_wheel.py and wheel/util.py.
participants (5)
- Daniel Holth
- Eric Snow
- M.-A. Lemburg
- Nick Coghlan
- Paul Moore