SOABI for Unicode ABI on 2.x (was: wheel 0.27.0 released)
On Feb 5, 2016 8:47 AM, "Nate Coraor"
[...]
- Add SOABI tags to platform-specific wheels built for Python 2.X (Pull Request #55, Issue #63, Issue #101)
I can't quite untangle all the documents linked from this PR, so let me ask here :-). Does this mean that python 2.x extension wheels now can and should declare whether they're assuming the 16- or 32-bit Unicode ABI inside the abi field? And if so, should PEP 513 be updated to allow for both options to be used with manylinux1? (Right now manylinux1 just implies/requires a UCS4 build, for older pythons where this matters.)
-n
On Fri, Feb 5, 2016 at 12:46 PM, Nathaniel Smith
On Feb 5, 2016 8:47 AM, "Nate Coraor"
wrote: [...]
- Add SOABI tags to platform-specific wheels built for Python 2.X (Pull Request #55, Issue #63, Issue #101)
I can't quite untangle all the documents linked from this PR, so let me ask here :-). Does this mean that python 2.x extension wheels now can and should declare whether they're assuming the 16- or 32-bit Unicode ABI inside the abi field? And if so, should PEP 513 be updated to allow for both options to be used with manylinux1? (Right now manylinux1 just implies/requires a UCS4 build, for older pythons where this matters.)
-n
It isn't declared, wheel determines the ABI of the interpreter upon which the wheel is being built and tags it accordingly. So yes, I think a PEP 513 update is appropriate. As to whether the manylinux1 Docker images should include UCS-2 Pythons is a separate question, though. If there's interest, I can provide statistics of how many of Galaxy's UCS-2 Linux eggs were downloaded over time. --nate
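A minimal sketch of how such a tag could be derived from interpreter properties. The function name `sketch_abi_tag` and its parameters are hypothetical, chosen for illustration; this is not wheel's actual code. It assumes the PEP 3149-style flag letters (`d` for debug, `m` for pymalloc, `u` for wide Unicode) and takes the properties as arguments so that both build variants can be tried from any interpreter.

```python
def sketch_abi_tag(major, minor, maxunicode, pydebug=False, pymalloc=True):
    """Build an ABI tag such as 'cp27mu' for a CPython 2.x interpreter.

    Hypothetical helper: every property is passed in explicitly instead of
    being read from the running interpreter.
    """
    flags = ""
    if pydebug:
        flags += "d"
    if pymalloc:
        flags += "m"
    # The narrow/wide distinction only exists before Python 3.3.
    # A wide (UCS-4) build reports sys.maxunicode == 0x10FFFF,
    # a narrow (UCS-2) build reports 0xFFFF.
    if (major, minor) < (3, 3) and maxunicode == 0x10FFFF:
        flags += "u"
    return "cp%d%d%s" % (major, minor, flags)


print(sketch_abi_tag(2, 7, 0x10FFFF))  # wide build
print(sketch_abi_tag(2, 7, 0xFFFF))    # narrow build
```

On a real interpreter the third argument would come from `sys.maxunicode`; the two calls above yield `cp27mu` and `cp27m` respectively.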
On Feb 5, 2016 9:54 AM, "Nate Coraor"
On Fri, Feb 5, 2016 at 12:46 PM, Nathaniel Smith
wrote: On Feb 5, 2016 8:47 AM, "Nate Coraor"
wrote: [...]
- Add SOABI tags to platform-specific wheels built for Python 2.X (Pull Request #55, Issue #63, Issue #101)
I can't quite untangle all the documents linked from this PR, so let me ask here :-). Does this mean that python 2.x extension wheels now can and should declare whether they're assuming the 16- or 32-bit Unicode ABI inside the abi field? And if so, should PEP 513 be updated to allow for both options to be used with manylinux1? (Right now manylinux1 just implies/requires a UCS4 build, for older pythons where this matters.)
-n
It isn't declared, wheel determines the ABI of the interpreter upon which the wheel is being built and tags it accordingly. So yes, I think a PEP 513 update is appropriate. As to whether the manylinux1 Docker images should include UCS-2 Pythons is a separate question, though. If there's interest, I can provide statistics of how many of Galaxy's UCS-2 Linux eggs were downloaded over time.
My assumption was that we should include the UCS2 option in the docker image so that we could build some wheels so that we could put them on pypi so that we could get some statistics on usage so that we could decide whether it was worth including in the docker image ;-). Anyway, yes, I at least would be interested in seeing these statistics :-)
-n
On 6 February 2016 at 03:53, Nate Coraor
On Fri, Feb 5, 2016 at 12:46 PM, Nathaniel Smith
wrote: On Feb 5, 2016 8:47 AM, "Nate Coraor"
wrote: [...]
- Add SOABI tags to platform-specific wheels built for Python 2.X (Pull Request #55, Issue #63, Issue #101)
I can't quite untangle all the documents linked from this PR, so let me ask here :-). Does this mean that python 2.x extension wheels now can and should declare whether they're assuming the 16- or 32-bit Unicode ABI inside the abi field? And if so, should PEP 513 be updated to allow for both options to be used with manylinux1? (Right now manylinux1 just implies/requires a UCS4 build, for older pythons where this matters.)
It isn't declared, wheel determines the ABI of the interpreter upon which the wheel is being built and tags it accordingly. So yes, I think a PEP 513 update is appropriate.
+1 from me, since it's a genuine bug in the current specification.
As to whether the manylinux1 Docker images should include UCS-2 Pythons is a separate question, though. If there's interest, I can provide statistics of how many of Galaxy's UCS-2 Linux eggs were downloaded over time.
While I'd be interested in those stats, my initial inclination is to say "No" to including narrow Unicode runtimes in the build environment, as:
1. Python 2.7 narrow Unicode builds really don't handle code points > 65,535 correctly
2. Python 3.3+ doesn't have the narrow/wide distinction
3. Canopy users will presumably be getting most of their binaries from Enthought, not PyPI
That means the only folks that seem likely to miss out on pre-built binaries this way would be Python 2.7 pyenv users.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
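To illustrate the first point: a narrow build stores text as 16-bit units, so a code point outside the Basic Multilingual Plane (above U+FFFF) has to be split into a surrogate pair, and operations such as `len()` and indexing then see two items instead of one. The sketch below performs the standard UTF-16 split by hand; the helper name `utf16_code_units` is made up for this example.

```python
def utf16_code_units(code_point):
    """Return the 16-bit code units UTF-16 uses to store one code point."""
    if code_point < 0x10000:
        # Inside the Basic Multilingual Plane: one unit is enough.
        return [code_point]
    # Outside the BMP: split the 20-bit offset into a surrogate pair.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return [high, low]


# U+1F600 needs two units, which is why a narrow Python 2.7 build
# reports a length of 2 for a string holding that single code point.
print([hex(unit) for unit in utf16_code_units(0x1F600)])
```

This prints `['0xd83d', '0xde00']`.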
On Sat, Feb 06, 2016 at 03:04:48PM +1000, Nick Coghlan wrote:
On 6 February 2016 at 03:53, Nate Coraor
wrote: On Fri, Feb 5, 2016 at 12:46 PM, Nathaniel Smith
wrote: On Feb 5, 2016 8:47 AM, "Nate Coraor"
wrote: [...]
- Add SOABI tags to platform-specific wheels built for Python 2.X (Pull Request #55, Issue #63, Issue #101)
I can't quite untangle all the documents linked from this PR, so let me ask here :-). Does this mean that python 2.x extension wheels now can and should declare whether they're assuming the 16- or 32-bit Unicode ABI inside the abi field? And if so, should PEP 513 be updated to allow for both options to be used with manylinux1? (Right now manylinux1 just implies/requires a UCS4 build, for older pythons where this matters.)
It isn't declared, wheel determines the ABI of the interpreter upon which the wheel is being built and tags it accordingly. So yes, I think a PEP 513 update is appropriate.
+1 from me, since it's a genuine bug in the current specification.
As to whether the manylinux1 Docker images should include UCS-2 Pythons is a separate question, though. If there's interest, I can provide statistics of how many of Galaxy's UCS-2 Linux eggs were downloaded over time.
While I'd be interested in those stats, my initial inclination is to say "No" to including narrow Unicode runtimes in the build environment, as:
1. Python 2.7 narrow Unicode builds really don't handle code points > 65,535 correctly
2. Python 3.3+ doesn't have the narrow/wide distinction
3. Canopy users will presumably be getting most of their binaries from Enthought, not PyPI
That means the only folks that seem likely to miss out on pre-built binaries this way would be Python 2.7 pyenv users.
And people who build Python 2.7 with './configure && make && make install'
Why does upstream Python default to UCS-2 builds on Linux anyway?
FWIW the rationale Pyenv gave when they rejected a bug asking for UCS-4 builds by default was "we prefer to follow upstream defaults".
Marius Gedminas
-- Some people around here wouldn't recognize subtlety if it hit them on the head.
On 6 February 2016 at 20:35, Marius Gedminas
On Sat, Feb 06, 2016 at 03:04:48PM +1000, Nick Coghlan wrote:
That means the only folks that seem likely to miss out on pre-built binaries this way would be Python 2.7 pyenv users.
And people who build Python 2.7 with './configure && make && make install'
If folks can handle building their own Python, handling building other projects isn't that much worse (although stumbling across FORTRAN dependencies may still be a surprise).
Why does upstream Python default to UCS-2 builds on Linux anyway?
That default long predates my time on the core development team, but my guess is that it was influenced by that also being the default for Windows and the JVM, before folks were really aware of the problems that arise when using UTF-16 as the internal encoding for working with code points outside the Basic Multilingual Plane. By the time that perspective changed, the fix was to eliminate the distinction (and significantly reduce the memory cost of correctness), rather than to just change the default.
FWIW the rationale Pyenv gave when they rejected a bug asking for UCS-4 builds by default was "we prefer to follow upstream defaults".
In this case, the old defaults are dubious, but the upstream fix eliminated the relevant setting. Historically, it didn't really matter, since very few people were building their own Python for Linux.
However, if that was pyenv's only reason for rejecting a switch to wide unicode builds, it may be worth trying again, this time pointing them to PEP 513 and the wide-build default for Python 2.7 wheels in the manylinux build environment.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sat, Feb 06, 2016 at 09:18:39PM +1000, Nick Coghlan wrote:
On 6 February 2016 at 20:35, Marius Gedminas
wrote: FWIW the rationale Pyenv gave when they rejected a bug asking for UCS-4 builds by default was "we prefer to follow upstream defaults".
In this case, the old defaults are dubious, but the upstream fix eliminated the relevant setting. Historically, it didn't really matter, since very few people were building their own Python for Linux.
However, if that was pyenv's only reason for rejecting a switch to wide unicode builds, it may be worth trying again, this time pointing them to PEP 513 and the wide-build default for Python 2.7 wheels in the manylinux build environment.
Here's the issue, if you'd like to try: https://github.com/yyuu/pyenv/issues/257
(I don't use pyenv myself; all I know about this issue is from helping other people debug problems on IRC.)
Marius Gedminas
-- Tilton's Law of Lisp Programming: if you do not need a metaclass, do not use a metaclass.
On 6 February 2016 at 21:26, Marius Gedminas
On Sat, Feb 06, 2016 at 09:18:39PM +1000, Nick Coghlan wrote:
On 6 February 2016 at 20:35, Marius Gedminas
wrote: FWIW the rationale Pyenv gave when they rejected a bug asking for UCS-4 builds by default was "we prefer to follow upstream defaults".
In this case, the old defaults are dubious, but the upstream fix eliminated the relevant setting. Historically, it didn't really matter, since very few people were building their own Python for Linux.
However, if that was pyenv's only reason for rejecting a switch to wide unicode builds, it may be worth trying again, this time pointing them to PEP 513 and the wide-build default for Python 2.7 wheels in the manylinux build environment.
Here's the issue, if you'd like to try: https://github.com/yyuu/pyenv/issues/257
(I don't use pyenv myself; all I know about this issue is from helping other people debug problems on IRC.)
The issue has been reopened: https://github.com/yyuu/pyenv/issues/257#issuecomment-181076545
However, they're still going to have a potential compatibility problem to deal with, since extensions built against a narrow Python build won't run against a wide one. As such, putting a narrow Python 2.7 build into the build environment and encouraging folks creating Python 2.7 wheels to upload both variants may still be a preferable option.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
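A rough sketch of why uploading both variants helps, assuming the PEP 427 filename layout (name-version[-build]-python-abi-platform.whl). With the ABI tag in the filename, an installer can pick the wheel whose ABI matches the running interpreter and skip the other. The helper names and the example filenames are hypothetical, and this is not pip's actual selection logic.

```python
def wheel_abi_tags(filename):
    """Extract the ABI tag(s) from a wheel filename laid out per PEP 427."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    # name-version[-build]-python-abi-platform: ABI is second to last.
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: %r" % filename)
    # A compressed tag set joins several tags with dots.
    return parts[-2].split(".")


def abi_matches(filename, interpreter_abi):
    """True if the wheel declares the interpreter's ABI (or no ABI at all)."""
    tags = wheel_abi_tags(filename)
    return interpreter_abi in tags or "none" in tags


# Hypothetical uploads of one project built on wide and narrow Python 2.7.
wide = "example-1.0-cp27-cp27mu-manylinux1_x86_64.whl"
narrow = "example-1.0-cp27-cp27m-manylinux1_x86_64.whl"
print(abi_matches(wide, "cp27mu"), abi_matches(narrow, "cp27mu"))
```

For a wide interpreter (`cp27mu`) this prints `True False`, so only the wide wheel would be considered.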
FWIW, we've seen a large shift in our userbase from UCS-2 to UCS-4 as Anaconda Python becomes the de facto Python 2 interpreter in the sciences. We still ship both UCS-2 and UCS-4 as well.
-Brian
On 02/06/2016 05:35 AM, Marius Gedminas wrote:
And people who build Python 2.7 with './configure && make && make install'
Why does upstream Python default to UCS-2 builds on Linux anyway?
I don't recall if it had any bearing on the choice of default, but long-running processes with large quantities of mostly-8-bit-compatible text strings in RAM (Zope, nearly any other Eurocentric webapp) need measurably less memory with UCS-2.
Tres.
-- Tres Seaver +1 540-429-0999 tseaver@palladion.com Palladion Software "Excellence by Design" http://palladion.com
On 8 February 2016 at 09:07, Tres Seaver
On 02/06/2016 05:35 AM, Marius Gedminas wrote:
And people who build Python 2.7 with './configure && make && make install'
Why does upstream Python default to UCS-2 builds on Linux anyway?
I don't recall if it had any bearing on the choice of default, but long-running processes with large quantities of mostly-8-bit-compatible text strings in RAM (Zope, nearly any other Eurocentric webapp) need measurably less memory with UCS-2.
They can also end up being a bit faster, since most of their strings are smaller, and hence less data copying is needed.
That's why Python 3 ended up switching to the combination of adaptive bit width sizing for str instances and non-contiguous storage for io.StringIO: individual strings use a bit width based on the largest code point they contain, while io.StringIO's non-contiguous storage means that if you avoid calling getvalue(), only the segments that actually contain higher code points need to use the higher bit widths.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
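The adaptive sizing for str can be observed on Python 3.3 or later. In the sketch below, `expected_width` restates the rule (1, 2 or 4 bytes per character, depending on the largest code point), and `measured_width` estimates the same figure from how `sys.getsizeof` grows with string length. Both helper names are invented for this example, and the measurement relies on CPython's string layout, so treat it as an observation rather than a guarantee.

```python
import sys


def expected_width(text):
    """Bytes per character implied by the largest code point in text."""
    largest = max(map(ord, text)) if text else 0
    if largest < 0x100:
        return 1
    if largest < 0x10000:
        return 2
    return 4


def measured_width(char, count=1000):
    """Estimate bytes per character from sys.getsizeof growth (CPython)."""
    small = sys.getsizeof(char * count)
    large = sys.getsizeof(char * (2 * count))
    # The fixed object header cancels out in the difference.
    return (large - small) // count


for sample in ("a", "\u20ac", "\U0001F600"):
    print(hex(ord(sample)), expected_width(sample), measured_width(sample))
```

A string of mostly 8-bit-compatible text therefore costs one byte per character, which recovers the memory advantage of the old narrow builds without their correctness problems.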
participants (6)
- Brian Cole
- Marius Gedminas
- Nate Coraor
- Nathaniel Smith
- Nick Coghlan
- Tres Seaver