I've received some enthusiastic emails from someone who wants to
revive restricted mode. He started out with a bunch of patches to the
CPython runtime using ctypes, which he attached to an App Engine bug:
http://code.google.com/p/googleappengine/issues/detail?id=671
Based on his code (the file secure.py is all you need, included in
secure.tar.gz) it seems he believes the only security leaks are
__subclasses__, gi_frame and gi_code. (I have since convinced him that
if we add "restricted" guards to these attributes, he doesn't need the
functions added to sys.)
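(For context, a minimal sketch of the kind of hole __subclasses__ opens;
this is illustrative only, not code taken from secure.py:

    # From any reachable object, walk up to `object` and back down to every
    # class loaded in the interpreter -- far more than a sandbox intends to expose.
    reachable = ().__class__.__bases__[0].__subclasses__()
    print(len(reachable))

gi_frame is similar in spirit: a generator hands back its frame, and through
f_globals/f_builtins the full global namespace.)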
I don't recall the exploits that Samuele once posted that caused the
death of rexec.py -- does anyone recall, or have a pointer to the
threads?
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Alright, I will re-submit with the contents pasted. I never use double
backquotes as I think them rather ugly; that is the work of an editor
or some automated program in the chain. It also messed up my line
formatting, and now I have lines with only one word on them... Anyway,
here are the contents of PEP 3145:
PEP: 3145
Title: Asynchronous I/O For subprocess.Popen
Author: (James) Eric Pruitt, Charles R. McCreary, Josiah Carlson
Type: Standards Track
Content-Type: text/plain
Created: 04-Aug-2009
Python-Version: 3.2
Abstract:
In its present form, the subprocess.Popen implementation is prone to
deadlocking and to blocking the parent Python script while waiting on data
from the child process.
Motivation:
A search for "python asynchronous subprocess" will turn up numerous
accounts of people wanting to execute a child process and communicate with
it from time to time, reading only the data that is available instead of
blocking to wait for the program to produce data [1] [2] [3]. The current
behavior of the subprocess module is that when a user sends or receives
data via the stdin, stderr and stdout file objects, deadlocks are common
and documented [4] [5]. While communicate() can be used to alleviate some of
the buffering issues, it will still cause the parent process to block while
attempting to read data when none is available to be read from the child
process.
Rationale:
There is a documented need for asynchronous, non-blocking functionality in
subprocess.Popen [6] [7] [2] [3]. Inclusion of the code would improve the
utility of the Python standard library on both Unix-based and Windows
builds of Python. Practically every I/O object in Python has a
file-like wrapper of some sort. Sockets already act as such, and for
strings there is StringIO. Popen can be made to act like a file by simply
using the methods attached to the subprocess.Popen.stderr, stdout and
stdin file-like objects. But when using the read and write methods of
those objects, you do not have the benefit of asynchronous I/O. In the
proposed solution the wrapper wraps the asynchronous methods to mimic a
file object.
Reference Implementation:
I have been maintaining a Google Code repository that contains all of my
changes, including tests and documentation [9], as well as a blog detailing
the problems I have come across in the development process [10].
I have been working on implementing non-blocking asynchronous I/O in the
subprocess.Popen module as well as a wrapper class for subprocess.Popen
that makes it so that an executed process can take the place of a file by
duplicating all of the methods and attributes that file objects have.
There are two base functions that have been added to the subprocess.Popen
class: Popen.send and Popen._recv, each with two separate implementations,
one for Windows and one for Unix-based systems. The Windows
implementation uses ctypes to access the functions needed to control pipes
in the kernel32 DLL in an asynchronous manner. On Unix-based systems,
the Python interface for file control serves the same purpose. The
different implementations of Popen.send and Popen._recv have identical
arguments to make code that uses these functions work across multiple
platforms.
The Popen._recv function requires the pipe name to be passed as an
argument, so the Popen.recv function exists to select stdout as the pipe
for Popen._recv by default. Popen.recv_err selects stderr as the pipe by
default. "Popen.recv" and "Popen.recv_err" are much easier to read and
understand than "Popen._recv('stdout' ..." and "Popen._recv('stderr' ..."
respectively.
Since the Popen._recv function does not wait on data to be produced
before returning a value, it may return empty bytes. Popen.asyncread
handles this issue by returning all data read over a given time
interval.
The ProcessIOWrapper class uses the asyncread and asyncwrite functions to
allow a process to act like a file so that there are no blocking issues
that can arise from using the stdout and stdin file objects produced from
a subprocess.Popen call.
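To make the intended usage concrete, here is a hypothetical sketch using the
names described above; the argument names and the ProcessIOWrapper constructor
are assumptions, not taken from the reference implementation [9]:

    import subprocess

    p = subprocess.Popen(['cat'], stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    p.send(b'hello\n')            # write to the child without blocking
    out = p.recv()                # whatever stdout data is available, possibly b''
    err = p.recv_err()            # same, but for stderr
    more = p.asyncread(1.0)       # all data readable within roughly one second

    # ProcessIOWrapper presents the child process as a file-like object.
    f = ProcessIOWrapper(p)       # constructor signature assumed
    f.write(b'more input\n')
    chunk = f.read()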
References:
[1] [ python-Feature Requests-1191964 ] asynchronous Subprocess
http://mail.python.org/pipermail/python-bugs-list/2006-December/036524.html
[2] Daily Life in an Ivory Basement : /feb-07/problems-with-subprocess
http://ivory.idyll.org/blog/feb-07/problems-with-subprocess
[3] How can I run an external command asynchronously from Python? - Stack
Overflow
http://stackoverflow.com/questions/636561/how-can-i-run-an-external-command-asynchronously-from-python
[4] 18.1. subprocess - Subprocess management - Python v2.6.2 documentation
http://docs.python.org/library/subprocess.html#subprocess.Popen.wait
[5] 18.1. subprocess - Subprocess management - Python v2.6.2 documentation
http://docs.python.org/library/subprocess.html#subprocess.Popen.kill
[6] Issue 1191964: asynchronous Subprocess - Python tracker
http://bugs.python.org/issue1191964
[7] Module to allow Asynchronous subprocess use on Windows and Posix
platforms - ActiveState Code
http://code.activestate.com/recipes/440554/
[8] subprocess.rst - subprocdev - Project Hosting on Google Code
http://code.google.com/p/subprocdev/source/browse/doc/subprocess.rst?spec=s…
[9] subprocdev - Project Hosting on Google Code
http://code.google.com/p/subprocdev
[10] Python Subprocess Dev
http://subdev.blogspot.com/
Copyright:
This PEP is licensed under the Open Publication License;
http://www.opencontent.org/openpub/.
On Tue, Sep 8, 2009 at 22:56, Benjamin Peterson <benjamin(a)python.org> wrote:
> 2009/9/7 Eric Pruitt <eric.pruitt(a)gmail.com>:
>> Hello all,
>>
>> I have been working on adding asynchronous I/O to the Python
>> subprocess module as part of my Google Summer of Code project. Now
>> that I have finished documenting and pruning the code, I present PEP
>> 3145 for its inclusion into the Python core code. Any and all feedback
>> on the PEP (http://www.python.org/dev/peps/pep-3145/) is appreciated.
>
> Hi Eric,
> One of the reasons you're not getting many responses is that you've not
> pasted the contents of the PEP in this message. Pasting it makes it really
> easy for people to comment on various sections.
>
> BTW, it seems like you were trying to use reST formatting with the
> text PEP layout. Double backquotes only mean something in reST.
>
>
> --
> Regards,
> Benjamin
>
Hello everyone.
I see several problems with the two hex-conversion function pairs that
Python offers:
1. binascii.hexlify and binascii.unhexlify
2. bytes.fromhex and bytes.hex
Problem #1:
bytes.hex is not implemented, although it was specified in PEP 358.
This means there is no symmetrical function to accompany bytes.fromhex.
Problem #2:
Both pairs perform the same function, although The Zen of Python suggests
that "There should be one-- and preferably only one --obvious way to do it."
I do not understand why PEP 358 specified the bytes function pair even
though it mentioned the binascii pair...
Problem #3:
bytes.fromhex may receive spaces in the input string, whereas
binascii.unhexlify may not.
I see no good reason for these two functions to have different features.
Problem #4:
binascii.unhexlify may receive both input types: strings or bytes, whereas
bytes.fromhex raises an exception when given a bytes parameter.
Again there is no reason for these functions to be different.
Problem #5:
binascii.hexlify returns a bytes type - although ideally, converting to hex
should always return string types and converting from hex should always
return bytes.
IMO bytes makes no sense as the output of hexlify, since the output is a
textual representation of other bytes.
This is also the behavior suggested for bytes.hex in PEP 358.
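To make these asymmetries concrete, here is a small sketch of the current
behavior described above (Python 3.0/3.1-era; the exact exception types are
from memory):

    import binascii

    binascii.hexlify(b'\x01\xff')   # b'01ff' -- bytes out, not str (problem #5)
    binascii.unhexlify('01ff')      # str input accepted (problem #4)
    bytes.fromhex('01 ff')          # b'\x01\xff' -- spaces accepted (problem #3)

    try:
        binascii.unhexlify('01 ff') # spaces rejected here (problem #3)
    except binascii.Error:
        pass
    try:
        bytes.fromhex(b'01ff')      # bytes input rejected (problem #4)
    except TypeError:
        pass
    # and bytes.hex() simply does not exist yet (problem #1)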
Problems #4 and #5 call for a decision about the input and output of the
functions being discussed:
Option A : Strict input and output
unhexlify (and bytes.fromhex) may only receive strings and may only return
bytes
hexlify (and bytes.hex) may only receive bytes and may only return strings
Option B : Robust input and strict output
unhexlify (and bytes.fromhex) may receive bytes or strings and may only
return bytes
hexlify (and bytes.hex) may receive bytes or strings and may only return
strings
Of course we may also consider a third option, which would allow the return
type of all functions to be robust (perhaps specified in a keyword argument),
but as I wrote in the description of problem #5, I see no sense in that.
Note that PEP 3137 describes: "... the more strict definitions of encoding
and decoding in
Python 3000: encoding always takes a Unicode string and returns a bytes
sequence, and decoding
always takes a bytes sequence and returns a Unicode string." - suggesting
option A.
To repeat problems #4 and #5, the current behavior does not match any
option:
* The return type of binascii.hexlify should be string, and this is not the
current behavior.
As for the input:
* Option A is not the current behavior because binascii.unhexlify may
receive both input types.
* Option B is not the current behavior because bytes.fromhex does not allow
bytes as input.
To fix these issues, three changes should be applied:
1. Deprecate bytes.fromhex. This fixes the following problems:
#4 (go with option B and remove the function that does not allow bytes
input)
#2 (the binascii functions will be the only way to "do it")
#1 (bytes.hex should not be implemented)
2. In order to keep the functionality that bytes.fromhex has over unhexlify,
the latter function should be able to handle spaces in its input (fix #3)
3. binascii.hexlify should return a string as its return type (fix #5)
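Under these three changes, the intended behavior would look roughly like this
(hypothetical, since none of it is implemented yet):

    import binascii

    binascii.unhexlify('01 ff')     # would return b'\x01\xff': spaces allowed (change 2)
    binascii.hexlify(b'\x01\xff')   # would return '01ff' as a str (change 3)
    # bytes.fromhex would emit a DeprecationWarning (change 1); bytes.hex stays out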
I propose the following PEP for inclusion to Python 3.1.
Please comment.
Regards,
Martin
Abstract
========
Namespace packages are a mechanism for splitting a single Python
package across multiple directories on disk. In current Python
versions, an algorithm to compute the package's __path__ must be
formulated. With the enhancement proposed here, the import machinery
itself will construct the list of directories that make up the
package.
Terminology
===========
Within this PEP, the term package refers to Python packages as defined
by Python's import statement. The term distribution refers to
separately installable sets of Python modules as stored in the Python
Package Index, and installed by distutils or setuptools. The term
vendor package refers to groups of files installed by an operating
system's packaging mechanism (e.g. Debian or Red Hat packages installed
on Linux systems).
The term portion refers to a set of files in a single directory (possibly
stored in a zip file) that contribute to a namespace package.
Namespace packages today
========================
Python currently provides pkgutil.extend_path to denote a package as
a namespace package. The recommended way of using it is to put::

  from pkgutil import extend_path
  __path__ = extend_path(__path__, __name__)

in the package's ``__init__.py``. Every distribution needs to provide
the same contents in its ``__init__.py``, so that extend_path is
invoked independently of which portion of the package gets imported
first. As a consequence, the package's ``__init__.py`` cannot
practically define any names, since which portion is imported first
depends on the order of the package fragments on sys.path. As a special
feature, extend_path reads files named ``*.pkg``, which allow additional
portions to be declared.
setuptools provides a similar function, pkg_resources.declare_namespace,
that is used in the form::

  import pkg_resources
  pkg_resources.declare_namespace(__name__)

In the portion's __init__.py, no assignment to __path__ is necessary,
as declare_namespace modifies the package __path__ through sys.modules.
As a special feature, declare_namespace also supports zip files, and
registers the package name internally so that future additions to sys.path
by setuptools can properly add additional portions to each package.
setuptools allows declaring namespace packages in a distribution's
setup.py, so that distribution developers don't need to put the
magic __path__ modification into __init__.py themselves.
Rationale
=========
The current imperative approach to namespace packages has led to
multiple slightly-incompatible mechanisms for providing namespace
packages. For example, pkgutil supports ``*.pkg`` files; setuptools
doesn't. Likewise, setuptools supports inspecting zip files, and
supports adding portions to its _namespace_packages variable, whereas
pkgutil doesn't.
In addition, the current approach causes problems for system vendors.
Vendor packages typically must not provide overlapping files, and an
attempt to install a vendor package that has a file already on disk
will fail or cause unpredictable behavior. As vendors might choose to
package distributions such that they will end up all in a single
directory for the namespace package, all portions would contribute
conflicting __init__.py files.
Specification
=============
Rather than using an imperative mechanism for importing packages, a
declarative approach is proposed here, as an extension to the existing
``*.pkg`` mechanism.
The import statement is extended so that it directly considers ``*.pkg``
files during import; a directory is considered a package if it either
contains a file named __init__.py, or a file whose name ends with
".pkg".
In addition, the format of the ``*.pkg`` file is extended: a line with
the single character ``*`` indicates that the entire sys.path will
be searched for portions of the namespace package at the time the
namespace package is imported.
Importing a package will immediately compute the package's __path__;
the ``*.pkg`` files are not considered anymore after the initial import.
If a ``*.pkg`` file contains an asterisk, this asterisk is prepended
to the package's __path__ to indicate that the package is a namespace
package (and thus that further extensions to sys.path might also
want to extend __path__). At most one such asterisk gets prepended
to the path.
extend_path will be extended to recognize namespace packages according
to this PEP, and avoid adding directories twice to __path__.
No other change to the importing mechanism is made; searching
modules (including __init__.py) will continue to stop at the first
module encountered.
Discussion
==========
With the addition of ``*.pkg`` files to the import mechanism, namespace
packages can stop filling out the namespace package's __init__.py.
As a consequence, extend_path and declare_namespace become obsolete.
It is recommended that distributions put a file <distribution>.pkg
into their namespace packages, with a single asterisk. This allows
vendor packages to install multiple portions of a namespace package
into a single directory, with no risk of overlapping files.
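For illustration, a vendor package for a hypothetical distribution foo.bar
could then lay out its files as::

  foo/
      foo.bar.pkg        (a single line containing "*")
      bar/
          __init__.py    (ordinary code of the foo.bar subpackage)

The directory foo contains no __init__.py; it is recognized as a package
solely because it contains a file whose name ends in ".pkg".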
Namespace packages can start providing non-trivial __init__.py
implementations; to do so, it is recommended that a single distribution
provides a portion with just the namespace package's __init__.py
(and potentially other modules that belong to the namespace package
proper).
The mechanism is mostly compatible with the existing namespace
mechanisms. extend_path will be adjusted to this specification;
any other mechanism might cause portions to get added twice to
__path__.
Copyright
=========
This document has been placed in the public domain.
There's a lot of code already out there (in the standard library and
other places) that uses %-style formatting, when in Python 3.0 we
should be encouraging {}-style formatting. We should really provide
some sort of transition plan. Consider an example from the logging
docs:
logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
We'd like to support both this style as well as the following style:
logging.Formatter("{asctime} - {name} - {levelname} - {message}")
Perhaps we'd eventually deprecate the %-style formatting, but at least
in the intervening time, we'd have to support both. I see a few
possibilities:
* Add a new class, NewFormatter, which uses the {}-style formatting.
This is ugly because it makes the formatting we're trying to encourage
look like the alternative implementation instead of the standard one.
It also means we have to come up with new names for every API that
uses format strings.
* Have Formatter try to guess whether it got %-style formatting or
{}-style formatting, and then delegate to the appropriate one (a rough
sketch of this follows the list). I don't know how accurately we can
guess. We also end up still relying on both formatting styles under the
hood.
* Have Formatter convert all %-style formatting strings to {}-style
formatting strings (automatically). This still involves some guessing,
and involves some serious hacking to translate from one to the other
(maybe it wouldn't even always be possible?). But at least we'd only
be using {}-style formatting under the hood.
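For the guessing option, a rough sketch of what I have in mind (hypothetical;
the real heuristic and the handling of exceptions/asctime would need more
care):

    import logging

    class GuessingFormatter(logging.Formatter):
        """Delegate to %-style or {}-style based on the format string."""

        def __init__(self, fmt=None, datefmt=None):
            logging.Formatter.__init__(self, fmt, datefmt)
            # Crude guess: {}-style if there's a '{' but no '%(' reference.
            self._use_braces = bool(fmt) and '{' in fmt and '%(' not in fmt

        def format(self, record):
            if not self._use_braces:
                return logging.Formatter.format(self, record)
            record.message = record.getMessage()
            if '{asctime}' in self._fmt:
                record.asctime = self.formatTime(record, self.datefmt)
            return self._fmt.format(**record.__dict__)

    # Both of these would then work:
    GuessingFormatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
    GuessingFormatter("{asctime} - {name} - {levelname} - {message}")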
Thoughts?
Steve
--
Where did you get that preposterous hypothesis?
Did Steve tell you that?
--- The Hiphopopotamus
On Sun, Sep 27, 2009 at 8:41 PM, Benjamin Peterson <benjamin(a)python.org> wrote:
> 2009/9/27 Steven Bethard <steven.bethard(a)gmail.com>:
>> The first release where any real deprecation message would show up is
>> Python 3.4, more than 3 years away. If you think 3 years isn't long
>> enough for people to be over the Python 3 transition, let's stick in
>> another version in there and make it 4.5 years.
>
> So, why even bother deprecating it if nobody is going to see the warnings?
I feel like I'm repeating the PEP, but here it is again anyway. There
will be messages in the docs and pending deprecation warnings (which
don't show up by default but can be requested) starting in Python 2.7
and 3.2. Regular deprecation warnings wouldn't show up until Python
3.4, 3 years away. This compromise was intended exactly to address the
issue you brought up about people getting over the Python 3
transition.
Steve
--
Where did you get that preposterous hypothesis?
Did Steve tell you that?
--- The Hiphopopotamus
Dear Pythonistas:
This issue causes serious problems. Users occasionally get binaries built for a
compatible Linux and Python version but with a different UCS2-vs-UCS4 setting,
and those users get mysterious memory corruption errors which are hard to
diagnose. It is possible that these situations also open up security
vulnerabilities. A couple such instances are documented on
http://bugs.python.org/setuptools/issue78, but you can find more by googling.
I would like to get this problem fixed!
In order to help address this issue I sampled what UCS size is used by python
executables in the wild. I instrumented a few buildslaves that are contributed
by various people to the Tahoe-LAFS project to print out their platform, python
version, and sys.maxunicode. The full results are appended below. maxunicode:
1114111 means that the python executable was configured with
--enable-unicode=ucs4, and maxunicode: 65535 means that the python executable
was configured with --enable-unicode=ucs2 or just with --enable-unicode. The
only incompatibilities that I found are because some packagers have
deliberately set the UCS4 configuration and other packagers have left the
default setting.
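(For reference, the probe amounted to something like this small sketch; the
exact script on the buildslaves may have differed slightly:)

    import platform, sys
    print("%s %s: python: %s, maxunicode: %d" % (
        platform.system(), platform.release(),
        sys.version.replace("\n", " "), sys.maxunicode))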
In the three cases where someone configured python with UCS2, one of the three
is certainly an accident (a custom-built python executable on an Ubuntu server)
and the other two just use the default instead of specifically configuring UCS2.
I suspect that they don't know the difference and that it was an accident that
they built a Python incompatible with other distributions of their operating
system.
In sum, while it would be good to add the unicode setting to the platform's ABI
(as discussed in setuptools ticket #78), it would also be good to make
the default
value be UCS4 instead of UCS2. This would fix all three of the potential
incompatibilities that I found (listed below), and once we have proper inclusion
of the unicode setting in the ABI in order to prevent the memory corruption,
defaulting to UCS4 would increase the likelihood that a binary built on one
distribution would be usable on another.
I'm sure that someone can come up with a reason why UCS2 is better than UCS4,
but I'm also sure that the benefits of compatibility outweigh any benefits of
UCS2 encoding, and that the widespread use of UCS4 demonstrates that there is
nothing fatally wrong with it, and that people who really value UCS2 encoding
more than compatibility can choose that for themselves by explicitly
setting UCS2.
Let me restate that I am not suggesting taking away anyone's options, only
making the setting for people who don't specify default to the
compatible option.
Hm, I guess that means that it should default to UCS2 on Windows and Mac and
to UCS4 on Linux and Solaris.
Regards,
Zooko
Ubuntu 6.10 "edgy" i386: python: 2.4.4c1 (#2, Mar 7 2008, 03:03:38) [GCC 4.1.2
20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)], maxunicode: 1114111
Ubuntu 7.04 "feisty": python: 2.5.1 (r251:54863, Jul 31 2008, 22:53:39) [GCC
4.1.2 (Ubuntu 4.1.2-0ubuntu4)], maxunicode: 1114111
Ubuntu 7.10 "gutsy" i386: python: 2.5.1 (r251:54863, Jul 31 2008, 23:17:40)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)], maxunicode: 1114111
Ubuntu 8.04 "hardy" amd64: python: 2.5.2 (r252:60911, Jul 22 2009, 15:33:10)
[GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)], maxunicode: 1114111
Ubuntu 8.04 "hardy" i386: *custom* python: 2.6 (r26:66714, Oct 2 2008,
13:40:28) [GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)], maxunicode: 65535
Ubuntu 8.04 "hardy" i386: python: 2.5.2 (r252:60911, Jul 22 2009, 15:35:03)
[GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)], maxunicode: 1114111
Ubuntu 9.04 "jaunty" amd64: *custom* python: 2.6.2 (release26-maint, Apr 19
2009, 01:58:18) [GCC 4.3.3], maxunicode: 1114111
Debian 4.0 "etch" i386: python: 2.4.4 (#2, Oct 22 2008, 19:52:44) [GCC 4.1.2
20061115 (prerelease) (Debian 4.1.1-21)], maxunicode: 1114111
Debian 5.0 "lenny" i386: python: 2.5.2 (r252:60911, Jan 4 2009, 17:40:26) [GCC
4.3.2], maxunicode: 1114111
Debian 5.0 "lenny" amd64: python: 2.5.2 (r252:60911, Jan 4 2009, 21:59:32)
[GCC 4.3.2], maxunicode: 1114111
Debian 5.0 "lenny" armv5tel: python: 2.5.2 (r252:60911, Jan 5 2009, 02:00:00)
[GCC 4.3.2], maxunicode: 1114111
Debian unstable "squeeze/sid" i386: python: 2.5.4 (r254:67916, Feb 17 2009,
20:16:45) [GCC 4.3.3], maxunicode: 1114111
Fedora 11 "leonidas" amd64: python: 2.6 (r26:66714, Jul 4 2009, 17:37:13) [GCC
4.4.0 20090506 (Red Hat 4.4.0-4)], maxunicode: 1114111
ArchLinux: python: 2.6.2 (r262:71600, Jul 20 2009, 02:23:30) [GCC 4.4.0
20090630 (prerelease)], maxunicode: 65535
NetBSD 4: python: 2.5.2 (r252:60911, Mar 20 2009, 14:00:07) [GCC 4.1.2 20060628
prerelease (NetBSD nb2 20060711)], maxunicode: 65535
OpenSolaris SunOS-5.11-i86pc-i386-32bit: python: 2.4.4 (#1, Mar 10 2009,
09:35:36) [C], maxunicode: 65535
Nexenta NCP1 SunOS-5.11-i86pc-i386-32bit: python: 2.4.3 (#2, May 3 2006,
19:12:42) [GCC 4.0.3 (GNU_OpenSolaris 4.0.3-1nexenta4)], maxunicode: 1114111
Mac OS 10.6 "snow leopard" i386: python: 2.6.1 (r261:67515, Jul 7 2009,
23:51:51) [GCC 4.2.1 (Apple Inc. build 5646)], maxunicode: 65535
Mac OS 10.5 "leopard" i386: python: 2.5.1 (r251:54863, Feb 6 2009, 19:02:12)
[GCC 4.0.1 (Apple Inc. build 5465)], maxunicode: 65535
Mac OS 10.4 "tiger" *custom* python: 2.5.4 (release25-maint:72153M, Apr 30 2009,
12:28:20) [GCC 4.0.1 (Apple Computer, Inc. build 5367)], maxunicode: 65535
Cygwin CYGWIN_NT-5.1-1.5.25-0.156-4-2-i686-32bit-WindowsPE: python: 2.5.2
(r252:60911, Dec 2 2008, 09:26:14) [GCC 3.4.4 (cygming special, gdc 0.12,
using dmd 0.125)], maxunicode: 65535
Windows: python: 2.6.2 (r262:71600, Apr 21 2009, 15:05:37) [MSC v.1500 32 bit
(Intel)], maxunicode: 65535
Windows: python: 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit
(Intel)], maxunicode: 65535
Zooko O'Whielacronx wrote:
> Folks:
>
> I'm sorry, I think I didn't make my concern clear. My users, and lots
> of other users, are having a problem with incompatibility between
> Python binary extension modules. One way to improve the situation
> would be if the Python devs would use their "bully pulpit" -- their
> unique position as a source respected by all Linux distributions --
> and say "We recommend that Linux distributions use UCS4 for
> compatibility with one another". This would not abrogate anyone's
> ability to choose their preferred setting nor, as far as I can tell,
> would it interfere with the ongoing development of Python.
-1
Please note that we did not choose to ship Python as UCS4 binary
on Linux - the Linux distributions did.
The Python default is UCS2 for a good reason: it's a good trade-off
between memory consumption, functionality and performance.
As already mentioned, I also don't understand how changing
the Python default on Linux would help your users in any way -
if you let distutils compile your extensions, it's automatically
going to use the right Unicode setting for you (as well as your
users).
Unfortunately, this automatic support doesn't help you when
shipping e.g. setuptools eggs, but this is a tool problem,
not one of Python: setuptools completely ignores the fact
that there are two ways to build Python.
I'd suggest you ask the tool maintainers to adjust their tools
to support the Python Unicode option.
> Here are the details:
>
> I'm the maintainer of several Python packages. I work hard to make it
> easy for users, even users who don't know anything about Python, to
> use my software. There have been many pain points in this process and
> I've spent a lot of time on it for about three years now working on
> packaging, including the tools such as setuptools and distutils and
> the new "distribute" tool. Python packaging has been improving during
> these years -- things are looking up.
>
> One of the remaining pain points is that I can distribute binaries of
> my Python extension modules for Windows or Mac, but if I distribute a
> binary Python extension module on Linux, then if the user has a
> different UCS2/UCS4 setting then they won't be able to use the
> extension module. The current de facto standard for Linux is UCS4 --
> it is used by Debian, Ubuntu, Fedora, RHEL, OpenSuSE, etc. etc.. The
> vast majority of Linux users in practice have UCS4, and most binary
> Python modules are compiled for UCS4.
>
> That means that a few folks will get left out. Those folks, from my
> experience, are people who built their python executable themselves
> without specifying an override for the default, and the smaller Linux
> distributions who insist on doing whatever upstream Python devs
> recommend instead of doing whatever the other Linux distros are doing.
> One of the data points that I reported was a Python interpreter that
> was built locally on an Ubuntu server. Since the person building it
> didn't know to override the default setting of --enable-unicode, he
> ended up with a Python interpreter built for UCS2, even though all the
> Python extension modules shipped by Ubuntu were built with UCS4.
People building their own Python version will usually also build
their own extensions, so I don't really believe that the above
scenario is very common.
Also note that Python will complain loudly when you try to load
a UCS2 extension in a UCS4 build and vice-versa. We've made sure
that any extension using the Python Unicode C API has to be built
for the same UCS version of Python. This is done by using different
names for the C APIs at the C level.
> These are not isolated incidents. The following google searches
> suggest that a number of people spend time trying to figure out why
> Python extension modules fail on their linux systems:
>
> http://www.google.com/search?q=PyUnicodeUCS4_FromUnicode+undefined+symbol
> http://www.google.com/search?q=+PyUnicodeUCS2_FromUnicode+undefined+symbol
> http://www.google.com/search?q=_PyUnicodeUCS2_AsDefaultEncodedString+undefi…
Perhaps we should add a FAQ entry for these linker errors
(which are caused by the mentioned C API changes to prevent
mixing UCS versions)?!
Here's a quick way to determine your Python Unicode build type:
python -c "import sys;print((sys.maxunicode<66000)and'UCS2'or'UCS4')"
Perhaps we should include this info as well as a 32/64-bit indicator
and the processor type in the Python startup line:
# python
Python 2.6 (r26:66714, Feb 3 2009, 20:49:49, UCS4, 64-bit, x86_64)
[GCC 4.3.2 [gcc-4_3-branch revision 141291]] on linux2
Type "help", "copyright", "credits" or "license" for more information.
This would help users find the right binaries to install as
extensions.
> Another data point is the Mandriva Linux distribution. It is probably
> much smaller than Debian, Ubuntu, or RedHat, but it is still one of
> the major, well-known distributions. I requested of the Python
> maintainer for Mandriva, Michael Scherer, that they switch from UCS2
> to UCS4 in order to reduce compatibility problems like these. His
> answer as I understood it was that it is best to follow the
> recommendations of the upstream Python devs by using the default
> setting instead of choosing a setting for himself.
Which is IMHO what all Linux distributions should have done.
Distributions should really not be put in charge of upstream
coding design decisions.
Regards,
--
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Source (#1, Sep 28 2009)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________
::: Try our new mxODBC.Connect Python Database Interface for free ! ::::
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/
Tue Jul 3 10:14:17 CEST 2007, Guido van Rossum wrote:
> On 6/30/07, Matt Chisholm <matt-python at theory.org> wrote:
> > I've created and submitted a new PEP proposing support for labels in
> > Python's break and continue statements. Georg Brandl has graciously
> > added it to the PEP list as PEP 3136:
> >
> > http://www.python.org/dev/peps/pep-3136/
>
> I think this is a good summary of various proposals that have been
> floated in the past, plus some new ones. As a PEP, it falls short
> because it doesn't pick a solution but merely offers a large menu of
> possible options. Also, there is nothing about implementation yet.
>
> However, I'm rejecting it on the basis that code so complicated to
> require this feature is very rare. In most cases there are existing
> work-arounds that produce clean code, for example using 'return'.
I agree that this feature will only serve as a quick hack and in many
cases it would be misused and ugly code will be the result. However,
it seems that there are some shortcuts that have sneaked into python
(I am a fairly new Python programmer, only since late 2.4, so don't
shoot me). The specific one I am speaking of is:
while_stmt ::= "while" expression ":" suite
["else" ":" suite]
for_stmt ::= "for" target_list "in" expression_list ":" suite
["else" ":" suite]
try1_stmt ::= "try" ":" suite
("except" [expression ["as" target]] ":" suite)+
["else" ":" suite]
["finally" ":" suite]
All these else's seem like shortcuts to me. I did find a use for them,
once I found out they existed, but I get butterflies whenever I do.
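(For what it's worth, the use I found was the usual search idiom; a tiny
example, assuming nothing beyond the grammar quoted above:)

    # The else clause runs only when the loop finishes without hitting break.
    def find_config(paths):
        for path in paths:
            if path.endswith('.cfg'):
                print('using', path)
                break
        else:
            print('no config file found')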
--
Hatem Nassrat
At 07:00 PM 9/23/2009 +0200, Tarek Ziadé wrote:
>While it's great to have Philipp being part of our distutils design
>discussions,
>for his experience, I am not concerned of not having him in this "internal
>consensus" since Setuptools is not maintained anymore.
>
>He said some months ago, he would work on a brand new setuptools
>version with zero
>dependency to distutils. But it's still vaporware (from his own
>words), and the previous version is unmaintained for a year, so it was
>forked.
Here's what actually happened, if anyone cares. Tarek and friends
announced a fork of setuptools. I reviewed the work and saw that --
for the most part -- I was happy with it, and opined as to how I might
be willing to bless the "package inquisition" team as official
maintainers of the 0.6 branch of setuptools, so that I could work on
the fun bits I've long planned for 0.7, but never felt free to start
on while there was so much still needing to be done on 0.6.
However, just as I mentioned this, and suggested an option for what I
could do that would be helpful to his Distribute 0.7 project as well
as various other tools (e.g. implementing some of Jim Fulton's
long-requested features for better modularization of setuptools),
Tarek accused me of somehow trying to undermine his plans.
In addition, it appears Tarek was also offended by my earlier
statement that there were only a few people in the Python community
who had *already* earned my implicit trust to not only hack on
setuptools unsupervised, but also to take over its *future* direction
and BDFL-ship. (For example, Jim Fulton and Ian Bicking.)
Tarek, however, appears to have taken this to mean that I personally
thought he was an incompetent programmer or something (when I
actually had no opinion one way or the other), and ever since he has
taken to levelling potshots like the above at me on a semi-regular basis.
I've tried to ignore this and play nice, because he is actually
working on this stuff and I am not. But it's hard for me to actually
give any help in practice, if Tarek is too busy projecting hidden
plots onto everything I say and do.
If you read Tarek's distutils-sig posts, it appears my
already-existing trust in Ian and Jim was not only a personal insult
to Tarek, but also a plot to ensure that nobody with any time to do
so would ever work on setuptools, just as my excitement about working
on setuptools again was a plot to steal thunder from his fork.
All I want is for good stuff to happen for setuptools users and
Python users in general, so I don't think all the suspicion and
backbiting is merited. I certainly don't appreciate it, and I would
like it to stop. It also isn't even relevant to the thread, since my
lack of work on setuptools says exactly zero about the merits or lack
thereof of Tarek's proposals for the distutils!
Hell, I *support* the bulk of Tarek's setup.cfg proposal, and don't
even object to him Pronouncing it or cutting off the discussion! My
only issue on Python-Dev was his inaccurate implication that it was a
SIG consensus rather than a pronouncement on it. There is and was no
need for any of this to get personal, and I have continually strived
to keep my posts here and distutils-sig civil, even when I didn't
feel like being civil in response to Tarek's jabs. I have in fact
bent over backwards to be *nice* to Tarek, because he seemed so damn
sensitive about everything. Apparently, however, this does not
actually help things. :-(