Hello everyone!
We have been encountering several deadlocks in a threaded Python
application which calls subprocess.Popen (i.e. fork()) in some of its
threads.
This has occurred on Python 2.4.1 on a 2.4.27 Linux kernel.
Preliminary analysis of the hang shows that the child process blocks
upon entering the execvp function, in which the import_lock is acquired
due to the following line:
def _execvpe(file, args, env=None):
    from errno import ENOENT, ENOTDIR
    ...
It is a known problem that, when a pthreaded application calls fork(), any
attempt in the child to acquire a lock that another thread held at the time
of the fork will deadlock.
Given this, we were wondering whether it would be better to move the above
import out of the _execvpe call, to prevent lock acquisition attempts in
such cases.
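For illustration, a minimal sketch of what that first option might look like
in os.py (a sketch of the idea only, not a tested patch):

    from errno import ENOENT, ENOTDIR   # hoisted to module level, done once

    def _execvpe(file, args, env=None):
        # body otherwise unchanged; with the function-local
        # "from errno import ..." gone, the child never touches the
        # import machinery (or its lock) after fork()
        ...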
Another workaround could be re-assigning a new lock to import_lock
(such a thing is done with the global interpreter lock) at PyOS_AfterFork or
pthread_atfork.
We'd appreciate any opinions you might have on the subject.
Thanks in advance,
Yair and Rotem
On Wed, 10 Nov 2004, John P Speno wrote:
Hi, sorry for the delayed response.
> While using subprocess (aka popen5), I came across one potential gotcha. I've had
> exceptions ending like this:
>
> File "test.py", line 5, in test
> cmd = popen5.Popen(args, stdout=PIPE)
> File "popen5.py", line 577, in __init__
> data = os.read(errpipe_read, 1048576) # Exceptions limited to 1 MB
> OSError: [Errno 4] Interrupted system call
>
> (on Solaris 9)
>
> Would it make sense for subprocess to use a more robust read() function
> which can handle these cases, i.e. when the parent's read on the pipe
> to the child's stderr is interrupted by a system call, and returns EINTR?
> I imagine it could catch EINTR and EAGAIN and retry the failed read().
I assume you are using signals in your application? The os.read above is
not the only system call that can fail with EINTR. subprocess.py is full
of other system calls that can fail, and I suspect that many other Python
modules are as well.
I've made a patch (attached) to subprocess.py (and test_subprocess.py)
that should guard against EINTR, but I haven't committed it yet. It's
quite large.
Are Python modules supposed to handle EINTR? Why not let the C code handle
this? Or, perhaps the signal module should provide a sigaction function,
so that users can use SA_RESTART.
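For reference, the retry idiom that the patch below applies to each call
individually can also be written once as a generic helper; this is just a
sketch, and the helper name is mine rather than anything in the patch:

    import errno

    def _retry_on_eintr(func, *args):
        """Call func(*args), retrying for as long as it fails with EINTR."""
        while True:
            try:
                return func(*args)
            except (OSError, IOError), e:
                if e.errno != errno.EINTR:
                    raise

    # e.g.:  data = _retry_on_eintr(os.read, errpipe_read, 1048576)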
Index: subprocess.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/subprocess.py,v
retrieving revision 1.8
diff -u -r1.8 subprocess.py
--- subprocess.py 7 Nov 2004 14:30:34 -0000 1.8
+++ subprocess.py 17 Nov 2004 19:42:30 -0000
@@ -888,6 +888,50 @@
pass
+ def _read_no_intr(self, fd, buffersize):
+ """Like os.read, but retries on EINTR"""
+ while True:
+ try:
+ return os.read(fd, buffersize)
+ except OSError, e:
+ if e.errno == errno.EINTR:
+ continue
+ else:
+ raise
+
+
+ def _read_all(self, fd, buffersize):
+ """Like os.read, but retries on EINTR, and reads until EOF"""
+ all = ""
+ while True:
+ data = self._read_no_intr(fd, buffersize)
+ all += data
+ if data == "":
+ return all
+
+
+ def _write_no_intr(self, fd, s):
+ """Like os.write, but retries on EINTR"""
+ while True:
+ try:
+ return os.write(fd, s)
+ except OSError, e:
+ if e.errno == errno.EINTR:
+ continue
+ else:
+ raise
+
+ def _waitpid_no_intr(self, pid, options):
+ """Like os.waitpid, but retries on EINTR"""
+ while True:
+ try:
+ return os.waitpid(pid, options)
+ except OSError, e:
+ if e.errno == errno.EINTR:
+ continue
+ else:
+ raise
+
def _execute_child(self, args, executable, preexec_fn, close_fds,
cwd, env, universal_newlines,
startupinfo, creationflags, shell,
@@ -963,7 +1007,7 @@
exc_value,
tb)
exc_value.child_traceback = ''.join(exc_lines)
- os.write(errpipe_write, pickle.dumps(exc_value))
+ self._write_no_intr(errpipe_write, pickle.dumps(exc_value))
# This exitcode won't be reported to applications, so it
# really doesn't matter what we return.
@@ -979,7 +1023,7 @@
os.close(errwrite)
# Wait for exec to fail or succeed; possibly raising exception
- data = os.read(errpipe_read, 1048576) # Exceptions limited to 1 MB
+ data = self._read_all(errpipe_read, 1048576) # Exceptions limited to 1 MB
os.close(errpipe_read)
if data != "":
child_exception = pickle.loads(data)
@@ -1003,7 +1047,7 @@
attribute."""
if self.returncode == None:
try:
- pid, sts = os.waitpid(self.pid, os.WNOHANG)
+ pid, sts = self._waitpid_no_intr(self.pid, os.WNOHANG)
if pid == self.pid:
self._handle_exitstatus(sts)
except os.error:
@@ -1015,7 +1059,7 @@
"""Wait for child process to terminate. Returns returncode
attribute."""
if self.returncode == None:
- pid, sts = os.waitpid(self.pid, 0)
+ pid, sts = self._waitpid_no_intr(self.pid, 0)
self._handle_exitstatus(sts)
return self.returncode
@@ -1049,27 +1093,33 @@
stderr = []
while read_set or write_set:
- rlist, wlist, xlist = select.select(read_set, write_set, [])
+ try:
+ rlist, wlist, xlist = select.select(read_set, write_set, [])
+ except select.error, e:
+ if e[0] == errno.EINTR:
+ continue
+ else:
+ raise
if self.stdin in wlist:
# When select has indicated that the file is writable,
# we can write up to PIPE_BUF bytes without risk
# blocking. POSIX defines PIPE_BUF >= 512
- bytes_written = os.write(self.stdin.fileno(), input[:512])
+ bytes_written = self._write_no_intr(self.stdin.fileno(), input[:512])
input = input[bytes_written:]
if not input:
self.stdin.close()
write_set.remove(self.stdin)
if self.stdout in rlist:
- data = os.read(self.stdout.fileno(), 1024)
+ data = self._read_no_intr(self.stdout.fileno(), 1024)
if data == "":
self.stdout.close()
read_set.remove(self.stdout)
stdout.append(data)
if self.stderr in rlist:
- data = os.read(self.stderr.fileno(), 1024)
+ data = self._read_no_intr(self.stderr.fileno(), 1024)
if data == "":
self.stderr.close()
read_set.remove(self.stderr)
Index: test/test_subprocess.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/test/test_subprocess.py,v
retrieving revision 1.14
diff -u -r1.14 test_subprocess.py
--- test/test_subprocess.py 12 Nov 2004 15:51:48 -0000 1.14
+++ test/test_subprocess.py 17 Nov 2004 19:42:30 -0000
@@ -7,6 +7,7 @@
import tempfile
import time
import re
+import errno
mswindows = (sys.platform == "win32")
@@ -35,6 +36,16 @@
fname = tempfile.mktemp()
return os.open(fname, os.O_RDWR|os.O_CREAT), fname
+ def read_no_intr(self, obj):
+ while True:
+ try:
+ return obj.read()
+ except IOError, e:
+ if e.errno == errno.EINTR:
+ continue
+ else:
+ raise
+
#
# Generic tests
#
@@ -123,7 +134,7 @@
p = subprocess.Popen([sys.executable, "-c",
'import sys; sys.stdout.write("orange")'],
stdout=subprocess.PIPE)
- self.assertEqual(p.stdout.read(), "orange")
+ self.assertEqual(self.read_no_intr(p.stdout), "orange")
def test_stdout_filedes(self):
# stdout is set to open file descriptor
@@ -151,7 +162,7 @@
p = subprocess.Popen([sys.executable, "-c",
'import sys; sys.stderr.write("strawberry")'],
stderr=subprocess.PIPE)
- self.assertEqual(remove_stderr_debug_decorations(p.stderr.read()),
+ self.assertEqual(remove_stderr_debug_decorations(self.read_no_intr(p.stderr)),
"strawberry")
def test_stderr_filedes(self):
@@ -186,7 +197,7 @@
'sys.stderr.write("orange")'],
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT)
- output = p.stdout.read()
+ output = self.read_no_intr(p.stdout)
stripped = remove_stderr_debug_decorations(output)
self.assertEqual(stripped, "appleorange")
@@ -220,7 +231,7 @@
stdout=subprocess.PIPE,
cwd=tmpdir)
normcase = os.path.normcase
- self.assertEqual(normcase(p.stdout.read()), normcase(tmpdir))
+ self.assertEqual(normcase(self.read_no_intr(p.stdout)), normcase(tmpdir))
def test_env(self):
newenv = os.environ.copy()
@@ -230,7 +241,7 @@
'sys.stdout.write(os.getenv("FRUIT"))'],
stdout=subprocess.PIPE,
env=newenv)
- self.assertEqual(p.stdout.read(), "orange")
+ self.assertEqual(self.read_no_intr(p.stdout), "orange")
def test_communicate(self):
p = subprocess.Popen([sys.executable, "-c",
@@ -305,7 +316,8 @@
'sys.stdout.write("\\nline6");'],
stdout=subprocess.PIPE,
universal_newlines=1)
- stdout = p.stdout.read()
+
+ stdout = self.read_no_intr(p.stdout)
if hasattr(open, 'newlines'):
# Interpreter with universal newline support
self.assertEqual(stdout,
@@ -343,7 +355,7 @@
def test_no_leaking(self):
# Make sure we leak no resources
- max_handles = 1026 # too much for most UNIX systems
+ max_handles = 10 # too much for most UNIX systems
if mswindows:
max_handles = 65 # a full test is too slow on Windows
for i in range(max_handles):
@@ -424,7 +436,7 @@
'sys.stdout.write(os.getenv("FRUIT"))'],
stdout=subprocess.PIPE,
preexec_fn=lambda: os.putenv("FRUIT", "apple"))
- self.assertEqual(p.stdout.read(), "apple")
+ self.assertEqual(self.read_no_intr(p.stdout), "apple")
def test_args_string(self):
# args is a string
@@ -457,7 +469,7 @@
p = subprocess.Popen(["echo $FRUIT"], shell=1,
stdout=subprocess.PIPE,
env=newenv)
- self.assertEqual(p.stdout.read().strip(), "apple")
+ self.assertEqual(self.read_no_intr(p.stdout).strip(), "apple")
def test_shell_string(self):
# Run command through the shell (string)
@@ -466,7 +478,7 @@
p = subprocess.Popen("echo $FRUIT", shell=1,
stdout=subprocess.PIPE,
env=newenv)
- self.assertEqual(p.stdout.read().strip(), "apple")
+ self.assertEqual(self.read_no_intr(p.stdout).strip(), "apple")
def test_call_string(self):
# call() function with string argument on UNIX
@@ -525,7 +537,7 @@
p = subprocess.Popen(["set"], shell=1,
stdout=subprocess.PIPE,
env=newenv)
- self.assertNotEqual(p.stdout.read().find("physalis"), -1)
+ self.assertNotEqual(self.read_no_intr(p.stdout).find("physalis"), -1)
def test_shell_string(self):
# Run command through the shell (string)
@@ -534,7 +546,7 @@
p = subprocess.Popen("set", shell=1,
stdout=subprocess.PIPE,
env=newenv)
- self.assertNotEqual(p.stdout.read().find("physalis"), -1)
+ self.assertNotEqual(self.read_no_intr(p.stdout).find("physalis"), -1)
def test_call_string(self):
# call() function with string argument on Windows
/Peter Åstrand <astrand(a)lysator.liu.se>
During the PyCon sprint I tried to make BaseException accept only a single
argument and bind it to BaseException.message. I was successful (see the
p3yk_no_args_on_exc branch), but it was very painful to pull off, as anyone
who sat around me during the last three days of the sprint will tell you:
they had to listen to me curse incessantly.
Because of the pain that I went through in the transition and thus the
lessons learned, Guido and I discussed it and we think it would be best to
give up on forcing BaseException to accept only a single argument. I think
it is still doable, but requires a multi-release transition period and not
the one that 2.6 -> 3.0 is offering. And so Guido and I plan on deprecating
BaseException.message as its entire point in existence was to help
transition to what we are not going to have happen. =)
Now that means BaseException.message might hold the record for shortest
lived feature as it was only introduced in 2.5 and is now to be deprecated
in 2.6 and removed in 2.7/3.0. =)
Below is PEP 352, revised to reflect the removal of BaseException.message
and to let the 'args' attribute stay (along with suggesting that one should
only pass a single argument to BaseException). Basically the interface for
exceptions doesn't really change in 3.0 except for the removal of
__getitem__.
--------------------------------------------------------------------------
PEP: 352
Title: Required Superclass for Exceptions
Version: $Revision: 53592 $
Last-Modified: $Date: 2007-01-28 21:54:11 -0800 (Sun, 28 Jan 2007) $
Author: Brett Cannon <brett(a)python.org>
Guido van Rossum <guido(a)python.org>
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Oct-2005
Post-History:
Abstract
========
In Python 2.4 and before, any (classic) class can be raised as an
exception. The plan for 2.5 was to allow new-style classes, but this
makes the problem worse -- it would mean *any* class (or
instance) can be raised! This is a problem as it prevents any
guarantees from being made about the interface of exceptions.
This PEP proposes introducing a new superclass that all raised objects
must inherit from. Imposing the restriction will allow a standard
interface for exceptions to exist that can be relied upon. It also
leads to a known hierarchy for all exceptions to adhere to.
One might counter that requiring a specific base class for a
particular interface is unPythonic. However, in the specific case of
exceptions there's a good reason (which has generally been agreed to
on python-dev): requiring hierarchy helps code that wants to *catch*
exceptions by making it possible to catch *all* exceptions explicitly
by writing ``except BaseException:`` instead of
``except *:``. [#hierarchy-good]_
Introducing a new superclass for exceptions also gives us the chance
to rearrange the exception hierarchy slightly for the better. As it
currently stands, all exceptions in the built-in namespace inherit
from Exception. This is a problem since this includes two exceptions
(KeyboardInterrupt and SystemExit) that often need to be excepted from
the application's exception handling: the default behavior of shutting
the interpreter down without a traceback is usually more desirable than
whatever the application might do (with the possible exception of
applications that emulate Python's interactive command loop with
``>>>`` prompt). Changing it so that these two exceptions inherit
from the common superclass instead of Exception will make it easy for
people to write ``except`` clauses that are not overreaching and not
catch exceptions that should propagate up.
This PEP is based on previous work done for PEP 348 [#pep348]_.
Requiring a Common Superclass
=============================
This PEP proposes introducing a new exception named BaseException that
is a new-style class and has a single attribute, ``args``. Below
is the code as the exception will work in Python 3.0 (how it will
work in Python 2.x is covered in the `Transition Plan`_ section)::
    class BaseException(object):

        """Superclass representing the base of the exception hierarchy.

        Provides an 'args' attribute that contains all arguments passed
        to the constructor.  Suggested practice, though, is that only a
        single string argument be passed to the constructor.
        """

        def __init__(self, *args):
            """Set the 'args' attribute."""
            self.args = args

        def __str__(self):
            """Return the str of args[0] or args, depending on length."""
            if len(self.args) == 1:
                return str(self.args[0])
            else:
                return str(self.args)

        def __repr__(self):
            return "%s(*%s)" % (self.__class__.__name__, repr(self.args))
No restriction is placed upon what may be passed in for ``args``
for backwards-compatibility reasons. In practice, though, only
a single string argument should be used. This keeps the string
representation of the exception to be a useful message about the
exception that is human-readable; this is why the ``__str__`` method
special-cases on a length-1 ``args`` value. Programmatic
information (e.g., an error code number) should be stored as a
separate attribute in a subclass.
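As an illustration (this example is not part of the PEP's specification),
a subclass might carry an error code as its own attribute::

    class HostLookupError(Exception):

        """Hypothetical error raised when looking up a host fails."""

        def __init__(self, host, error_code):
            # a single human-readable string goes to the constructor ...
            Exception.__init__(self, "lookup of %r failed" % host)
            # ... while programmatic details live on their own attributes
            self.host = host
            self.error_code = error_code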
The ``raise`` statement will be changed to require that any object
passed to it must inherit from BaseException. This will make sure
that all exceptions fall within a single hierarchy that is anchored at
BaseException [#hierarchy-good]_. This also guarantees a basic
interface that is inherited from BaseException. The change to
``raise`` will be enforced starting in Python 3.0 (see the `Transition
Plan`_ below).
With BaseException being the root of the exception hierarchy,
Exception will now inherit from it.
Exception Hierarchy Changes
===========================
With the exception hierarchy now even more important since it has a
basic root, a change to the existing hierarchy is called for. As it
stands now, if one wants to catch all exceptions that signal an error
*and* do not mean the interpreter should be allowed to exit, you must
specify all but two exceptions specifically in an ``except`` clause
or catch the two exceptions separately and then re-raise them and
have all other exceptions fall through to a bare ``except`` clause::
    except (KeyboardInterrupt, SystemExit):
        raise
    except:
        ...
That is needlessly explicit. This PEP proposes moving
KeyboardInterrupt and SystemExit to inherit directly from
BaseException.
::
    - BaseException
      |- KeyboardInterrupt
      |- SystemExit
      |- Exception
         |- (all other current built-in exceptions)
Doing this makes catching Exception more reasonable. It would catch
only exceptions that signify errors. Exceptions that signal that the
interpreter should exit will not be caught and thus be allowed to
propagate up and allow the interpreter to terminate.
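For instance, a handler written against the new hierarchy (the functions
called here are placeholders, not part of the PEP)::

    try:
        process_request()                    # hypothetical application code
    except Exception, err:
        log("request failed: %s" % err)      # hypothetical logging helper
    # KeyboardInterrupt and SystemExit no longer derive from Exception,
    # so they propagate past this handler and let the interpreter exit.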
KeyboardInterrupt has been moved since users typically expect an
application to exit when they press the interrupt key (usually Ctrl-C).
If people have overly broad ``except`` clauses the expected behaviour
does not occur.
SystemExit has been moved for similar reasons. Since the exception is
raised when ``sys.exit()`` is called the interpreter should normally
be allowed to terminate. Unfortunately overly broad ``except``
clauses can prevent the explicitly requested exit from occurring.
To make sure that people catch Exception most of the time, various
parts of the documentation and tutorials will need to be updated to
strongly suggest that Exception be what programmers want to use. Bare
``except`` clauses or catching BaseException directly should be
discouraged based on the fact that KeyboardInterrupt and SystemExit
almost always should be allowed to propagate up.
Transition Plan
===============
Since semantic changes to Python are being proposed, a transition plan
is needed. The goal is to end up with the new semantics being used in
Python 3.0 while providing a smooth transition for 2.x code. All
deprecations mentioned in the plan will lead to the removal of the
semantics starting in the version following the initial deprecation.
Here is BaseException as implemented in the 2.x series::
    class BaseException(object):

        """Superclass representing the base of the exception hierarchy.

        The __getitem__ method is provided for backwards-compatibility
        and will be deprecated at some point.
        """

        def __init__(self, *args):
            """Set the 'args' attribute."""
            self.args = args

        def __str__(self):
            """Return the str of args[0] or args, depending on length."""
            return str(self.args[0]
                       if len(self.args) == 1
                       else self.args)

        def __repr__(self):
            func_args = repr(self.args) if self.args else "()"
            return self.__class__.__name__ + func_args

        def __getitem__(self, index):
            """Index into arguments passed in during instantiation.

            Provided for backwards-compatibility and will be
            deprecated.
            """
            return self.args[index]
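A quick interactive sketch of how the class above behaves (my example,
derived only from the code shown)::

    >>> e = BaseException("not found", 42)
    >>> e.args
    ('not found', 42)
    >>> str(e)
    "('not found', 42)"
    >>> e[0]    # __getitem__, kept only for backwards-compatibility
    'not found'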
Deprecation of features in Python 2.9 is optional. This is because it
is not known at this time if Python 2.9 (which is slated to be the
last version in the 2.x series) will actively deprecate features that
will not be in 3.0 . It is conceivable that no deprecation warnings
will be used in 2.9 since there could be such a difference between 2.9
and 3.0 that it would make 2.9 too "noisy" in terms of warnings. Thus
the proposed deprecation warnings for Python 2.9 will be revisited
when development of that version begins to determine if they are still
desired.
* Python 2.5 [done]
- all standard exceptions become new-style classes
- introduce BaseException
- Exception, KeyboardInterrupt, and SystemExit inherit from BaseException
- deprecate raising string exceptions
* Python 2.6
- deprecate catching string exceptions
- deprecate ``message`` attribute (see `Retracted Ideas`_)
* Python 2.7
- deprecate raising exceptions that do not inherit from BaseException
- remove ``message`` attribute
* Python 2.8
- deprecate catching exceptions that do not inherit from BaseException
* Python 2.9
- deprecate ``__getitem__`` (optional)
* Python 3.0 [done]
- drop everything that was deprecated above:
+ string exceptions (both raising and catching)
+ all exceptions must inherit from BaseException
+ drop ``__getitem__``
Retracted Ideas
===============
A previous version of this PEP that was implemented in Python 2.5
included a 'message' attribute on BaseException. Its purpose was to
begin a transition to BaseException accepting only a single argument.
This was to tighten the interface and to force people to use
attributes in subclasses to carry arbitrary information with an
exception instead of cramming it all into ``args``.
Unfortunately, while implementing the removal of the ``args``
attribute in Python 3.0 at the PyCon 2007 sprint
[#pycon2007-sprint-email]_, it was discovered that the transition was
very painful, especially for C extension modules. It was decided that
it would be better to deprecate the ``message`` attribute in
Python 2.6 (and remove in Python 2.7 and Python 3.0) and consider a
more long-term transition strategy in Python 3.0 to remove
multiple-argument support in BaseException in preference of accepting
only a single argument. Thus the introduction of ``message`` and the
original deprecation of ``args`` has been retracted.
References
==========
.. [#pep348] PEP 348 (Exception Reorganization for Python 3.0)
http://www.python.org/peps/pep-0348.html
.. [#hierarchy-good] python-dev Summary for 2004-08-01 through 2004-08-15
http://www.python.org/dev/summary/2004-08-01_2004-08-15.html#an-exception-i…
.. [#SF_1104669] SF patch #1104669 (new-style exceptions)
http://www.python.org/sf/1104669
.. [#pycon2007-sprint-email] python-3000 email ("How far to go with
cleaning up exceptions")
http://mail.python.org/pipermail/python-3000/2007-March/005911.html
Copyright
=========
This document has been placed in the public domain.
At 02:47 PM 2/24/2007 -0600, Tarek Ziadé wrote:
>I have created a setup.py file for distribution and I bumped into
>a small bug when I tried to set my name in the contact field (Tarek Ziadé)
>
>Using string (utf8 file):
>
>setup(
> maintainer="Tarek Ziadé"
>)
>
>leads to:
>
> File ".../lib/python2.5/distutils/command/register.py", line 162, in
> send_metadata
> auth)
> File ".../lib/python2.5/distutils/command/register.py", line 257, in
> post_to_server
> value = unicode(value).encode("utf-8")
>UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 10:
>ordinal not in range(128)
>
>
>Using unicode:
>
>setup(
> maintainer=u"Tarek Ziadé"
>)
>
>leads to:
>
> File ".../lib/python2.5/distutils/dist.py", line 1094, in write_pkg_file
> file.write('Author: %s\n' % self.get_contact() )
>UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in
>position 18: ordinal not in range(128)
>
>I would propose a patch for this problem but I don't know what would be
>the best input (I guess unicode for names)
At 05:45 PM 2/24/2007 -0500, Tres Seaver wrote:
>Don't you still need to tell Python about the encoding of your string
>literals [1] [2] ? E.g.::
That's not the problem, it's that the code that writes the PKG-INFO file
doesn't handle Unicode. See
distutils.dist.DistributionMetadata.write_pkg_info(). It needs to use a
file with encoding support, if it's doing unicode.
However, there's currently no standard, as far as I know, for what encoding
the PKG-INFO file should use. Meanwhile, the 'register' command accepts
Unicode, but is broken in handling it.
Essentially, the problem is that Python 2.5 broke this by adding a unicode
*requirement* to the "register" command. Previously, register simply sent
whatever you gave it, and the PKG-INFO writing code still
does. Unfortunately, this means that there is no longer any one value that
you can use for your name that will be accepted by both "register" and
anything that writes a PKG-INFO file.
Both register and write_pkg_info() are arguably broken here, and should be
able to work with either strings or unicode, and degrade gracefully in the
event of non-ASCII characters in a string. (Because even though "register"
is only run by the package's author, users may run other commands that
require a PKG-INFO, so a package prepared using Python <2.5 must still be
usable with Python 2.5 distutils, and Python <2.5 allows 8-bit maintainer
names.)
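A sketch of the kind of helper both code paths could share (my sketch; it
assumes UTF-8 as the output encoding, which as noted above is not actually
standardized for PKG-INFO):

    def _to_utf8(value):
        """Degrade gracefully: encode unicode, pass 8-bit strings through."""
        if isinstance(value, unicode):
            return value.encode("utf-8")
        return value

    # e.g. in DistributionMetadata.write_pkg_file():
    #     file.write('Author: %s\n' % _to_utf8(self.get_contact()))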
Unfortunately, this isn't fixable until there's a new 2.5.x release. For
previous Python versions, both register and write_pkg_info() accepted 8-bit
strings and passed them on as-is, so the only workaround for this issue at
the moment is to revert to Python 2.4 or less.
This may seem like it's coming out of left field for a minute, but
bear with me.
There is no doubt that Ruby's success is a concern for anyone who
sees it as diminishing Python's status. One of the reasons for
Ruby's success is certainly the notion (originally advocated by Bruce
Tate, if I'm not mistaken) that it is the "next Java" -- the language
and environment that mainstream Java developers are, or will, look to
as a natural next step.
One thing that would help Python in this "debate" (or, perhaps simply
put it in the running, at least as a "next Java" candidate) would be
if Python had an easier migration path for Java developers that
currently rely upon various third-party libraries. The wealth of
third-party libraries available for Java has always been one of its
great strengths. Ergo, if Python had an easy-to-use, recommended way
to use those libraries within the Python environment, that would be a
significant advantage to present to Java developers and those who
would choose Ruby over Java. Platform compatibility is always a huge
motivator for those looking to migrate or upgrade.
In that vein, I would point to JPype (http://jpype.sourceforge.net).
JPype is a module that gives "python programs full access to java
class libraries". My suggestion would be to either:
(a) include JPype in the standard library, or barring that,
(b) make a very strong push to support JPype
(a) might be difficult or cumbersome technically, as JPype does need
to build against Java headers, which may or may not be possible given
the way that Python is distributed, etc.
However, (b) is very feasible. I can't really say what "supporting
JPype" means exactly -- maybe GvR and/or other heavyweights in the
Python community make public statements regarding its existence and
functionality, maybe JPype gets a strong mention or placement on
python.org....all those details are obviously not up to me, and I
don't know the workings of the "official" Python organizations enough
to make serious suggestions.
Regardless of the form of support, I think raising people's awareness
of JPype and what it adds to the Python environment would be a Good
Thing (tm).
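For anyone who hasn't seen it, using JPype looks roughly like this (written
from memory of the current JPype API, so treat the exact names as
approximate):

    import jpype

    jpype.startJVM(jpype.getDefaultJVMPath(), "-ea")
    jpype.java.lang.System.out.println("hello from the JVM")
    jpype.shutdownJVM()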
For our part, we've used JPype to make PDFTextStream (our previously
Java-only PDF text extraction library) available and supported for
Python. You can read some about it here:
http://snowtide.com/PDFTextStream.Python
And I've blogged about how PDFTextStream.Python came about, and how
we worked with Steve Ménard, the maintainer of JPype, to make it all
happen (watch out for this URL wrapping):
http://blog.snowtide.com/2006/08/21/working-together-pythonjava-open-sourcecommercial
Cheers,
Chas Emerick
Founder, Snowtide Informatics Systems
Enterprise-class PDF content extraction
cemerick(a)snowtide.com
http://snowtide.com | +1 413.519.6365
Hi all,
I hope the cross-post is appropriate.
I've started playing with getting the pywin32 extensions building under
the AMD64 architecture. I started building with Visual Studio 8 (it was
what I had handy) and I struck a few issues relating to the compiler version
that I thought worth sharing.
* In trying to build x64 from a 32bit VS7 (ie, cross-compiling via the
PCBuild directory), the python.exe project fails with:
pythoncore fatal error LNK1112: module machine type 'X86' conflicts with
target machine type 'AMD64'
Is this a known issue, or am I doing something wrong?
* The PCBuild8 project files appear to work without modification (I only
tried native compilation here though, not a cross-compile) - however, unlike
the PCBuild directory, they place all binaries in a 'PCBuild8/x64'
directory. While this means that it's possible to build for multiple
architectures from the same source tree, it makes life harder for tools like
'distutils' - eg, distutils already knows about the 'PCBuild' directory, but
it knows nothing about either PCBuild8 or PCBuild8/x64.
A number of other build processes also know to look inside a PCBuild
directory (eg, Mozilla), so instead of formalizing PCBuild8, I think we
should merge PCBuild8 into PCBuild. This could mean PCBuild/vs7 and
PCBuild/vs8 directories with the "project" files, but binaries would still
be generated in the 'PCBuild' (or PCBuild/x64) directory. This would mean
the same tree isn't capable of hosting 2 builds from different VS compilers,
but I think that is reasonable (if it's a problem, just have multiple source
directories). I understand that PCBuild8 is not "official", but in the
assumption that future versions of Python will use a compiler later than
VS7, it makes sense to me to clean this up now - what are others' opinions on
this?
* Re the x64 directory used by the PCBuild8 process. IMO, it makes sense to
generate x64 binaries to their own directory - my expectation is that
cross-compiling between platforms is a reasonable use-case, and we should
support multiple architectures for the same compiler version. This would
mean formalizing the x64 directory in both 'PCBuild' and distutils, and
leaving other external build processes to update as they support x64 builds.
Does this make sense? Would this fatally break other scripts used for
packaging (eg, the MSI framework)?
* Wide characters in VS8: PC/pyconfig.h defines PY_UNICODE_TYPE as 'unsigned
short', which corresponds with both 'WCHAR' and 'wchar' in previous compiler
versions. VS8 defines this as wchar_t, which I'm struggling to find a
formal definition for beyond being 2 bytes. My problem is that code which
assumes a 'Py_UNICODE *' could be used in place of a 'WCHAR *' now fails. I
believe the intent on Windows has always been "Py_UNICODE == 'native
unicode'" - should PC/pyconfig.h reflect this (ie, should pyconfig.h grow a
version-specific definition of Py_UNICODE as wchar_t)?
* Finally, as something from left-field which may well take 12 months or
more to pull off - but would there be any interest to moving the Windows
build process to a cygwin environment based on the existing autoconf
scripts? I know a couple of projects are doing this successfully, including
Mozilla, so it has precedent. It does impose a greater burden on people
trying to build on Windows, but I'd suggest that in recent times, many
people who are likely to want to build Python on Windows are already likely
to have a cygwin environment. Simpler mingw builds and nuking MSVC specific
build stuff are among the advantages this would bring. It is not worth
adding this as "yet another windows build option" - so IMO it is only worth
progressing with if it became the "blessed" build process for windows - if
there is support for this, I'll work on it as the opportunity presents
itself...
I'm (obviously) only suggesting we do this on the trunk and am happy to make
all agreed changes - but I welcome all suggestions or criticisms of this
endeavour...
Cheers,
Mark
Phillip.eby wrote:
> Author: phillip.eby
> Date: Tue Apr 18 02:59:55 2006
> New Revision: 45510
>
> Modified:
> python/trunk/Lib/pkgutil.py
> python/trunk/Lib/pydoc.py
> Log:
> Second phase of refactoring for runpy, pkgutil, pydoc, and setuptools
> to share common PEP 302 support code, as described here:
>
> http://mail.python.org/pipermail/python-dev/2006-April/063724.html
Shouldn't this new module be named "pkglib" to be in line with
the naming scheme used for all the other utility modules, e.g. httplib,
imaplib, poplib, etc. ?
> pydoc now supports PEP 302 importers, by way of utility functions in
> pkgutil, such as 'walk_packages()'. It will properly document
> modules that are in zip files, and is backward compatible to Python
> 2.3 (setuptools installs for Python <2.5 will bundle it so pydoc
> doesn't break when used with eggs.)
Are you saying that the installation of setuptools in Python 2.3
and 2.4 will then overwrite the standard pydoc included with
those versions ?
I think that's the wrong way to go if not made an explicit
option in the installation process or a separate installation
altogether.
I'm bothered by the fact that installing setuptools actually changes
the standard Python installation by either overriding stdlib modules
or monkey-patching them at setuptools import time.
> What has not changed is that pydoc command line options do not support
> zip paths or other importer paths, and the webserver index does not
> support sys.meta_path. Those are probably okay as limitations.
>
> Tasks remaining: write docs and Misc/NEWS for pkgutil/pydoc changes,
> and update setuptools to use pkgutil wherever possible, then add it
> to the stdlib.
Add setuptools to the stdlib? I'm still missing the PEP for this,
along with the needed discussion touching, among other things, on
the change of the distutils standard "python setup.py install"
to install an egg instead of a site package.
--
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Source (#1, Apr 18 2006)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
Hi,
as I wrote in my previous email, I'm currently porting Python to some more
unusual platforms, namely to a super computer
(http://www.research.ibm.com/bluegene/) and a tiny embedded operating system
(http://ecos.sourceware.org), which, perhaps surprisingly, have quite
similar properties.
To do this, I added CMake files to Python, in order to use CMake to cross
compile Python to these platforms. CMake (http://www.cmake.org) is a
buildsystem in scope similar to autotools, but it's just one tool (instead of
a collection of tools) and it supports Windows and the MS compilers as first
class citizens, i.e. it can not only generate Makefiles, but also project
files for the various versions of Visual Studio, and also for XCode.
Attached you can find the files I had to add to get this working. With these
CMake files I was able to build python for eCos, BlueGene, Linux and Windows
(with Visual Studio 2003, but here I simply reused the existing pyconfig.h,
because I didn't want to spend too much time on this).
So for Linux the configure checks should be already quite good and almost
complete, for eCos and BlueGene they also work (both are UNIX-like), for
Windows there is probably some tweaking required.
So if anybody is interested in trying to use CMake for Python, you can find
the files attached. Version 2.4.5 of CMake or newer is required.
I guess I should mention that I'm doing this currently with the released
Python 2.5.1.
Bye
Alex
I discovered what appears to be a thread-unsafety in uuid.py. This is
in the trunk as well as in 3.x; I'm using the trunk here for easy
reference. There's some code around line 395:
import ctypes, ctypes.util
_buffer = ctypes.create_string_buffer(16)
This creates a *global* buffer which is used as the output parameter
to later calls to _uuid_generate_random() and _uuid_generate_time().
For example, around line 481, in uuid1():
_uuid_generate_time(_buffer)
return UUID(bytes=_buffer.raw)
Clearly if two threads do this simultaneously they are overwriting
_buffer in unpredictable order. There are a few other occurrences of
this too.
I find it somewhat disturbing that what seems a fairly innocent
function that doesn't *appear* to have global state is nevertheless
not thread-safe. Would it be wise to fix this, e.g. by allocating a
fresh output buffer inside uuid1() and other callers?
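Something along these lines, reusing the names uuid.py already defines
(_uuid_generate_time and UUID); sketch only, untested:

    import ctypes

    def uuid1(node=None, clock_seq=None):
        if _uuid_generate_time is not None and node is clock_seq is None:
            buf = ctypes.create_string_buffer(16)   # fresh per-call buffer
            _uuid_generate_time(buf)
            return UUID(bytes=buf.raw)
        # otherwise fall through to the existing pure-Python path (unchanged)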
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
I've been trying to install PIL 1.1.6 for several days now under Cygwin.
I have tried the Cygwin, Image and distutils lists with nary a pointer,
so I wondered whether python-dev might lead me to an answer (other than
"Stop using Cygwin" ;-)
Here's the output from a failed setup run with a couple of debug prints
inserted which should report how sub-process termination occurred - it
hangs after this output.
$ python setup.py build
running build
running build_py
running build_ext
building '_imaging' extension
SPAWN: ['gcc', '-fno-strict-aliasing', '-DNDEBUG', '-g', '-O3', '-Wall',
'-Wstrict-prototypes', '-DHAVE_LIBZ', '-IlibImaging', '-I/usr/include',
'-I/usr/include/python2.5', '-c', 'libImaging/Chops.c', '-o',
'build/temp.cygwin-1.5.24-i686-2.5/libImaging/Chops.o'] PATH? 1 V: 0 D:0
gcc -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes
-DHAVE_LIBZ -IlibImaging -I/usr/include -I/usr/include/python2.5 -c
libImaging/Chops.c -o build/temp.cygwin-1.5.24-i686-2.5/libImaging/Chops.o
Are we done yet? Waiting on pid 3280
I have to kill a python.exe process that's left running in the
background if I CTRL/C the build. Otherwise it will spend all night in a
tight loop. The compilation is finished when I kill the process, and the
next compile will begin if I repeat the command.
I extracted the _spawn_all function from distutils/spawn.py, hacked some
exceptions and ran it under command line control.
That worked fine with other subtasks, so I wondered whether this was a
failure specific to gcc, which would seem kind of unlikely. I ran the
same compile using my test function standing alone, seeing success:
sholden@bigboy ~/Imaging-1.1.6
$ python ~/Projects/Python/spawntest.py gcc -fno-strict-aliasing
-DNDEBUG -g -O3 -Wall -Wstrict-prototypes -DHAVE_LIBZ -IlibImaging
-I/usr/include -I/usr/include/python2.5 -c libImaging/Chops.c -o
build/temp.cygwin-1.5.24-i686-2.5/libImaging/Chops.o
Are we done yet? Waiting on pid 3244
Got pid, status 3244 0
Got WIFEXITED 0
So it appears unlikely to be gcc-specific, leaving me wondering what
exactly is the difference between the build environment and my tests.
It would be really nice if test_distutils showed any failures, but it
doesn't so any assistance would be welcome. At this point I can't even
replicate the failure in a simpler test :-(
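For completeness, the standalone test was essentially a stripped-down
version of the distutils spawn loop with the prints shown above added;
roughly this (my reconstruction, not the exact code):

    import os

    def spawn_and_wait(cmd):
        pid = os.fork()
        if pid == 0:                       # child: become the command
            os.execvp(cmd[0], cmd)
            os._exit(127)                  # only reached if exec itself fails
        print "Are we done yet? Waiting on pid", pid
        while True:
            got_pid, status = os.waitpid(pid, 0)
            print "Got pid, status", got_pid, status
            if os.WIFEXITED(status):
                print "Got WIFEXITED", os.WEXITSTATUS(status)
                return os.WEXITSTATUS(status)
            if os.WIFSIGNALED(status):
                raise RuntimeError("%r died on signal %d"
                                   % (cmd, os.WTERMSIG(status)))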
regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden