On Wed, 10 Nov 2004, John P Speno wrote:
Hi, sorry for the delayed response.
> While using subprocess (aka popen5), I came across one potential gotcha. I've had
> exceptions ending like this:
>
> File "test.py", line 5, in test
> cmd = popen5.Popen(args, stdout=PIPE)
> File "popen5.py", line 577, in __init__
> data = os.read(errpipe_read, 1048576) # Exceptions limited to 1 MB
> OSError: [Errno 4] Interrupted system call
>
> (on Solaris 9)
>
> Would it make sense for subprocess to use a more robust read() function
> which can handle these cases, i.e. when the parent's read on the pipe
> to the child's stderr is interrupted by a system call, and returns EINTR?
> I imagine it could catch EINTR and EAGAIN and retry the failed read().
I assume you are using signals in your application? The os.read above is
not the only system call that can fail with EINTR. subprocess.py is full
of other system calls that can fail, and I suspect that many other Python
modules are as well.
I've made a patch (attached) to subprocess.py (and test_subprocess.py)
that should guard against EINTR, but I haven't committed it yet. It's
quite large.
Are Python modules supposed to handle EINTR? Why not let the C code handle
this? Or, perhaps the signal module should provide a sigaction function,
so that users can use SA_RESTART.
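The retry pattern the patch applies to os.read, os.write and os.waitpid can be sketched as one generic helper (retry_on_eintr is a hypothetical name, shown in modern syntax; it is not part of subprocess):

```python
import errno
import os

def retry_on_eintr(func, *args):
    """Call func(*args), retrying for as long as it fails with EINTR.

    This is the pattern the patch below repeats for os.read,
    os.write and os.waitpid.
    """
    while True:
        try:
            return func(*args)
        except OSError as e:
            if e.errno != errno.EINTR:
                raise
```

Usage would look like ``retry_on_eintr(os.read, fd, 1024)`` in place of a bare ``os.read(fd, 1024)``.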
Index: subprocess.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/subprocess.py,v
retrieving revision 1.8
diff -u -r1.8 subprocess.py
--- subprocess.py 7 Nov 2004 14:30:34 -0000 1.8
+++ subprocess.py 17 Nov 2004 19:42:30 -0000
@@ -888,6 +888,50 @@
pass
+ def _read_no_intr(self, fd, buffersize):
+ """Like os.read, but retries on EINTR"""
+ while True:
+ try:
+ return os.read(fd, buffersize)
+ except OSError, e:
+ if e.errno == errno.EINTR:
+ continue
+ else:
+ raise
+
+
+ def _read_all(self, fd, buffersize):
+ """Like os.read, but retries on EINTR, and reads until EOF"""
+ all = ""
+ while True:
+ data = self._read_no_intr(fd, buffersize)
+ all += data
+ if data == "":
+ return all
+
+
+ def _write_no_intr(self, fd, s):
+ """Like os.write, but retries on EINTR"""
+ while True:
+ try:
+ return os.write(fd, s)
+ except OSError, e:
+ if e.errno == errno.EINTR:
+ continue
+ else:
+ raise
+
+ def _waitpid_no_intr(self, pid, options):
+ """Like os.waitpid, but retries on EINTR"""
+ while True:
+ try:
+ return os.waitpid(pid, options)
+ except OSError, e:
+ if e.errno == errno.EINTR:
+ continue
+ else:
+ raise
+
def _execute_child(self, args, executable, preexec_fn, close_fds,
cwd, env, universal_newlines,
startupinfo, creationflags, shell,
@@ -963,7 +1007,7 @@
exc_value,
tb)
exc_value.child_traceback = ''.join(exc_lines)
- os.write(errpipe_write, pickle.dumps(exc_value))
+ self._write_no_intr(errpipe_write, pickle.dumps(exc_value))
# This exitcode won't be reported to applications, so it
# really doesn't matter what we return.
@@ -979,7 +1023,7 @@
os.close(errwrite)
# Wait for exec to fail or succeed; possibly raising exception
- data = os.read(errpipe_read, 1048576) # Exceptions limited to 1 MB
+ data = self._read_all(errpipe_read, 1048576) # Exceptions limited to 1 MB
os.close(errpipe_read)
if data != "":
child_exception = pickle.loads(data)
@@ -1003,7 +1047,7 @@
attribute."""
if self.returncode == None:
try:
- pid, sts = os.waitpid(self.pid, os.WNOHANG)
+ pid, sts = self._waitpid_no_intr(self.pid, os.WNOHANG)
if pid == self.pid:
self._handle_exitstatus(sts)
except os.error:
@@ -1015,7 +1059,7 @@
"""Wait for child process to terminate. Returns returncode
attribute."""
if self.returncode == None:
- pid, sts = os.waitpid(self.pid, 0)
+ pid, sts = self._waitpid_no_intr(self.pid, 0)
self._handle_exitstatus(sts)
return self.returncode
@@ -1049,27 +1093,33 @@
stderr = []
while read_set or write_set:
- rlist, wlist, xlist = select.select(read_set, write_set, [])
+ try:
+ rlist, wlist, xlist = select.select(read_set, write_set, [])
+ except select.error, e:
+ if e[0] == errno.EINTR:
+ continue
+ else:
+ raise
if self.stdin in wlist:
# When select has indicated that the file is writable,
# we can write up to PIPE_BUF bytes without risk
# blocking. POSIX defines PIPE_BUF >= 512
- bytes_written = os.write(self.stdin.fileno(), input[:512])
+ bytes_written = self._write_no_intr(self.stdin.fileno(), input[:512])
input = input[bytes_written:]
if not input:
self.stdin.close()
write_set.remove(self.stdin)
if self.stdout in rlist:
- data = os.read(self.stdout.fileno(), 1024)
+ data = self._read_no_intr(self.stdout.fileno(), 1024)
if data == "":
self.stdout.close()
read_set.remove(self.stdout)
stdout.append(data)
if self.stderr in rlist:
- data = os.read(self.stderr.fileno(), 1024)
+ data = self._read_no_intr(self.stderr.fileno(), 1024)
if data == "":
self.stderr.close()
read_set.remove(self.stderr)
Index: test/test_subprocess.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/test/test_subprocess.py,v
retrieving revision 1.14
diff -u -r1.14 test_subprocess.py
--- test/test_subprocess.py 12 Nov 2004 15:51:48 -0000 1.14
+++ test/test_subprocess.py 17 Nov 2004 19:42:30 -0000
@@ -7,6 +7,7 @@
import tempfile
import time
import re
+import errno
mswindows = (sys.platform == "win32")
@@ -35,6 +36,16 @@
fname = tempfile.mktemp()
return os.open(fname, os.O_RDWR|os.O_CREAT), fname
+ def read_no_intr(self, obj):
+ while True:
+ try:
+ return obj.read()
+ except IOError, e:
+ if e.errno == errno.EINTR:
+ continue
+ else:
+ raise
+
#
# Generic tests
#
@@ -123,7 +134,7 @@
p = subprocess.Popen([sys.executable, "-c",
'import sys; sys.stdout.write("orange")'],
stdout=subprocess.PIPE)
- self.assertEqual(p.stdout.read(), "orange")
+ self.assertEqual(self.read_no_intr(p.stdout), "orange")
def test_stdout_filedes(self):
# stdout is set to open file descriptor
@@ -151,7 +162,7 @@
p = subprocess.Popen([sys.executable, "-c",
'import sys; sys.stderr.write("strawberry")'],
stderr=subprocess.PIPE)
- self.assertEqual(remove_stderr_debug_decorations(p.stderr.read()),
+ self.assertEqual(remove_stderr_debug_decorations(self.read_no_intr(p.stderr)),
"strawberry")
def test_stderr_filedes(self):
@@ -186,7 +197,7 @@
'sys.stderr.write("orange")'],
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT)
- output = p.stdout.read()
+ output = self.read_no_intr(p.stdout)
stripped = remove_stderr_debug_decorations(output)
self.assertEqual(stripped, "appleorange")
@@ -220,7 +231,7 @@
stdout=subprocess.PIPE,
cwd=tmpdir)
normcase = os.path.normcase
- self.assertEqual(normcase(p.stdout.read()), normcase(tmpdir))
+ self.assertEqual(normcase(self.read_no_intr(p.stdout)), normcase(tmpdir))
def test_env(self):
newenv = os.environ.copy()
@@ -230,7 +241,7 @@
'sys.stdout.write(os.getenv("FRUIT"))'],
stdout=subprocess.PIPE,
env=newenv)
- self.assertEqual(p.stdout.read(), "orange")
+ self.assertEqual(self.read_no_intr(p.stdout), "orange")
def test_communicate(self):
p = subprocess.Popen([sys.executable, "-c",
@@ -305,7 +316,8 @@
'sys.stdout.write("\\nline6");'],
stdout=subprocess.PIPE,
universal_newlines=1)
- stdout = p.stdout.read()
+
+ stdout = self.read_no_intr(p.stdout)
if hasattr(open, 'newlines'):
# Interpreter with universal newline support
self.assertEqual(stdout,
@@ -343,7 +355,7 @@
def test_no_leaking(self):
# Make sure we leak no resources
- max_handles = 1026 # too much for most UNIX systems
+ max_handles = 10 # too much for most UNIX systems
if mswindows:
max_handles = 65 # a full test is too slow on Windows
for i in range(max_handles):
@@ -424,7 +436,7 @@
'sys.stdout.write(os.getenv("FRUIT"))'],
stdout=subprocess.PIPE,
preexec_fn=lambda: os.putenv("FRUIT", "apple"))
- self.assertEqual(p.stdout.read(), "apple")
+ self.assertEqual(self.read_no_intr(p.stdout), "apple")
def test_args_string(self):
# args is a string
@@ -457,7 +469,7 @@
p = subprocess.Popen(["echo $FRUIT"], shell=1,
stdout=subprocess.PIPE,
env=newenv)
- self.assertEqual(p.stdout.read().strip(), "apple")
+ self.assertEqual(self.read_no_intr(p.stdout).strip(), "apple")
def test_shell_string(self):
# Run command through the shell (string)
@@ -466,7 +478,7 @@
p = subprocess.Popen("echo $FRUIT", shell=1,
stdout=subprocess.PIPE,
env=newenv)
- self.assertEqual(p.stdout.read().strip(), "apple")
+ self.assertEqual(self.read_no_intr(p.stdout).strip(), "apple")
def test_call_string(self):
# call() function with string argument on UNIX
@@ -525,7 +537,7 @@
p = subprocess.Popen(["set"], shell=1,
stdout=subprocess.PIPE,
env=newenv)
- self.assertNotEqual(p.stdout.read().find("physalis"), -1)
+ self.assertNotEqual(self.read_no_intr(p.stdout).find("physalis"), -1)
def test_shell_string(self):
# Run command through the shell (string)
@@ -534,7 +546,7 @@
p = subprocess.Popen("set", shell=1,
stdout=subprocess.PIPE,
env=newenv)
- self.assertNotEqual(p.stdout.read().find("physalis"), -1)
+ self.assertNotEqual(self.read_no_intr(p.stdout).find("physalis"), -1)
def test_call_string(self):
# call() function with string argument on Windows
/Peter Åstrand <astrand(a)lysator.liu.se>
Perhaps this is more appropriate for python-list, but it looks like a
bug to me. Example code:
class A:
def __str__(self):
return u'\u1234'
'%s' % u'\u1234' # this works
'%s' % A() # this doesn't work
It will work if 'A' subclasses 'unicode', but that should not be
necessary, IMHO. Any reason why this shouldn't be fixed?
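For comparison, in today's Python 3, where str is the unicode type, the analogous case does work: '%s' formatting calls __str__ on arbitrary objects, whether or not they subclass str.

```python
class A:
    def __str__(self):
        return '\u1234'

# '%s' invokes __str__ even though A does not subclass str.
assert '%s' % A() == '\u1234'
assert '%s' % '\u1234' == '\u1234'
```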
Neil
Hello
I've looked at one bug and a bunch of patches and
added a comment to them:
(bug) [ 1102649 ] pickle files should be opened in binary mode
Added a comment about a possible different solution
[ 946207 ] Non-blocking Socket Server
Useless, what are the mixins for? Recommend close
[ 756021 ] Allow socket.inet_aton('255.255.255.255') on Windows
Looks good but added suggestion about when to test for special case
[ 740827 ] add urldecode() method to urllib
I think it's better to group these things into urlparse
[ 579435 ] Shadow Password Support Module
Would be nice to have; I recently couldn't do the user
authentication that I wanted, based on the users' unix passwords
[ 1093468 ] socket leak in SocketServer
Trivial and looks harmless, but don't the sockets
get garbage collected once the request is done?
[ 1049151 ] adding bool support to xdrlib.py
Simple patch and 2.4 is out now, so...
It would be nice if somebody could have a look at my
own patches or help me a bit with them:
[ 1102879 ] Fix for 926423: socket timeouts + Ctrl-C don't play nice
[ 1103213 ] Adding the missing socket.recvall() method
[ 1103350 ] send/recv SEGMENT_SIZE should be used more in socketmodule
[ 1062014 ] fix for 764437 AF_UNIX socket special linux socket names
[ 1062060 ] fix for 1016880 urllib.urlretrieve silently truncates dwnld
Some of them come from the last Python Bug Day, see
http://www.python.org/moin/PythonBugDayStatus
Thank you !
Regards,
--Irmen de Jong
Hello,
just felt a little bored and tried to review a few (no-brainer) patches.
Here are the results:
* Patch #1051395
Minor fix in Lib/locale.py: Docs say that function _parse_localename
returns a tuple; but for one codepath it returns a list.
Patch fixes this by adding tuple(), recommending apply.
* Patch #1046831
Minor fix in Lib/distutils/sysconfig.py: it defines a function to
retrieve the Python version but does not use it everywhere; Patch
fixes this, recommending apply.
* Patch #751031
Adds recognizing JPEG-EXIF files (produced by digicams) to imghdr.py.
Recommending apply.
* Patch #712317
Fixes URL parsing in urlparse for URLs such as http://foo?bar. Splits
at '?', so assigns 'foo' to netloc and 'bar' to query instead of
assigning 'foo?bar' to netloc. Recommending apply.
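The splitting behavior that patch aims for can be checked against the modern urllib.parse (the Python 3 descendant of urlparse), which splits at '?' as described:

```python
from urllib.parse import urlsplit

parts = urlsplit("http://foo?bar")
assert parts.netloc == "foo"   # not "foo?bar"
assert parts.query == "bar"
```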
regards,
Reinhold
__str__ and __unicode__ seem to behave differently. A __str__
overwrite in a str subclass is used when calling str(), a __unicode__
overwrite in a unicode subclass is *not* used when calling unicode():
-------------------------------
class str2(str):
def __str__(self):
return "foo"
x = str2("bar")
print str(x)
class unicode2(unicode):
def __unicode__(self):
return u"foo"
x = unicode2(u"bar")
print unicode(x)
-------------------------------
This outputs:
foo
bar
IMHO this should be fixed so that __unicode__() is used in the
second case too.
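For reference, in today's Python 3 (where str is the unicode type) the override is honored, which matches the behavior argued for here:

```python
class str2(str):
    def __str__(self):
        return "foo"

# str() respects the __str__ override on a str subclass.
assert str(str2("bar")) == "foo"
```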
Bye,
Walter Dörwald
Hi.
[Mark Hammond]
> The point isn't about my suffering as such. The point is more that
> python-dev owns a tiny amount of the code out there, and I don't believe we
> should put Python's users through this.
>
> Sure - I would be happy to "upgrade" all the win32all code, no problem. I
> am also happy to live in the bleeding edge and take some pain that will
> cause.
>
> The issue is simply the user base, and giving Python a reputation of not
> being able to painlessly upgrade even dot revisions.
I agree with all this.
[As I imagined, explicit syntax did not catch on and would require a
lot of discussion.]
[GvR]
> > Another way is to use special rules
> > (similar to those for class defs), e.g. having
> >
> > <frag>
> > y=3
> > def f():
> > exec "y=2"
> > def g():
> > return y
> > return g()
> >
> > print f()
> > </frag>
> >
> > # print 3.
> >
> > Is that confusing for users? maybe they will more naturally expect 2
> > as outcome (given nested scopes).
>
> This seems the best compromise to me. It will lead to the least
> broken code, because this is the behavior that we had before nested
> scopes! It is also quite easy to implement given the current
> implementation, I believe.
>
> Maybe we could introduce a warning rather than an error for this
> situation though, because even if this behavior is clearly documented,
> it will still be confusing to some, so it is better if we outlaw it in
> some future version.
>
Yes, this would be easy to implement, but more confusing situations can arise:
<frag>
y=3
def f():
y=9
exec "y=2"
def g():
return y
return y,g()
print f()
</frag>
What should this print? The situation does not lead to a canonical
solution the way class-def scopes do.
or
<frag>
def f():
from foo import *
def g():
return y
return g()
print f()
</frag>
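One unambiguous pattern, sketched below, is to give exec an explicit namespace dictionary, so there is no question which scope the assignment lands in (this is also roughly the direction Python later settled on):

```python
def f():
    ns = {'y': 3}
    exec("y = 2", ns)       # assignment goes into ns, not f's locals
    def g():
        return ns['y']      # g reads from the explicit namespace
    return g()

assert f() == 2
```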
[Mark Hammond]
> > This probably won't be a very popular suggestion, but how about pulling
> > nested scopes (I assume they are at the root of the problem)
> > until this can be solved cleanly?
>
> Agreed. While I think nested scopes are kinda cool, I have lived without
> them, and really without missing them, for years. At the moment the cure
> appears worse then the symptoms in at least a few cases. If nothing else,
> it compromises the elegant simplicity of Python that drew me here in the
> first place!
>
> Assuming that people really _do_ want this feature, IMO the bar should be
> raised so there are _zero_ backward compatibility issues.
I won't say anything about pulling nested scopes (I don't think my opinion
can change things in this respect),
but I must insist that without explicit syntax, IMO, raising the bar
has too high an implementation cost (both performance and complexity)
or creates confusion.
[Andrew Kuchling]
> >Assuming that people really _do_ want this feature, IMO the bar should be
> >raised so there are _zero_ backward compatibility issues.
>
> Even at the cost of additional implementation complexity? At the cost
> of having to learn "scopes are nested, unless you do these two things
> in which case they're not"?
>
> Let's not waffle. If nested scopes are worth doing, they're worth
> breaking code. Either leave exec and from..import illegal, or back
> out nested scopes, or think of some better solution, but let's not
> introduce complicated backward compatibility hacks.
IMO breaking code would be OK if we issue warnings today and implement
nested scopes issuing errors tomorrow. But this is simply a statement
about principles and the impression it raises.
IMO import * in an inner scope should end up being an error,
not sure about 'exec's.
We will need a final BDFL statement.
regards, Samuele Pedroni.
Phillip J. Eby wrote (in
http://mail.python.org/pipermail/python-dev/2005-January/050854.html)
> * Classic class support is a must; exceptions are still required to be
> classic, and even if they weren't in 2.5, backward compatibility should be
> provided for at least one release.
The base of the Exception hierarchy happens to be a classic class.
But why are they "required" to be classic?
More to the point, is this a bug, a missing feature, or just a bug in
the documentation for not mentioning the restriction?
You can inherit from both Exception and object. (Though it turns out
you can't raise the result.) My first try with google failed to produce an
explanation -- and I'm still not sure I understand, beyond "it doesn't
happen to work at the moment." Neither the documentation nor the
tutorial mention this restriction.
http://docs.python.org/lib/module-exceptions.html
http://docs.python.org/tut/node10.html#SECTION0010500000000000000000
I didn't find any references to this restriction in exception.c. I did find
some code implying this in errors.c and ceval.c, but that wouldn't have
caught my eye if I weren't specifically looking for it *after* having just
read the discussion about (rejected) PEP 317.
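(For what it's worth, the restriction eventually went away when exceptions became new-style; in modern Python every exception class derives from both BaseException and object, and the combination raises fine:)

```python
class MyError(Exception):
    pass

# Exception subclasses are new-style classes, hence instances of object.
assert issubclass(MyError, object)

try:
    raise MyError("boom")
except MyError as e:
    caught = str(e)

assert caught == "boom"
```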
-jJ
A couple months ago I proposed (maybe in a SF bug report) that
time.strptime() grow some way to parse time strings containing fractional
seconds based on my experience with the logging module. I've hit that
stumbling block again, this time in parsing files with timestamps that were
generated using datetime.time objects. I hacked around it again (in
miserable fashion), but I really think this shortcoming should be addressed.
A couple possibilities come to mind:
1. Extend the %S format token to accept simple decimals that match
the re pattern "[0-9]+(?:\.[0-9]+)?".
2. Add a new token that accepts decimals as above to avoid overloading
the meaning of %S.
3. Add a token that matches integers corresponding to fractional parts.
The Perl DateTime module uses %N to match nanoseconds (wanna bet that
was added by a physicist?). Arbitrary other units can be specified
by sticking a number between the "%" and the "N". I didn't see an
example, but I presume "%6N" would match integers that are
interpreted as microseconds.
The advantage of the third choice is that you can use anything as the
"decimal" point. The logging module separates seconds from their fractional
part with a comma for some reason. (I live in the USofA where decimal
points are usually represented by a period. I would be in favor of
replacing the comma with a locale-specific decimal point in a future version
of the logging module.) I'm not sure I like the optional exponent thing in
Perl's DateTime module but it does make it easy to interpret integers
representing fractions of a second when they occur without a decimal point
to tell you where it is.
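As a point of reference, option 2 is roughly what later materialized: datetime.strptime grew a %f code for microseconds (time.strptime itself never did), which sidesteps overloading %S:

```python
from datetime import datetime

# %f matches 1-6 digits of fractional seconds, stored as microseconds.
dt = datetime.strptime("14:30:45.123456", "%H:%M:%S.%f")
assert dt.microsecond == 123456
```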
I'm open to suggestions and will be happy to implement whatever is agreed
to.
Skip
I thought it would be nice to try to improve the mimetypes module by having
it, on Windows, query the Registry to get the mapping of filename extensions
to media types, since the mimetypes code currently just blindly checks
posix-specific paths for httpd-style mapping files. However, it seems that the
way to get mappings from the Windows registry is excessively slow in Python.
I'm told that the reason has to do with the limited subset of APIs that are
exposed in the _winreg module. I think it is that EnumKey(key, index) is
querying for the entire list of subkeys for the given key every time you call
it. Or something. Whatever the situation is, the code I tried below is way
slower than I think it ought to be.
Does anyone have any suggestions (besides "write it in C")? Could _winreg
possibly be improved to provide an iterator or better interface to get the
subkeys? (or certain ones? There are a lot of keys under HKEY_CLASSES_ROOT,
and I only need the ones that start with a period). Should I file this as a
feature request?
Thanks
-Mike
from _winreg import HKEY_CLASSES_ROOT, OpenKey, EnumKey, QueryValueEx
i = 0
typemap = {}
try:
while 1:
subkeyname = EnumKey(HKEY_CLASSES_ROOT, i)
try:
subkey = OpenKey(HKEY_CLASSES_ROOT, subkeyname)
if subkeyname[:1] == '.':
data = QueryValueEx(subkey, 'Content Type')[0]
print subkeyname, '=', data
typemap[subkeyname] = data # data will be unicode
except EnvironmentError:  # WindowsError is a subclass, so this covers both
pass
i += 1
except WindowsError:
pass
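The enumerate-by-index dance could at least be hidden behind a generator. A sketch follows, using a stand-in enumeration function since _winreg is Windows-only (iter_subkeys and fake_enum are hypothetical names, not _winreg APIs):

```python
import itertools

def iter_subkeys(enum_key, key):
    """Yield subkey names until the enumeration function runs out."""
    for i in itertools.count():
        try:
            yield enum_key(key, i)
        except OSError:       # _winreg raises WindowsError, an OSError subclass
            return

# Stand-in for _winreg.EnumKey, which raises once the index is exhausted.
_SUBKEYS = ['.txt', '.py', 'CLSID']
def fake_enum(key, i):
    if i >= len(_SUBKEYS):
        raise OSError("no more subkeys")
    return _SUBKEYS[i]

extensions = [k for k in iter_subkeys(fake_enum, None) if k.startswith('.')]
```

This doesn't make EnumKey itself any faster, but it isolates the index bookkeeping and the sentinel exception in one place.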
Nice and short summary this time. Plan to send this off Wednesday or Thursday
so get corrections in before then.
------------------------------
=====================
Summary Announcements
=====================
You can still `register <http://www.python.org/pycon/2005/register.html>`__ for
`PyCon`_. The `schedule of talks`_ is now online. Jim Hugunin is lined up to
be the keynote speaker on the first day, with Guido giving the keynote on
Thursday. Once again PyCon looks like it is going to be great.
On a different note, as I am sure you are all aware I am still about a month
behind in summaries. School this quarter for me has just turned out hectic. I
think it is lack of motivation thanks to having finished my 14 doctoral
applications just a little over a week ago (and no, that number is not a typo).
For the first time in my life I am going to come up with a very regimented
study schedule that will hopefully allow me to fit in weekly Python time and
let me catch up on summaries.
And this summary is not short just because I wanted to finish it: 2.4 was
released just before the time this summary covers, so most discussion was
about bug fixes discovered after the release.
.. _PyCon: http://www.pycon.org/
.. _schedule of talks: http://www.python.org/pycon/2005/schedule.html
=======
Summary
=======
-------------
PEP movements
-------------
I introduced a `proto-PEP
<http://mail.python.org/pipermail/python-dev/2005-January/050753.html>`__ to
the list on how one can go about changing CPython's bytecode. It will need
rewriting once the AST branch is merged into HEAD on CVS. Plus I need to get a
PEP number assigned to me. =)
Contributing threads:
- ` proto-pep: How to change Python's bytecode <>`__
------------------------------------
Handling versioning within a package
------------------------------------
The suggestion of extending import syntax to support explicit version
importation came up. The idea was to have something along the lines of
``import foo version 2, 4`` so that one can have packages that contain
different versions of itself and to provide an easy way to specify which
version was desired.
The idea didn't fly, though. The main objection was that import-as support was
all you really needed; ``import foo_2_4 as foo``. And if you had a ton of
references to a specific package and didn't want to burden yourself with
explicit imports, one can always have a single place, before code starts
executing, that does ``import foo_2_4; sys.modules["foo"] = foo_2_4``. And that
burden can be lowered even further by creating a foo.py file that does the above for you.
You can also look at how wxPython handles it at
http://wiki.wxpython.org/index.cgi/MultiVersionInstalls .
Contributing threads:
- `Re: [Pythonmac-SIG] The versioning question... <>`__
===============
Skipped Threads
===============
- Problems compiling Python 2.3.3 on Solaris 10 with gcc 3.4.1
- 2.4 news reaches interesting places
see `last summary`_ for coverage of this thread
- RE: [Python-checkins] python/dist/src/Modules posixmodule.c, 2.300.8.10,
2.300.8.11
- mmap feature or bug?
- Re: [Python-checkins] python/dist/src/Pythonmarshal.c, 1.79, 1.80
- Latex problem when trying to build documentation
- Patches: 1 for the price of 10.
- Python for Series 60 released
- Website documentation - link to descriptor information
- Build extensions for windows python 2.4 what are the compiler rules?
- Re: [Python-checkins] python/dist/src setup.py, 1.208, 1.209
- Zipfile needs?
fake 32-bit unsigned int overflow with ``x = x & 0xFFFFFFFFL`` and signed
ints with the additional ``if x & 0x80000000L: x -= 0x100000000L`` .
- Re: [Python-checkins] python/dist/src/Mac/OSX fixapplepython23.py, 1.1, 1.2
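The 32-bit overflow emulation mentioned under the Zipfile thread can be sketched as two small helpers (hypothetical names):

```python
def to_uint32(x):
    # Fake 32-bit unsigned overflow by masking to the low 32 bits.
    return x & 0xFFFFFFFF

def to_int32(x):
    # Same mask, then reinterpret the top bit as a sign bit.
    x &= 0xFFFFFFFF
    if x & 0x80000000:
        x -= 0x100000000
    return x

assert to_uint32(2**32 + 5) == 5
assert to_int32(0xFFFFFFFF) == -1
```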