[ python-Bugs-1488934 ] file.write + closed pipe = no error

SourceForge.net noreply at sourceforge.net
Wed Aug 9 18:13:42 CEST 2006


Bugs item #1488934, was opened at 2006-05-15 12:10
Message generated for change (Comment added) made by edemaine
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1488934&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Interpreter Core
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Submitted By: Erik Demaine (edemaine)
Assigned to: A.M. Kuchling (akuchling)
Summary: file.write + closed pipe = no error

Initial Comment:
I am writing a Python script on Linux that gets called
via ssh (ssh hostname script.py) and I would like it to
know when its stdout gets closed because the ssh
connection gets killed.  I assumed that it would
suffice to write to stdout, and that I would get an
error if stdout was no longer connected to anything. 
This is not the case, however.  I believe it is because
of incorrect error checking in Objects/fileobject.c's
file_write.

Consider this example:

import time

while True:
    print 'Hello'
    time.sleep(1)

If this program is run via ssh and then the ssh
connection dies, the program continues running forever
(or at least, over 10 hours).  No exceptions are thrown.

In contrast, this example does die as soon as the ssh
connection dies (within one second):

import os, time

while True:
    os.write(1, 'Hello')
    time.sleep(1)
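
As a rough illustration of the os.write() behaviour, here is a minimal
sketch in Python 3 syntax (the report itself targets Python 2.5).  It
assumes a POSIX system and uses a local pipe whose read end has been
closed as a stand-in for the dead ssh connection, so the error it shows
is EPIPE, not the EIO seen with the C test program.  The helper name is
made up for this example.

```python
import errno
import os


def write_to_closed_pipe():
    """Return the errno raised by os.write() on a pipe with no reader."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # nobody will ever read from the pipe now
    try:
        os.write(write_fd, b"Hello")
    except OSError as exc:
        # CPython ignores SIGPIPE by default, so the failure surfaces as
        # an exception instead of killing the process.
        return exc.errno
    finally:
        os.close(write_fd)
    return None  # the write unexpectedly succeeded


if __name__ == "__main__":
    print(errno.errorcode.get(write_to_closed_pipe()))
```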

I claim that this is because os.write does proper error
checking, but file.write seems not to.  I was surprised
to find this intricacy in fwrite().  Consider the
attached C program, test.c.  (Warning: If you run it,
it will create a file /tmp/hello, and it will keep
running until you kill it.)  While the ssh connection
remains open, fwrite() reports a length of 6 bytes
written, ferror() reports no error, and errno remains
0.  Once the ssh connection dies, fwrite() still
reports a length of 6 bytes written (surprise!), but
ferror(stdout) reports an error, and errno changes to 5
(EIO).  So apparently one cannot tell from the return
value of fwrite() alone whether the write actually
succeeded; it seems necessary to call ferror() to
determine whether the write caused an error.
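
A loosely analogous effect can be seen with buffered writes in general:
the write call only copies data into a buffer and reports success, and
the failure shows up later, at flush time.  The sketch below uses Python
3's io layer and a local pipe (both are assumptions; this is neither the
2.5 file object nor the ssh setup), so it shows EPIPE rather than EIO and
does not reproduce the exact fwrite()/ferror() behaviour of test.c.  The
function name is hypothetical.

```python
import errno
import os


def buffered_write_to_closed_pipe():
    """Return (bytes accepted by write(), errno raised by flush())."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)               # the pipe now has no reader
    f = os.fdopen(write_fd, "wb")   # buffered binary writer
    accepted = f.write(b"Hello\n")  # only fills the in-memory buffer
    try:
        f.flush()                   # the real write(2) happens here
        flush_errno = None
    except OSError as exc:
        flush_errno = exc.errno
    try:
        f.close()                   # close() retries the flush and may raise
    except OSError:
        pass
    return accepted, flush_errno


if __name__ == "__main__":
    accepted, flush_errno = buffered_write_to_closed_pipe()
    print(accepted, errno.errorcode.get(flush_errno))
```

The point of the sketch is only that the return value of the write call
says nothing about whether the data reached the other end.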

I think the only change necessary is on line 2443 of
file_write() in Objects/fileobject.c (in svn version
46003):

2441        n2 = fwrite(s, 1, n, f->f_fp);
2442        Py_END_ALLOW_THREADS
2443        if (n2 != n) {
2444                PyErr_SetFromErrno(PyExc_IOError);
2445                clearerr(f->f_fp);

I am not totally sure whether the "n2 != n" condition
should be changed to "n2 != n || ferror (f->f_fp)" or
simply "ferror (f->f_fp)", but I believe that the
condition should be changed to one of these
possibilities.  The current behavior is wrong.

Incidentally, you'll notice that the C code has to turn
off signal SIGPIPE (like Python does) in order to not
die right away.  However, I could not get Python to die
by re-enabling SIGPIPE.  I tried "signal.signal
(signal.SIGPIPE, signal.SIG_DFL)" and "signal.signal
(signal.SIGPIPE, lambda x, y: sys.exit ())" and neither
one caused death of the script when the ssh connection
died.  Perhaps I'm not using the signal module correctly?
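
For comparison, here is a minimal Python 3 sketch (POSIX only; the names
CHILD and run_child are made up here) in which restoring the default
SIGPIPE disposition does kill a process.  It writes to a local pipe with
no reader, where the kernel reports EPIPE and raises SIGPIPE.  The
failure observed with the C test program was EIO, which is not
accompanied by SIGPIPE, so this may be (an assumption, not verified
against the ssh setup) why re-enabling the signal had no visible effect.

```python
import signal
import subprocess
import sys

# Code run in a fresh interpreter so that the parent is never at risk.
CHILD = """
import os, signal
signal.signal(signal.SIGPIPE, signal.SIG_DFL)  # undo CPython's SIG_IGN
read_fd, write_fd = os.pipe()
os.close(read_fd)
os.write(write_fd, b"Hello")  # EPIPE -> SIGPIPE -> process is killed
"""


def run_child():
    """Run CHILD and return its exit status (negative means a signal)."""
    return subprocess.run([sys.executable, "-c", CHILD]).returncode


if __name__ == "__main__":
    print(run_child() == -signal.SIGPIPE)
```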

I am on Linux 2.6.11 on a two-CPU Intel Pentium 4, and
I am running the latest Subversion version of Python,
but my guess is that this error transcends most if not
all versions of Python.

----------------------------------------------------------------------

>Comment By: Erik Demaine (edemaine)
Date: 2006-08-09 12:13

Message:
Logged In: YES 
user_id=265183

Just to clarify (as I reread your question): I'm killing the
ssh via UNIX (or Cygwin) 'kill' command, not via CTRL-C.  I
didn't try, but it may be that CTRL-C works fine.

----------------------------------------------------------------------

Comment By: Erik Demaine (edemaine)
Date: 2006-07-02 08:35

Message:
Logged In: YES 
user_id=265183

A simple test case is this Python script (fleshed out from
previous example), also attached:

import sys, time
while True:
    print 'Hello'
    sys.stdout.flush()
    time.sleep(1)

Save as blah.py on machine foo, run 'ssh foo python blah.py'
on machine bar--you will see 'Hello' every second--then, in
another shell on bar, kill the ssh process on bar.  blah.py
should still be running on foo.  ('foo' and 'bar' can
actually be the same machine.)

The example from the original bug report that uses
os.write() instead of print was an example that *does* work.
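
In the spirit of a small test case that does not need ssh, the following
Python 3 sketch runs a variant of the script in a child process, closes
the reading end of its stdout, and reports whether the child exits.  It
is a stand-in under stated assumptions: a local pipe instead of ssh,
Python 3's io layer instead of the 2.5 file object, and made-up helper
names.  It does not reproduce the EIO case; it only shows the shape such
a test could take.

```python
import subprocess
import sys

# Python 3 variant of blah.py, with a shorter sleep to keep the test fast.
CHILD = """
import sys, time
while True:
    print("Hello")
    sys.stdout.flush()
    time.sleep(0.1)
"""


def child_exit_status_after_reader_closes(timeout=10):
    """Return the child's exit status, or None if it kept running."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # hide the child's traceback
    )
    proc.stdout.readline()  # wait until the child has written once
    proc.stdout.close()     # simulate the reader going away
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The child never noticed: the behaviour this report describes.
        proc.kill()
        proc.wait()
        return None


if __name__ == "__main__":
    print(child_exit_status_after_reader_closes())
```

A non-zero status means the child noticed the closed pipe and stopped;
None means it had to be killed after the timeout.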


----------------------------------------------------------------------

Comment By: A.M. Kuchling (akuchling)
Date: 2006-06-03 16:16

Message:
Logged In: YES 
user_id=11375

I agree with your analysis, and think your suggested fixes are correct.

However, I'm unable to construct a small test case that exercises this bug.  I 
can't even replicate the problem with SSH; when I run a remote script with 
SSH and then kill SSH with Ctrl-C, the write() gets a -1.  Are you terminating 
SSH in some other way?  (I'd really like to have a test case for this problem 
before committing the fix.)


----------------------------------------------------------------------

Comment By: Erik Demaine (edemaine)
Date: 2006-05-15 12:26

Message:
Logged In: YES 
user_id=265183

One more thing: fwrite() is used in a couple of other
places, and I think the same comment applies to them.  They are:

- file_writelines() in Objects/fileobject.c
- w_string() in Python/marshal.c doesn't seem to have any
error checking?  (At least no ferror() call in marshal.c...)
- string_print() in Objects/stringobject.c doesn't seem to
have any error checking (but I'm not quite sure what this
means in Python land).
- flush_data() in Modules/_hotshot.c
- array_tofile() in Modules/arraymodule.c
- write_file() in Modules/cPickle.c
- putshort(), putlong(), writeheader(), writetab() [and the
functions that call them] in Modules/rgbimgmodule.c
- svc_writefile() in Modules/svmodule.c

----------------------------------------------------------------------
