[New-bugs-announce] [issue3014] file_dealloc() assumes errno is set when EOF is returned

johansen report at bugs.python.org
Fri May 30 23:54:11 CEST 2008

New submission from johansen <johansen at sun.com>:

We're using Python to build the new packaging system for OpenSolaris. 
Yesterday, a user reported that when they ran the pkg command, piped the
output to grep, and then typed ^C, sometimes they'd get this error:

$ pkg list | grep office
^Cclose failed: [Errno 11] Resource temporarily unavailable

We assumed that this might be a problem in the signal handling we've
employed to catch SIGPIPE; however, it turns out that the problem is in
the file_dealloc() code.

For the perversely curious, additional details may be found in the
original bug located here:


Essentially we found the following:

The error message is emitted from fileobject.c: file_dealloc()

The relevant portion of the routine looks like this:

static void
file_dealloc(PyFileObject *f)
{
        int sts = 0;
        if (f->weakreflist != NULL)
                PyObject_ClearWeakRefs((PyObject *) f);
        if (f->f_fp != NULL && f->f_close != NULL) {
                sts = (*f->f_close)(f->f_fp);
                if (sts == EOF)
                        PySys_WriteStderr("close failed: [Errno %d] %s\n",
                                          errno, strerror(errno));
        }
        /* ... */
}

In the cases we encountered, the function pointer f_close actually
points to sysmodule.c: _check_and_flush()

That routine looks like this:

static int
_check_and_flush (FILE *stream)
{
  int prev_fail = ferror (stream);
  return fflush (stream) || prev_fail ? EOF : 0;
}

_check_and_flush() calls ferror(3C) and then fflush(3C) on the FILE stream
associated with the file object.  There's just one problem here.  If it
finds an error that was previously encountered on the file stream,
there's no guarantee that errno will be valid.  Should an error be
encountered in fflush(3C), errno will get set; however, the contents of
errno are undefined should fflush() return successfully.
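
One way to keep the two cases apart is to capture errno only when
fflush() itself fails.  The following is a minimal sketch, not a proposed
patch: check_and_flush_errno() and its flush_errno out-parameter are
hypothetical names that do not exist in Python.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical variant of _check_and_flush(): same return value, but
 * *flush_errno is set to errno only when fflush() itself failed, and to 0
 * when EOF merely reflects an earlier error recorded on the stream. */
static int
check_and_flush_errno(FILE *stream, int *flush_errno)
{
        int prev_fail = ferror(stream);

        *flush_errno = 0;
        if (fflush(stream) != 0) {
                *flush_errno = errno;   /* set by fflush(), so meaningful */
                return EOF;
        }
        /* fflush() succeeded: errno says nothing about prev_fail. */
        return prev_fail ? EOF : 0;
}
```

With this shape, a caller that receives EOF together with a zero
flush_errno knows that errno must not be consulted.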

Here's what happens in the code I observed:

I set a write watchpoint on errno and observed the different times it
was accessed.  After sifting through a bunch of red herrings, I found
a call to PyThread_acquire_lock() that sets errno to 11 (EAGAIN).
This occurs when PyThread_acquire_lock() calls sem_trywait(3C) and finds
the semaphore already locked.  Errno doesn't get accessed again until a
call to libc.so.1`isseekable() that simply saves and restores the
existing errno.
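
The way a routine lock attempt leaves EAGAIN behind can be reproduced
outside the interpreter.  This is a minimal sketch, assuming POSIX unnamed
semaphores are available on the platform; errno_after_failed_trywait() is
a hypothetical helper, not something taken from Python.

```c
#include <errno.h>
#include <semaphore.h>

/* Hypothetical helper: try to take a semaphore that is already held and
 * return whatever errno is left behind.  Returns -1 if the semaphore
 * could not be set up at all, 0 if the lock was unexpectedly free. */
static int
errno_after_failed_trywait(void)
{
        sem_t sem;
        int left_behind;

        /* An initial value of 0 means the semaphore is already "locked". */
        if (sem_init(&sem, 0, 0) != 0)
                return -1;

        errno = 0;
        if (sem_trywait(&sem) == 0) {
                sem_destroy(&sem);
                return 0;
        }
        left_behind = errno;    /* POSIX specifies EAGAIN here */
        sem_destroy(&sem);
        return left_behind;
}
```

Nothing has gone wrong from the program's point of view, yet errno keeps
the EAGAIN value until some later call happens to overwrite it.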

Since we've taken a ^C (SIGINT), the interpreter begins the finalization
process and eventually calls file_dealloc().  This routine calls
_check_and_flush().  In the case that I observed, ferror(3C)
returns a non-zero value but fflush(3C) completes successfully.  This
causes the routine to return EOF to the caller.  file_dealloc() assumes
that since it received an EOF, an error just occurred and errno describes
it, so it calls strerror(errno).  However, since the EOF only reflects a
previously recorded stream error, errno holds whatever some unrelated
call last stored there.

This is what causes the spurious EAGAIN message.  Just to be sure, I
traced the return value and errno of failed syscalls that were invoked
by the interpreter.  I was unable to observe any syscalls returning
EAGAIN.  This is because (at least on OpenSolaris) sem_trywait(3C) calls
sema_trywait(3C).  sema_trywait() returns EBUSY if the semaphore is
held, and sem_trywait() converts this to EAGAIN.  Neither error is
passed out of the kernel, so no failing syscall shows up in the trace.

It's not clear to me whether _check_and_flush(), file_dealloc(), or both
need modification.  At a minimum, it's not safe for file_dealloc() to
assume that errno is set correctly if the function underneath it is
using ferror(3C) to find the presence of an error on the stream.
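
On the reporting side, the message could be built from an errno value
that was captured at the point of failure rather than read at print time.
The sketch below only illustrates that idea: format_close_failure() and
the wording of the fallback message are assumptions, not existing Python
code or output.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the "close failed" message from an errno
 * value the caller captured when the failure happened.  A value of 0
 * means no trustworthy errno is available. */
static int
format_close_failure(char *buf, size_t len, int saved_errno)
{
        if (saved_errno != 0)
                return snprintf(buf, len, "close failed: [Errno %d] %s\n",
                                saved_errno, strerror(saved_errno));
        /* Only the stream's error indicator was set; do not guess. */
        return snprintf(buf, len,
                        "close failed: earlier error on stream\n");
}
```

The point of the design is that the errno-based branch is taken only when
the caller can vouch for the value, so a stale EAGAIN is never printed.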

components: Interpreter Core
messages: 67560
nosy: johansen
severity: normal
status: open
title: file_dealloc() assumes errno is set when EOF is returned
type: behavior
versions: Python 2.4

Python tracker <report at bugs.python.org>
