[Python-Dev] Draft PEP to make file objects support non-blocking mode.

Fri Mar 18 04:30:27 CET 2005

G'day,

the recent thread about thread semantics for file objects reminded me I
had a draft pep for extending file objects to support non-blocking
mode. 

This is handy for handling files in async applications (the non-threaded
way of doing things concurrently).

It's pretty rough, but if I fuss over it any more I'll never get it
out...

-- 
Donovan Baarda <abo at minkirri.apana.org.au>
http://minkirri.apana.org.au/~abo/
-------------- next part --------------
PEP: XXX
Title: Make builtin file objects support non-blocking mode
Version: $Revision: 1.0 $
Last-Modified: $Date: 2005/03/18 11:34:00 $
Author: Donovan Baarda <abo at minkirri.apana.org.au>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 06-Jan-2005
Python-Version: 2.5
Post-History: 06-Jan-2005

Abstract
========

This PEP suggests a way that the existing builtin file type could be 
extended to better support non-blocking read and write modes required for 
asynchronous applications using things like select and popen2.

Rationale
=========

Many Python library methods and classes like select.select(), os.popen2(),
and subprocess.Popen() return and/or operate on builtin file objects.
However, even simple applications of these methods and classes require the
files to be in non-blocking mode.

Currently the builtin file type does not support non-blocking mode very
well.  Setting a file into non-blocking mode and reading or writing to it
can only be done reliably by operating on the file.fileno() file descriptor.
This requires using the fcntl and os module file descriptor manipulation
functions.

Details
=======

The documentation of file.read() warns: "Also note that when in non-blocking
mode, less data than what was requested may be returned, even if no size
parameter was given".  An empty string is returned to indicate an EOF
condition.  It is possible that file.read() in non-blocking mode will not
produce any data before EOF is reached.  Currently there is no documented
way to identify the difference between reaching EOF and an empty
non-blocking read.
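
The distinction can be illustrated at the file descriptor level.  The
following is a minimal sketch, not part of the proposal itself: it assumes a
POSIX system and a current Python interpreter in which os.read() returns
bytes, and the helper name read_status is hypothetical.  It calls os.read()
directly on a non-blocking pipe, where an empty read surfaces as an EAGAIN
error and EOF as an empty result.

```python
import errno
import fcntl
import os


def read_status(fd, size=4096):
    """Classify one non-blocking read as 'data', 'empty', or 'eof'.

    Hypothetical helper for illustration only.
    """
    try:
        chunk = os.read(fd, size)
    except OSError as exc:
        # EAGAIN/EWOULDBLOCK: nothing to read right now, but not EOF.
        if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return "empty", b""
        raise
    if not chunk:
        # A successful read of zero bytes means every writer has closed.
        return "eof", b""
    return "data", chunk


r, w = os.pipe()
flags = fcntl.fcntl(r, fcntl.F_GETFL)
fcntl.fcntl(r, fcntl.F_SETFL, flags | os.O_NONBLOCK)

first = read_status(r)    # writer still open, nothing written yet
os.write(w, b"hello")
second = read_status(r)   # data is available
os.close(w)
third = read_status(r)    # writer closed and pipe drained
os.close(r)
```

Here first is ("empty", b""), second is ("data", b"hello"), and third is
("eof", b"").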

The documented behaviour of file.write() in non-blocking mode is undefined.
When writing to a file in non-blocking mode, it is possible that not all of
the data gets written.  Currently there is no documented way of handling or
indicating a partial write.

The file.read() and file.write() methods are implemented using the
underlying C read() and write() functions.  As a side effect of this, they
have the following undocumented behaviour when operating on non-blocking
files:

A file.write() that fails to write all the provided data immediately will
write part of the data, then raise IOError with an errno of EAGAIN.  There
is no indication how much of the data was successfully written.

A file.read() that fails to read all the requested data immediately will
return the partial data that was read.  A file.read() that fails to read any
data immediately will raise IOError with an errno of EAGAIN.
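
The partial-write case can be reproduced with os.write(), which does report
how much was written.  This is a sketch under stated assumptions: a POSIX
system, a current Python interpreter, and a 4 MB payload that is assumed to
be larger than the pipe buffer.  It shows the file descriptor level
behaviour only, not the behaviour of file.write().

```python
import errno
import fcntl
import os

r, w = os.pipe()
flags = fcntl.fcntl(w, fcntl.F_GETFL)
fcntl.fcntl(w, fcntl.F_SETFL, flags | os.O_NONBLOCK)

# Assumption: 4 MB is more than the pipe buffer can hold.
data = b"x" * (4 * 1024 * 1024)

# The first write accepts only as much as the pipe buffer has room for.
written = os.write(w, data)

# With the buffer full and nobody reading, the next write accepts nothing
# and fails with EAGAIN instead of blocking.
try:
    os.write(w, data[written:])
    second_errno = None
except OSError as exc:
    second_errno = exc.errno

os.close(r)
os.close(w)
```

The value of written is greater than zero but less than len(data), so the
caller knows exactly which part of the data still has to be sent.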

Proposed Changes
================

What is required is to add a setblocking() method that simplifies setting
non-blocking mode, and extending/documenting read() and write() so they can
be reliably used in non-blocking mode.

file.setblocking(flag) Extension
--------------------------------

This method mirrors the socket.setblocking() method for file objects.  If
flag is 0, the file is set to non-blocking mode, otherwise to blocking mode.
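
For comparison, the socket behaviour that this method is modelled on looks
roughly like the sketch below.  It assumes a POSIX system with
socket.socketpair() available and a current Python interpreter; the variable
names are illustrative only.

```python
import errno
import socket

a, b = socket.socketpair()

a.setblocking(0)          # 0 selects non-blocking mode
try:
    a.recv(1024)          # nothing has been sent yet
    recv_errno = None
except socket.error as exc:
    # An empty non-blocking receive raises instead of waiting.
    recv_errno = exc.errno

b.sendall(b"ping")
a.setblocking(1)          # any non-zero flag restores blocking mode
reply = a.recv(1024)      # blocks until data is available

a.close()
b.close()
```

The proposed file.setblocking() would give file objects the same
one-call switch between the two modes.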

file.read([size]) Changes
--------------------------

The read method's current behaviour needs to be documented, so that it can
be relied on to differentiate between an empty non-blocking read and EOF.
This means documenting that IOError(EAGAIN) is raised for an empty
non-blocking read.

file.write(str) Changes
-----------------------

The write method needs to have a useful behaviour for partial non-blocking
writes defined, implemented, and documented.  This includes returning how
many bytes of "str" are successfully written, and raising IOError(EAGAIN)
for an unsuccessful write (one that failed to write anything).

Impact of Changes
=================

As these changes are primarily extensions, they should not have much impact
on any existing code.

The file.read() changes only document current behaviour. This should
have no impact on any existing code.

The file.write() change makes this method return an int instead of returning
nothing (None). The only code this could affect would be something relying
on file.write() returning None. I suspect there is no code that would do
this.

The file.setblocking() change adds a new method. The only existing code this
could affect is code that checks for the presence/absence of a setblocking
method on a file. There may be code out there that does this to
differentiate between a file and a socket. As there are much better ways to
do this, I suspect that there would be no code that does this.

Examples
========

For example, the following simple code using popen2 will "hang" if the
huge_in string is larger than the OS buffering can read/write in one hit:

  import os

  child_in, child_out = os.popen2("/usr/bin/cat")
  child_in.write(huge_in)
  huge_out = child_out.read()

The only safe way to read and write to the popen2 files and avoid blocking,
without special knowledge of the I/O behaviour of the executed command, is to
use non-blocking mode. Setting a file object "f" into non-blocking mode
requires manipulating the file's file descriptor using the Python library
fcntl module as follows:

  import os, fcntl

  # get the file descriptor
  fd = f.fileno()

  # get the file's current flag settings
  fl = fcntl.fcntl(fd, fcntl.F_GETFL)

  # update the file's flags to put the file into non-blocking mode.
  fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)

Once a file is in non-blocking mode, the file object's read() and write()
methods cannot reliably be used. Instead you must use the os.read() and
os.write() functions on the fileno() of the file:

  import os

  str = os.read(f.fileno(), count)

  count = os.write(f.fileno(), str)
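
Putting these pieces together, the hanging example can be rewritten so that
reads and writes are interleaved.  The following is only a sketch under
stated assumptions: a POSIX system, a current Python interpreter, and
subprocess.Popen() in place of os.popen2().  The helpers set_nonblocking and
pump are hypothetical names, and a small Python child that copies stdin to
stdout stands in for the cat command.

```python
import errno
import fcntl
import os
import select
import subprocess
import sys

WOULD_BLOCK = (errno.EAGAIN, errno.EWOULDBLOCK)


def set_nonblocking(fd):
    """Put a file descriptor into non-blocking mode (hypothetical helper)."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


def pump(child, data):
    """Feed data to the child's stdin while collecting its stdout."""
    wfd = child.stdin.fileno()
    rfd = child.stdout.fileno()
    set_nonblocking(wfd)
    set_nonblocking(rfd)
    view = memoryview(data)
    sent = 0
    chunks = []
    writing = True
    while True:
        # Wait until the child can take more input or has produced output.
        wlist = [wfd] if writing else []
        readable, writable, _ = select.select([rfd], wlist, [])
        if writable:
            try:
                # os.write() reports how much of the slice was accepted.
                sent += os.write(wfd, view[sent:sent + 65536])
            except OSError as exc:
                if exc.errno not in WOULD_BLOCK:
                    raise
            if sent == len(data):
                child.stdin.close()   # signal EOF to the child
                writing = False
        if readable:
            try:
                chunk = os.read(rfd, 65536)
            except OSError as exc:
                if exc.errno not in WOULD_BLOCK:
                    raise
                continue
            if not chunk:             # empty result: the child closed stdout
                break
            chunks.append(chunk)
    return b"".join(chunks)


# Stand-in for "cat": a child that copies stdin to stdout.
copy_cmd = [sys.executable, "-c",
            "import shutil, sys; "
            "shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)"]
child = subprocess.Popen(copy_cmd, stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, bufsize=0)
huge_in = b"spam and eggs\n" * 200000
huge_out = pump(child, huge_in)
child.stdout.close()
child.wait()
```

Because select() only reports a descriptor when it is ready, neither side
can stall the other, whatever the size of huge_in.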

Implementation
===============

Right now, this functionality can be implemented using an extended file
class:

  import os, fcntl

  class File(file):

    def setblocking(self, flag):
      "Set blocking mode if flag is true, else non-blocking mode."
      # get the file descriptor
      fd = self.fileno()
      # get the file's current flag settings
      fl = fcntl.fcntl(fd, fcntl.F_GETFL)
      if flag:
        # clear non-blocking mode from flags
        fl = fl & ~os.O_NONBLOCK
      else:
        # set non-blocking mode in flags
        fl = fl | os.O_NONBLOCK
      # update the file's flags
      fcntl.fcntl(fd, fcntl.F_SETFL, fl)

    def write(self, str):
      "Write str unbuffered; return the number of bytes written."
      try:
        return os.write(self.fileno(), str)
      except OSError, inst:
        # re-raise as IOError, the exception type file methods use
        raise IOError(inst.errno, inst.strerror, inst.filename)

A real implementation should be done by modifying the C implementations of
the built-in file type.

Resources
=========

.. [1] POSIX write() manual page.
   (man 3 write)

.. [2] POSIX read() manual page.
   (man 3 read)

References
==========

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End: