[Python-Dev] Thread-safe file objects, the return

Guido van Rossum guido at python.org
Wed Apr 2 02:09:24 CEST 2008


This is not something that keeps me awake at night, but I am aware of
it. Your solution (a counter) seems fine except I think perhaps the
close() call should not raise IOError -- instead, it should set a flag
so that the thread that makes the counter go to zero can close the
file (after all the file got closed while it was being used).
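
To make the flag idea concrete, here is a rough pure-Python model of it.
This is only a sketch under assumptions, not the patch itself: the real
change would live in the C file object, and the names CountedFile,
_enter, _leave and _guard are made up for illustration. An ordinary lock
stands in for the GIL, and a file-like object stands in for the FILE
pointer.

```python
import io
import threading


class CountedFile:
    """Toy model of the counter-plus-flag idea (hypothetical names)."""

    def __init__(self, raw):
        self._raw = raw                  # stands in for the FILE pointer
        self._guard = threading.Lock()   # stands in for the GIL
        self._in_use = 0                 # "unlocked" sections in flight
        self._close_requested = False    # set by close() while in use
        self.closed = False

    def _enter(self):
        # Called just before the "GIL" is released for an I/O call.
        with self._guard:
            if self.closed or self._close_requested:
                raise ValueError("I/O operation on closed file")
            self._in_use += 1

    def _leave(self):
        # Called just after the "GIL" is taken again.
        with self._guard:
            self._in_use -= 1
            do_close = (self._close_requested
                        and self._in_use == 0
                        and not self.closed)
            if do_close:
                self.closed = True
        if do_close:
            # This thread made the counter reach zero, so it closes.
            self._raw.close()

    def read(self, n=-1):
        self._enter()
        try:
            return self._raw.read(n)     # runs outside the guard
        finally:
            self._leave()

    def close(self):
        with self._guard:
            if self.closed:
                return
            self._close_requested = True
            if self._in_use > 0:
                return                   # deferred to the last user
            self.closed = True
        self._raw.close()


# Single-threaded walk-through of the deferred close.
raw = io.BytesIO(b"spam")
f = CountedFile(raw)
f._enter()                  # pretend another thread is mid-read
f.close()                   # close arrives while the file is in use
still_open = not raw.closed
f._leave()                  # the pretend reader finishes
```

In this model close() never raises because of concurrent use; the
variant from the original proposal would raise IOError at the point
where this sketch returns early.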

There are of course other concurrency issues besides close -- what if
two threads both try to do I/O on the file? What will the C stdio
library do in that case? Are stdio files thread-safe at the C level?
So (classically contradicting myself while I think the problem over
more) perhaps any I/O operation should be disallowed while the file is
in use by another thread?
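
A minimal sketch of that stricter rule, again as a pure-Python model
with made-up names (ExclusiveFile, _busy) rather than anything in the
actual file object: a non-blocking lock acquire is used so that a second
thread gets an error instead of waiting.

```python
import io
import threading


class ExclusiveFile:
    """Toy model: refuse I/O while another call is in progress."""

    def __init__(self, raw):
        self._raw = raw                  # underlying file-like object
        self._busy = threading.Lock()    # held for the duration of an I/O call

    def read(self, n=-1):
        # Do not wait: if someone else is using the file, fail at once.
        if not self._busy.acquire(blocking=False):
            raise IOError("file is in use by another thread")
        try:
            return self._raw.read(n)
        finally:
            self._busy.release()


# Simulate "another thread is using the file" by holding the lock.
ef = ExclusiveFile(io.BytesIO(b"spam"))
ef._busy.acquire()
try:
    ef.read()
    refused = False
except IOError:
    refused = True
ef._busy.release()
data = ef.read()            # succeeds once the file is no longer busy
```

Whether failing or blocking is the better behaviour here is a separate
design question; the sketch only shows the "disallow" option.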

--Guido

On Mon, Mar 31, 2008 at 1:09 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>
>  Hello,
>
>  It seems this subject has had quite a bit of history. Tim Peters demonstrated
>  the problem in 2003 in this message:
>  http://mail.python.org/pipermail/python-dev/2003-June/036537.html
>
>  In short, Python file objects release the GIL before calling any C stdlib
>  function on their embedded FILE pointer. Unfortunately, if another thread
>  calls fclose on the FILE pointer concurrently, the contents pointed to can
>  become garbage and the interpreter process crashes. Just by using the same
>  file object in two threads running pure Python code, you can crash the
>  interpreter.
>
>  (another, easier-to-solve problem is that the FILE pointer stored in the
>  file object could become NULL at the point it is used by another thread.
>  If that was the only problem you could just store the FILE pointer in a
>  local variable before releasing the GIL et voilà)
>
>  There was some discussion at the time about the possible resolution. I've
>  tried to fix the problem, and I've come to what I think is a satisfying
>  solution, which I can sum up as the following bullet points:
>   * Each file object gets a dedicated counter, which is incremented before
>  the object releases the GIL and decremented after the GIL is taken again; thus
>  this counter keeps track of how many running "unlocked" sections of code are
>  using that particular file object. (please note the counter doesn't need its
>  own lock, since it is only modified in GIL-protected sections)
>   * In the close() method, if the aforementioned counter is greater than 0,
>  we refuse to call fclose and instead raise an IOError.
>
>  This may seem like a worrying semantic change, but I don't think it is, for the
>  following reasons:
>   1) if we closed the FILE pointer anyway, the interpreter would likely crash
>  because another thread would be using garbage data (that's what we are trying
>  to fix after all!)
>   2) if close() raises an IOError, it can be called again later, or at worst
>  fclose will be called when the file object is garbage collected
>   3) close() can already raise an IOError if fclose fails for whatever reason
>  (although admittedly that is probably very rare)
>   4) it doesn't seem wrong to notify the programmer that his code is very
>  unsafe
>
>  The patch is attached at http://bugs.python.org/issue815646 . It addresses
>  (or at least I hope it does) all potential problems with pure Python code,
>  threads, and the file object. It doesn't try to fix C extensions using the
>  PyFile_AsFile API and doing their own dirty things with the FILE pointer. It
>  could be a second step if the approach is accepted, but as noted in the 2003
>  discussions it would probably involve a new API. Whether we want to introduce
>  such an API in Python 2.x while Python 3.0 has a different IO model anyway
>  is left open to discussion :)
>
>  Regards
>
>  Antoine.



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

