[Python-Dev] thread semantics for file objects

Tim Peters tim.peters at gmail.com
Thu Mar 17 23:13:05 CET 2005


[Jeremy Hylton]
> Are the thread semantics for file objects documented anywhere?

No.  At base level, they're inherited from the C stdio implementation.
Since the C standard doesn't even mention threads, that's all
platform-dependent.  POSIX defines thread semantics for file I/O, but
fat lot of good that does you on Windows, etc.

> I don't see anything in the library manual, which is where I expected to
> find it.  It looks like read and write are atomic by virtue of fread
> and fwrite being atomic.

I wouldn't consider this more than a CPython implementation accident
in the cases where it appears to apply.  For example, in universal-newlines
mode, are you sure f.read(n) always maps to exactly one fread() call?

> I'm less sure what guarantees, if any, the other methods attempt to
> provide.

I don't believe they're _trying_ to provide anything specific.

> For example, it looks like concurrent calls to writelines() will interleave entire
> lines, but not parts of lines.  Concurrent calls to readlines() provide insane
> results, but I don't know if that's a bug or a feature.  Specifically, if your file has a
> line that is longer than the internal buffer size SMALLCHUNK you're likely to
> get parts of that line chopped up into different lines in the resulting return values.

And you're _still_ not thinking "implementation accidents" <wink>?

> If we can come up with intended semantics, I'd be willing to prepare a
> patch for the documentation.

I think Aahz was on target here:

    NEVER, NEVER access the same file object from multiple threads, unless
    you're using a lock.

And here he went overboard:

    And even using a lock is stupid.

ZODB's FileStorage is bristling with locks protecting multi-threaded
access to file objects, therefore that can't be stupid.  QED
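
For illustration only, here is a minimal sketch of the "use a lock" advice. It is not taken from FileStorage; the names LockedWriter, write_record and write_concurrently are hypothetical. The point it tries to show is that the lock should cover the whole logical operation (here, a record plus its newline), rather than relying on any single write() call happening to be atomic.

```python
import threading


class LockedWriter:
    """Hypothetical helper: serializes writes to one shared file object."""

    def __init__(self, fileobj):
        self._file = fileobj
        self._lock = threading.Lock()

    def write_record(self, text):
        # Hold the lock across both write() calls so that one thread's
        # record can never be interleaved with another thread's record.
        with self._lock:
            self._file.write(text)
            self._file.write("\n")


def write_concurrently(fileobj, n_threads=4, n_records=100):
    """Have several threads write tagged records through one LockedWriter."""
    writer = LockedWriter(fileobj)

    def worker(tag):
        for i in range(n_records):
            writer.write_record("%s:%d" % (tag, i))

    threads = [threading.Thread(target=worker, args=("t%d" % k,))
               for k in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

The order in which records from different threads land in the file is still unspecified; the lock only guarantees that each record arrives whole.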

