[Python-Dev] Mysterious Python pyc file corruption problems

Guido van Rossum gvanrossum at gmail.com
Thu May 16 23:40:07 CEST 2013

I still suspect this might explain most of what Barry saw, if not all. 

On Thu, May 16, 2013 at 2:36 PM, Brett Cannon <brett at python.org> wrote:

> On Thu, May 16, 2013 at 5:19 PM, Guido van Rossum <guido at python.org> wrote:
>> This reminds me of the following bug, which can happen when two
>> processes are both writing the .pyc file and a third is reading it.
>> First some background.
>> When writing a .pyc file, we use the following strategy:
>> - open the file for writing
>> - write a dummy header (four null bytes)
>> - write the .py file's mtime
>> - write the marshalled code object
>> - replace the dummy header with the correct magic word
> Just so people know, this is how we used to do it. In importlib we
> write the entire file to a temp file and then do an atomic rename.
>> Even py_compile.py (used by compileall.py) uses this strategy.
> py_compile as of Python 3.4 now just uses importlib directly, so it
> matches its semantics.
> -Brett
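A minimal sketch of the temp-file-plus-atomic-rename approach Brett mentions, not importlib's actual code. The helper name `write_atomic` is hypothetical. It assumes `os.replace` (Python 3.3+) and puts the temp file in the target's directory so the rename stays on one filesystem.

```python
import os
import tempfile


def write_atomic(path, data):
    """Write data so readers see the old file or the new one, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename does not cross filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory,
                                    prefix=os.path.basename(path) + ".")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # Swap the finished file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if writing or renaming failed.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

A reader that already has the old file open keeps reading the old contents, since the rename swaps the directory entry instead of truncating the file in place.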
>> When reading a .pyc file, we ignore it when the magic word isn't there
>> (or when the mtime doesn't match that of the .py file exactly), and
>> then we will write it back as described above.
>> Now consider the following scenario. It involves *three* processes.
>> - Two unrelated processes both start and want to import the same module.
>> - They both see the .pyc file is missing/corrupt and decide to write it.
>> - The first process finishes writing the file, writing the correct header.
>> - Now a third process wants to import the module, sees the valid
>> header, and starts reading the file.
>> - However, while this is going on, the second process gets ready to
>> write the file.
>> - The second process truncates the file, writes the dummy header, and
>> then stalls.
>> - At this point the third process (which thought it was reading a
>> valid file) sees an unexpected EOF because the file has been
>> truncated.
>> Now, this would explain the EOFError, but not necessarily the
>> ValueError with "unknown type code". However, it looks like marshal
>> doesn't always check for EOF immediately (sometimes it calls getc()
>> without checking the result, and sometimes it doesn't check the error
>> state after calling r_string()), so I think all the errors are
>> actually explainable from this scenario.
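The scenario can be replayed deterministically in a single process. This sketch assumes the legacy layout described above (4-byte magic, 4-byte mtime, marshalled code object); `legacy_write` and `simulate_race` are hypothetical names, and the stall is simulated by returning right after the dummy header is written.

```python
import importlib.util
import marshal
import os
import struct
import tempfile

MAGIC = importlib.util.MAGIC_NUMBER  # 4-byte magic word of the running interpreter
DUMMY = b"\0\0\0\0"                  # placeholder header written first


def legacy_write(path, code, mtime, stall_after_dummy=False):
    """Old strategy from the thread: dummy header first, real magic word last."""
    with open(path, "wb") as f:      # opening with "wb" truncates an existing file
        f.write(DUMMY)
        if stall_after_dummy:
            return                   # stand-in for a writer that stalls at this point
        f.write(struct.pack("<I", mtime & 0xFFFFFFFF))
        marshal.dump(code, f)
        f.seek(0)
        f.write(MAGIC)               # only now does the file look valid


def simulate_race(path):
    """Return the name of the exception the reader hits, or 'loaded'."""
    code = compile("x = 1", "<demo>", "exec")
    legacy_write(path, code, mtime=0)        # first writer finishes
    # Unbuffered, so the reader has not already slurped the whole file.
    reader = open(path, "rb", buffering=0)   # third process opens the file
    try:
        header = reader.read(8)              # valid magic word + mtime
        assert header[:4] == MAGIC
        # Second writer truncates the file and stalls after the dummy header.
        legacy_write(path, code, mtime=0, stall_after_dummy=True)
        payload = reader.read()              # offset 8 is now past the end of the file
        try:
            marshal.loads(payload)
        except EOFError as exc:
            return type(exc).__name__
        return "loaded"
    finally:
        reader.close()
```

Here the reader has already accepted the header when the truncation happens, so it is left with an empty payload and `marshal.loads` raises EOFError. A truncation that lands partway through the marshalled data is the case where the other errors could surface.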
>> --
>> --Guido van Rossum (python.org/~guido)
