[Python-Dev] Mysterious Python pyc file corruption problems

Terry Jan Reedy tjreedy at udel.edu
Fri May 17 21:02:00 CEST 2013


On 5/17/2013 12:42 PM, Barry Warsaw wrote:
> On May 16, 2013, at 04:52 PM, Terry Jan Reedy wrote:

>> Do failures only occur during compileall process? (or whatever substitute you
>> use).
>
> No, they are all post-installation failures in unrelated packages that try to
> import pure-Python modules.

What I mean is: is the corruption (not the detection of corruption) only 
happening during mass compilation of the stdlib? When a user imports a 
single non-stdlib file he has written for the first time, does that ever 
get corrupted?

> AFAICT, the post-installation byte-compilation scripts are not erroring.

I THINK that you are answering my question by saying that corruption 
only happens during installation mass compilation.
>
> Doing a post-compilation verification step might be interesting, but I bet
> backporting atomic renames to py_compile.py will fix the problem, or at least
> band-aid over it. ;)

I intended to suggest that py_compile be changed to do that. Then Brett 
said it already had been for 3.4. I see no reason not to backport, though 
someone else may.
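
For illustration, the atomic-rename approach mentioned above can be 
sketched roughly as follows. This is a minimal sketch, not the actual 
py_compile or importlib code: the helper name write_atomic is hypothetical, 
and os.replace assumes Python 3.3 or later. The bytes go to a temporary 
file in the same directory, which is then renamed over the target, so that 
on POSIX systems a reader sees either the old file or the complete new one.

```python
import os
import tempfile


def write_atomic(path, data):
    """Write bytes to `path` via a temporary file plus a rename.

    Hypothetical helper for illustration only.
    """
    # The temporary file must live on the same filesystem as the target,
    # otherwise the rename cannot be atomic.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # Replace the target in one step; overwrites an existing file.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

A concurrent importer would then never observe a half-written file, which 
is the failure mode suspected in this thread.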

The main design use of marshal is to produce .pyc files that consist of 
a prefix and a marshalled code object. Perhaps marshal.dump(s) should be 
extended to take a third parameter, prefix='', that would be prepended to 
the result as produced today. .dump in effect does .dumps first, though 
inline. I assume that .dumps constructs a string by starting with [], 
appending pieces, and joining. At least, any composite object dump would. 
The change would amount to starting with [prefix] instead of []. Then 
py_compile would amount to
   pyc = <pyc-prefix>
   marshal.dump(codeobject, file, pyc)
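
The effect of such a parameter can be approximated today in pure Python. 
The following is only a sketch under stated assumptions: marshal has no 
prefix parameter, dump_with_prefix is a hypothetical helper, and the 
prefix b"HDR0" is a placeholder rather than a real .pyc header. The point 
is that prefix and marshalled data reach the file in a single write call.

```python
import io
import marshal


def dump_with_prefix(value, file, prefix=b""):
    """Write prefix + marshalled value to `file` in a single write call.

    Hypothetical helper; marshal.dump itself takes no prefix argument.
    """
    file.write(prefix + marshal.dumps(value))


# Compile a tiny module body to get a code object to marshal.
code = compile("x = 1 + 1", "<example>", "exec")

buf = io.BytesIO()
prefix = b"HDR0"  # placeholder, NOT a real pyc magic number or header
dump_with_prefix(code, buf, prefix)

# Reading back: skip the prefix, then unmarshal the code object.
data = buf.getvalue()
restored = marshal.loads(data[len(prefix):])
```

Combined with an atomic rename, this would leave no window in which the 
file holds a header but no code object.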

Terry




