[Python-ideas] Add more information in the header of pyc files

Nick Coghlan ncoghlan at gmail.com
Wed Apr 11 10:23:48 EDT 2018


On 11 April 2018 at 02:54, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Tue, 10 Apr 2018 19:29:18 +0300
> Serhiy Storchaka <storchaka at gmail.com>
> wrote:
>>
>> A bugfix release can fix bugs in bytecode generation. See for example
>> issue27286. [1]  The part of issue33041 backported to 3.7 and 3.6 is
>> another example. [2]  There were other examples of compatible changes to
>> the bytecode. Without bumping the magic number these fixes can simply have
>> no effect if existing pyc files were generated by older compilers. But
>> bumping the magic number in a bugfix release can lead to rebuilding
>> every pyc file (even those unaffected by the fix) in distributions.
>
> Sure, but I don't think rebuilding every pyc file is a significant
> problem.  It's certainly less error-prone than cherry-picking which
> files need rebuilding.

And we need to handle the old bytecode format in the eval loop anyway,
or else we'd be breaking compatibility with bytecode-only files, as
well as introducing a significant performance regression for
non-writable bytecode caches (if we were to ignore them).

It's a subtle enough problem that I think the `compileall --force`
option is a safer way of handling it, even if it regenerates some pyc
files that could have been kept.
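
For reference, the same forced rebuild can be driven from Python rather
than the command line. This is only a minimal sketch: `rebuild_all_pycs`
is a hypothetical helper name, and it assumes the caller has write
access to the `__pycache__` directories under the given root.

```python
import compileall
import pathlib
import tempfile


def rebuild_all_pycs(root):
    """Recompile every .py file under root, even if its pyc looks current."""
    # force=True mirrors `python -m compileall --force`: pyc files are
    # rewritten even when the recorded source mtime still matches.
    # quiet=1 suppresses the per-file listing but still reports errors.
    return bool(compileall.compile_dir(str(root), force=True, quiet=1))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (pathlib.Path(tmp) / "example.py").write_text("x = 1\n")
        print(rebuild_all_pycs(tmp))
```

The trade-off is the one described above: every file is recompiled,
including those whose bytecode would have been identical.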

For the "stable file signature" aspect, does that need to be
specifically the first *four* bytes? One of the benefits of PEP 552
leaving those four bytes alone is that it meant that a lot of magic
number checking code didn't need to change. If the stable marker could
be placed later (e.g. after the PEP 552 header), then we'd similarly
have the benefit that code checking the PEP 552 headers wouldn't need
to change, at the expense of folks having to read 20 bytes to see the
new signature byte (which shouldn't be a problem, given that the `file`
utility defaults to reading up to 1 MiB from files it is trying to identify).
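
To make the layout concrete, here is a rough sketch of what a reader of
the first 20 bytes would see. The 16-byte header layout (magic number,
flags word, then either mtime and source size or an 8-byte source hash)
follows PEP 552. Everything about the trailing marker is an assumption
for illustration only: `STABLE_SIGNATURE`, its value, and its 4-byte
width are hypothetical, as is the helper name `inspect_pyc_prefix`.

```python
import importlib.util
import struct

PEP552_HEADER_SIZE = 16  # magic (4) + flags (4) + mtime/size or hash (8)
STABLE_SIGNATURE = b"PYC\x00"  # hypothetical marker, not defined by any PEP


def inspect_pyc_prefix(prefix):
    """Split the first 20 bytes of a pyc file into its header fields."""
    if len(prefix) < PEP552_HEADER_SIZE + len(STABLE_SIGNATURE):
        raise ValueError("need at least 20 bytes")
    magic = prefix[:4]
    (flags,) = struct.unpack_from("<I", prefix, 4)
    info = {
        "magic_matches": magic == importlib.util.MAGIC_NUMBER,
        "hash_based": bool(flags & 0b01),    # lowest flag bit (PEP 552)
        "check_source": bool(flags & 0b10),  # second-lowest flag bit
    }
    if info["hash_based"]:
        info["source_hash"] = prefix[8:16]
    else:
        info["mtime"], info["source_size"] = struct.unpack_from("<II", prefix, 8)
    # Hypothetical: the stable signature would sit right after the header,
    # so existing code that only reads bytes 0-15 is unaffected.
    info["has_signature"] = prefix[16:20] == STABLE_SIGNATURE
    return info
```

Code that only checks the magic number or the PEP 552 fields never looks
past offset 16, which is why placing the marker there would leave it
untouched.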

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

