[Python-Dev] Proposal: .pyc file format change
Peter Funk
pf@artcom-gmbh.de
Fri, 26 May 2000 13:50:02 +0200 (MEST)
[M.-A. Lemburg]:
> > Proposal:
> > The future format (Python 1.6 and newer) of a .pyc file should be as follows:
> >
> > bytes 0-3 a new magic number, which should be definitely frozen in 1.6.
> > bytes 4-7 a version number (which should be == 1 in Python 1.6)
> > bytes 8-11 timestamp (mtime of .py file) (same as earlier)
> > bytes 12-* marshalled code object (same as earlier)
>
> This will break all tools relying on having the code object available
> in bytes[8:] and believe me: there are lots of those around ;-)
In some way, this is intentional: If these tools (are there really
that many out there that munge .pyc byte code files?) simply use
'imp.get_magic()' and then silently assume a specific content of the
marshalled code object, they probably need changes anyway, since the
code needed to deal with the new unicode object is missing from them.
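To make the layout under discussion concrete, a reader for the proposed 12-byte header might look roughly like the sketch below. Several things here are assumptions rather than part of the proposal: the fields are taken to be little-endian 32-bit integers, the magic value b"PYC1" is made up, and the helper names build_pyc and read_pyc are purely illustrative.

```python
import marshal
import struct

# Hypothetical values: the proposal fixes neither the magic bytes nor the
# byte order, so both are assumptions made for this sketch.
DEMO_MAGIC = b"PYC1"
HEADER = struct.Struct("<4sII")  # magic, version, mtime of the .py file


def build_pyc(code, mtime, version=1):
    """Serialize a code object behind the proposed 12-byte header."""
    return HEADER.pack(DEMO_MAGIC, version, mtime) + marshal.dumps(code)


def read_pyc(data):
    """Return (version, mtime, code); raise ValueError on a bad header."""
    if len(data) < HEADER.size:
        raise ValueError("file too short for header")
    magic, version, mtime = HEADER.unpack_from(data)
    if magic != DEMO_MAGIC:
        raise ValueError("bad magic number")
    if version != 1:
        raise ValueError("unsupported version %d" % version)
    # The marshalled code object starts right after the header.
    return version, mtime, marshal.loads(data[HEADER.size:])
```

A tool written this way rejects a file with an unknown version before it ever tries to unmarshal the code object, which is the point of carrying the version in the header.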
> You cannot really change the file header, only add things to the end
> of the PYC file...
Why? Will this idea really cause such earth quaking grumbling?
Please review this in the context of my proposal to change 'imp.get_magic()'
to return the old 1.5.2 MAGIC when called without a parameter.
> Hmm, or perhaps we should move the version number to the code object
> itself... after all, the changes we want to refer to
> using the version number are located in the code object and not the
> PYC file layout. Unmarshalling it would then raise the error.
Since the file layout is a very thin layer around the marshalled
code object, this really makes no big difference to me. But it
will be harder to come up with reasonable entries for /etc/magic [1]
and similar mechanisms.
Putting the version number at the end of file is possible.
But such a solution is somewhat "dirty" and only gives the false
impression that the general file layout (pyc[8:] instead of pyc[12:])
is something you can rely on until the end of time. Hardcoding the
size of an unpadded header (something like using buffer[8:]) is IMO
bad style anyway.
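As a rough illustration of the alternative to a hardcoded buffer[8:], a tool could look up the header size from the magic number it actually finds. This is only a sketch: the two magic values and the function names are hypothetical, and the sizes simply mirror the 8-byte and 12-byte layouts discussed above.

```python
# Hypothetical magic values, for illustration only.
OLD_MAGIC = b"OLD\n"
NEW_MAGIC = b"NEW\n"

# Map each known magic number to the size of the header that goes with it.
HEADER_SIZES = {
    OLD_MAGIC: 8,   # magic + mtime
    NEW_MAGIC: 12,  # magic + version + mtime
}


def code_offset(data):
    """Return the offset of the marshalled code object in a .pyc image."""
    magic = bytes(data[:4])
    try:
        return HEADER_SIZES[magic]
    except KeyError:
        raise ValueError("unknown magic number %r" % magic) from None


def code_bytes(data):
    """Return the marshalled code object without assuming a fixed offset."""
    return data[code_offset(data):]
```

With a table like this, a layout change means adding one entry instead of hunting for every slice that assumed eight bytes.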
Regards, Peter
[1]: /etc/magic on Unices is a small textual database used by the 'file'
command to identify the type of a file by looking at the first
few bytes. Unix file managers may either use /etc/magic directly
or a similar scheme to associate files with mimetypes and/or default
applications.
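The kind of lookup described in this footnote can be sketched as a small rule table matched against the first bytes of a file. This is a simplified illustration, not the real /etc/magic syntax: the PNG, PDF and gzip signatures are the well-known ones, while the b"PYC1" entry and the identify function are hypothetical.

```python
# (offset, expected bytes, description). The .pyc entry uses a made-up
# magic value; a frozen magic number would make such an entry possible.
RULES = [
    (0, b"\x89PNG\r\n\x1a\n", "PNG image"),
    (0, b"%PDF-", "PDF document"),
    (0, b"\x1f\x8b", "gzip compressed data"),
    (0, b"PYC1", "Python compiled bytecode"),
]


def identify(head):
    """Return a description for the leading bytes of a file."""
    for offset, expected, description in RULES:
        if head[offset:offset + len(expected)] == expected:
            return description
    return "data"  # fallback when no rule matches
```

A rule of this shape only works if the bytes at a fixed offset stay stable across releases, which is why a frozen magic number followed by a separate version field is easier to describe than a magic number that changes with every bytecode revision.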