[Python-Dev] Proposal: .pyc file format change

M.-A. Lemburg mal@lemburg.com
Fri, 26 May 2000 16:27:03 +0200

Peter Funk wrote:
> [M.-A. Lemburg]:
> > > Proposal:
> > > The future format (Python 1.6 and newer) of a .pyc file should be as follows:
> > >
> > > bytes 0-3   a new magic number, which should be definitely frozen in 1.6.
> > > bytes 4-7   a version number (which should be == 1 in Python 1.6)
> > > bytes 8-11  timestamp (mtime of .py file) (same as earlier)
> > > bytes 12-*  marshalled code object (same as earlier)
> >
> > This will break all tools relying on having the code object available
> > in bytes[8:] and believe me: there are lots of those around ;-)
> In some way, this is intentional: if these tools (are there really
> that many out there that munge .pyc byte code files?) simply use
> 'imp.get_magic()' and then silently assume a specific content of the
> marshalled code object, they probably need changes anyway, since the
> code needed to deal with the new unicode object is missing from them.

That's why I proposed to change the marshalled code object
and not the PYC file: the problem is not only related to 
PYC files, it touches all areas where marshal is used. If 
you try to load a code object using Unicode in Python 1.5
you'll get all sorts of errors, e.g. EOFError, SystemError.
Since marshal uses a specific format, that format should
receive the version number.

Ideally that version would be prepended to the format (not sure
whether this is possible), so that the PYC file layout
would then look like this:

word 0: magic
word 1: timestamp
word 2: version in the marshalled code object
word 3-*: rest of the marshalled code object
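
A minimal sketch of reading and writing the layout above, for illustration
only. It assumes 32-bit little-endian words and bolts the version word on in
front of the output of marshal.dumps(); the marshal module does not emit such
a prefix itself, and the function names and magic value are hypothetical.

```python
import marshal
import struct

# Words 0-2 of the sketched layout: magic, timestamp, marshal-format version.
# Little-endian 32-bit words are an assumption, not part of the proposal.
HEADER = struct.Struct("<III")

HYPOTHETICAL_MAGIC = 0x0A0D1234  # placeholder value, not a real Python magic


def write_pyc_sketch(magic, mtime, version, code):
    """Return header words 0-2 followed by the marshalled code object."""
    return HEADER.pack(magic, mtime, version) + marshal.dumps(code)


def read_pyc_sketch(data, expected_version):
    """Check the version word before handing the rest to marshal."""
    magic, mtime, version = HEADER.unpack_from(data, 0)
    if version != expected_version:
        # Fail early with a clear message instead of an EOFError or
        # SystemError from deep inside marshal.
        raise ValueError("unsupported marshal format version: %d" % version)
    code = marshal.loads(data[HEADER.size:])
    return magic, mtime, code
```

The point of the version check is that a reader which does not understand the
format can refuse the data before trying to unmarshal it.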

Please make sure that options such as the -U option are
also respected...


A different approach to all this would be fixing only the
first two bytes of the magic word, e.g.

byte 0: 'P'
byte 1: 'Y'
byte 2: version number (counting from 1)
byte 3: option byte (8 bits: one for each option;
                     bit 0: -U cmd switch)

This would be backward compatible and still provide file(1)
with enough information to be able to tell the file type.
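
A minimal sketch of packing and unpacking such a four-byte magic word. The
names OPT_UNICODE, make_magic and parse_magic are hypothetical, and only
bit 0 of the option byte (the -U switch) is given a meaning here.

```python
OPT_UNICODE = 0x01  # bit 0 of the option byte: -U command-line switch


def make_magic(version, options=0):
    """Build the 4-byte magic word: 'P', 'Y', version, option byte."""
    if not 1 <= version <= 255:
        raise ValueError("version must fit in one byte and count from 1")
    if not 0 <= options <= 255:
        raise ValueError("options must fit in one byte")
    return b"PY" + bytes([version, options])


def parse_magic(magic):
    """Return (version, unicode_flag) or raise if the prefix is not 'PY'."""
    if len(magic) != 4 or magic[:2] != b"PY":
        raise ValueError("not a PY magic word")
    version, options = magic[2], magic[3]
    return version, bool(options & OPT_UNICODE)
```

Because the first two bytes never change, a tool such as file(1) can match on
them alone and treat bytes 2 and 3 as extra detail.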

Marc-Andre Lemburg
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/