[Python-ideas] Merge .pyc into .pyo--store multiple code objects in one file?
Larry Hastings
larry at hastings.org
Tue Feb 2 05:07:24 CET 2010
A .pyc file is made up of these three elements:
    4 byte magic number
    4 byte timestamp
    marshaled code object
A .pyo file is the same except the code object has been optimized.
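
To make the layout above concrete, here is a minimal Python sketch that packs and unpacks those three elements in memory. It assumes the timestamp is a little-endian unsigned 32-bit integer, and it uses a made-up magic value (b"demo") and an arbitrary timestamp rather than real ones. The helper names are hypothetical. Later CPython releases use a longer header, so this is not a reader for current .pyc files.

```python
import marshal
import struct

def pack_legacy_pyc(magic, mtime, code):
    # 4 byte magic + 4 byte timestamp + marshaled code object
    return magic + struct.pack("<I", mtime) + marshal.dumps(code)

def unpack_legacy_pyc(data):
    magic = data[:4]
    (mtime,) = struct.unpack("<I", data[4:8])
    code = marshal.loads(data[8:])
    return magic, mtime, code

# b"demo" is a placeholder, not a real interpreter magic number.
code = compile("x = 6 * 7", "<example>", "exec")
blob = pack_legacy_pyc(b"demo", 1265083644, code)

magic, mtime, loaded = unpack_legacy_pyc(blob)
namespace = {}
exec(loaded, namespace)  # run the round-tripped code object
```
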
I ask you: why gunk up the filesystem with two files when one would do?
I propose we change the pyc file so it can contain multiple code
objects. Or, indeed, multiple arbitrary objects. The .pyc file could
become a general-purpose cache for data relevant to the .py file.
For example, perhaps the Unladen Swallow guys could cache post-JITted
code. Or the wordcode-based interpreter could cache its wordcode
equivalent.
Lots of implementation choices suggest themselves. Here's the cheapest
approach that seems suitable:
    4 byte magic number
    4 byte timestamp
    marshaled array of pairs of ints, alternating "id of object"
        with "relative offset of object from the end of this marshaled
        array"
    marshaled code object 1
    marshaled code object 2
    ...
If you look for your cached object inside and it's not there, you
compute your object, then rewrite the file, adding your id and offset to
the end of the array. (Since offsets are relative to the end of the
array, your offset will always be the former combined size of the
stored objects.) If the timestamp has changed, blow away all objects and
start over with an empty array.
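
The lookup-and-rewrite cycle described above might be sketched in Python roughly as follows. This works on bytes in memory rather than on a real file, and everything named here (MAGIC, split, join, lookup, store, the ids 1 and 2) is hypothetical and chosen only for illustration. The index is a flat list [id, offset, id, offset, ...], and offsets count from the end of the marshaled index, so a new entry's offset is the current size of the object area.

```python
import io
import marshal
import struct

MAGIC = b"demo"  # placeholder, not a real interpreter magic number

def split(data):
    """Return (mtime, index, payload) for a container."""
    f = io.BytesIO(data)
    f.read(4)                                  # skip the magic number
    (mtime,) = struct.unpack("<I", f.read(4))
    index = marshal.load(f)                    # stops right after the index
    return mtime, index, data[f.tell():]

def join(mtime, index, payload):
    return MAGIC + struct.pack("<I", mtime) + marshal.dumps(index) + payload

def lookup(data, obj_id):
    """Return the cached object for obj_id, or None if it is absent."""
    _, index, payload = split(data)
    for i in range(0, len(index), 2):
        if index[i] == obj_id:
            # marshal.loads ignores bytes that follow the first object
            return marshal.loads(payload[index[i + 1]:])
    return None

def store(data, source_mtime, obj_id, obj):
    """Rewrite the container with obj appended under obj_id."""
    mtime, index, payload = split(data)
    if mtime != source_mtime:
        index, payload = [], b""               # stale: start over
    index = index + [obj_id, len(payload)]
    return join(source_mtime, index, payload + marshal.dumps(obj))

ID_PLAIN, ID_OPT = 1, 2                        # made-up object ids
src = "x = 6 * 7"
cache = join(100, [], b"")                     # empty container
cache = store(cache, 100, ID_PLAIN, compile(src, "<demo>", "exec"))
cache = store(cache, 100, ID_OPT,
              compile(src, "<demo>", "exec", optimize=1))
```

A real implementation would also need to decide how concurrent writers rewrite the file safely; the sketch leaves that out.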
If it'd be too disruptive to change .pyc/.pyo files this way, would
switching to a new file extension be better?
I suspect this probably isn't actually a good idea,
/larry/