
A .pyc file is made up of these three elements:

    4 byte magic number
    4 byte timestamp
    marshaled code object

A .pyo file is the same except the code object has been optimized.  I ask you: why gunk up the filesystem with two files when one would do?

I propose we change the pyc file so it can contain multiple code objects.  Or, indeed, multiple arbitrary objects.  The .pyc file could become a general-purpose cache for data relevant to the .py file.  For example, perhaps the Unladen Swallow guys could cache post-JITted code.  Or the wordcode-based interpreter could cache its wordcode equivalent.

Lots of implementation choices suggest themselves.  Here's the cheapest approach that seems suitable:

    4 byte magic number
    4 byte timestamp
    marshaled array of pairs of ints, alternating "id of object" with
        "relative offset of object from the end of this marshaled array"
    marshaled code object 1
    marshaled code object 2
    ...

If you look for your cached object inside and it's not there, you compute your object then rewrite the file adding your id and offset to the end of the array.  (Your offset will always be the former size of the file.)  If the timestamp has changed, blow away all objects and start over with an empty array.

If it'd be too disruptive to change .pyc/.pyo files this way, would switching to a new file extension be better?

I suspect this probably isn't actually a good idea,

/larry/
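
For illustration only, here is a minimal sketch of the layout proposed above, working on in-memory bytes rather than a real .pyc file. It is not how CPython reads or writes .pyc files. The function names (build, split, lookup, add), the little-endian header packing, and the magic number and object ids used in the usage note are all hypothetical. Offsets are measured from the end of the marshaled table, so a newly added object's offset equals the combined size of the objects already stored.

```python
import io
import marshal
import struct

# Assumption: magic and timestamp are packed as two little-endian 32-bit ints.
HEADER = struct.Struct("<II")


def build(magic, timestamp, table, body):
    """Assemble header + marshaled (id, offset) table + concatenated objects."""
    return HEADER.pack(magic, timestamp) + marshal.dumps(table) + body


def split(data):
    """Return (magic, timestamp, table, body) parsed from the file bytes."""
    magic, timestamp = HEADER.unpack_from(data)
    stream = io.BytesIO(data)
    stream.seek(HEADER.size)
    table = marshal.load(stream)   # flat list: id, offset, id, offset, ...
    body = data[stream.tell():]    # everything after the marshaled table
    return magic, timestamp, table, body


def lookup(data, timestamp, obj_id):
    """Return the cached object for obj_id, or None if missing or stale."""
    _magic, stored, table, body = split(data)
    if stored != timestamp:
        return None                # source changed: the cache is stale
    for entry_id, offset in zip(table[0::2], table[1::2]):
        if entry_id == obj_id:
            # marshal.loads ignores the bytes that follow the object.
            return marshal.loads(body[offset:])
    return None


def add(data, timestamp, obj_id, obj):
    """Return new file bytes with obj stored under obj_id."""
    magic, stored, table, body = split(data)
    if stored != timestamp:
        table, body = [], b""      # blow away all objects and start over
    table = table + [obj_id, len(body)]
    return build(magic, timestamp, table, body + marshal.dumps(obj))
```

A usage sketch with made-up ids (1 for the code object, 2 for some other cached payload):

```python
data = build(0x12345678, 100, [], b"")
data = add(data, 100, 1, compile("x = 1", "<demo>", "exec"))
data = add(data, 100, 2, ("wordcode", b"\x01\x02"))
cached = lookup(data, 100, 2)
```

Because the table grows on every add, the whole file has to be rewritten each time, which matches the "rewrite the file" step described above.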