On Tue, Oct 3, 2017, at 08:03, Barry Warsaw wrote:
Guido van Rossum wrote:
There have been no further comments. PEP 552 is now accepted.
Congrats, Benjamin! Go ahead and send your implementation for review.
Oops. Let me try that again.
While I'm very glad PEP 552 has been accepted, it occurs to me that it will now be more difficult to parse the various pyc file formats from Python. E.g. I used to be able to just open the pyc in binary mode, read all the bytes, and then lop off the first 8 bytes to get to the code object. With the addition of the source file size, I now have to (maybe, if I have to also read old-style pyc files) lop off the front 12 bytes, but okay.
With PEP 552, I have to do a lot more work to just get at the code object. How many bytes at the front of the file do I need to skip past? What about all the metadata at the front of the pyc, how do I interpret that if I want to get at it from Python code?
As Guido points out, the header is now always just four 32-bit words rather than three. Not long ago we went through the transition from two to three words without widespread disaster.
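To make the "four 32-bit words" point concrete, here is a minimal sketch of splitting a PEP 552 pyc into its header fields and code object. The helper name split_pyc and the shape of its return value are made up for illustration and are not part of importlib; the layout assumed is the one described in PEP 552 (magic number, flags word, then either mtime and source size, or an 8-byte source hash when the low flag bit is set).

```python
import marshal
import struct

HEADER_SIZE = 16  # four 32-bit words, per PEP 552


def split_pyc(data):
    """Hypothetical helper: split raw pyc bytes into (magic, flags, fields, code)."""
    if len(data) < HEADER_SIZE:
        raise ValueError("truncated pyc header")
    magic = data[:4]
    # Second word: little-endian bit field.
    (flags,) = struct.unpack("<I", data[4:8])
    if flags & 0x1:
        # Hash-based pyc: words 3 and 4 hold a 64-bit hash of the source.
        fields = {
            "source_hash": data[8:16],
            "check_source": bool(flags & 0x2),
        }
    else:
        # Timestamp-based pyc: source mtime, then source size.
        mtime, source_size = struct.unpack("<II", data[8:16])
        fields = {"mtime": mtime, "source_size": source_size}
    # Everything after the header is the marshalled code object.
    code = marshal.loads(data[HEADER_SIZE:])
    return magic, flags, fields, code
```

This only handles the PEP 552 layout; reading older 8-byte or 12-byte headers would still require checking the magic number first to decide how much to skip.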
Should the PEP 552 implementation add an API, probably to importlib.util, that would understand all current and future formats? Something like this perhaps?
class PycFileSpec:
    magic_number: bytes
    timestamp: Optional[bytes]  # maybe an int? datetime?
    source_size: Optional[bytes]
    bit_field: Optional[bytes]
    code_object: bytes

def parse_pyc(path: str) -> PycFileSpec:
I'm not sure turning the implementation details of our internal formats into APIs is the way to go.