Technically, the main concern may be the size of the unmarshalled pyc files
in memory, more than their storage size on disk.
On Fri, 7 May 2021, 23:04 Antoine Pitrou wrote:

On Fri, 7 May 2021 22:45:38 +0100 Pablo Galindo Salgado wrote:

The cost of this is having the start column number and end column number information for every bytecode instruction, and this is what we want to discuss (there is also some stack cost to re-raise exceptions but that's not a big problem in any case). Given that column numbers are not very big compared with line numbers, we plan to store these as unsigned chars or unsigned shorts. We ran some experiments over the standard library and we found that the overhead of all pyc files is:
* If we use shorts, the total overhead is ~3% (total size 28 MB and the extra size is 0.88 MB).
* If we use chars, the total overhead is ~1.5% (total size 28 MB and the extra size is 0.44 MB).
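
For a rough feel of where those numbers come from, here is a minimal sketch that counts bytecode instructions and multiplies by two columns (start and end) per instruction. It assumes one pair of column values per instruction, stored either as unsigned chars (`"B"`) or unsigned shorts (`"H"`); the helper names `count_instructions` and `column_table_bytes` are made up for illustration and are not part of the proposal.

```python
import array
import dis
import types


def count_instructions(code):
    """Count bytecode instructions in a code object and all nested code objects."""
    total = sum(1 for _ in dis.get_instructions(code))
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            total += count_instructions(const)
    return total


def column_table_bytes(code, typecode):
    """Estimate the extra bytes for a start and an end column per instruction.

    typecode is an array typecode: "B" for unsigned char, "H" for unsigned short.
    """
    per_column = array.array(typecode).itemsize
    return count_instructions(code) * 2 * per_column


source = "def f(x):\n    return x + 1\n"
code = compile(source, "<example>", "exec")

chars = column_table_bytes(code, "B")
shorts = column_table_bytes(code, "H")
print(f"instructions: {count_instructions(code)}")
print(f"extra bytes with chars: {chars}, with shorts: {shorts}")
```

Under this assumption the short-based table is always twice the size of the char-based one, which matches the 0.88 MB versus 0.44 MB figures above.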
More generally, if some people in 2021 are still concerned with the size of pyc files (why not), how about introducing a new version of the pyc format with built-in LZ4 compression?
LZ4 decompression is extremely fast on modern CPUs (several GB/s) and vendoring the C library should be simple. https://github.com/lz4/lz4
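
As a rough sketch of the idea, the marshalled code object could be compressed before it is written and decompressed before it is unmarshalled. The example below uses `zlib` from the standard library purely as a stand-in, because LZ4 bindings are not part of the standard library; it does not reproduce the real pyc header or any actual format change, and the generated module source is invented for the example.

```python
import marshal
import zlib

# Build a module with many small functions so there is something to compress.
source = "\n".join(
    f"def func_{i}(x):\n    return x + {i}\n" for i in range(200)
)
code = compile(source, "<example>", "exec")

# This is the payload a pyc file stores after its header.
raw = marshal.dumps(code)

# Stand-in for LZ4: compress on write, decompress on load.
packed = zlib.compress(raw)
restored = marshal.loads(zlib.decompress(packed))

# The restored code object behaves like the original one.
namespace = {}
exec(restored, namespace)
print(f"marshalled: {len(raw)} bytes, compressed: {len(packed)} bytes")
print(namespace["func_7"](1))
```

Compression of this kind only shrinks the file on disk. The unmarshalled code object takes the same space in memory either way.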
Regards
Antoine.
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/PQZ6OTWG...
Code of Conduct: http://python.org/psf/codeofconduct/