On Tue, 10 Apr 2018 19:14:58 +0300 Serhiy Storchaka <storchaka@gmail.com> wrote:
Currently pyc files contain data that is useful mostly during development and is not needed in most normal cases in a stable program. There is even an option that excludes part of this information from pyc files. This is expected to save memory, startup time, and disk space (or the time of loading over a network).

I propose to move this data out of pyc files into a separate file or files. The pyc files would contain only references to the external files. If the corresponding external file is absent, or a specific option suppresses it, the references are replaced with None or NULL at import time; otherwise the data is loaded from the external files.
1. Docstrings. They are needed mainly during development.
2. Line numbers (lnotab). They are helpful for formatting tracebacks, for tracing, and for debugging. Sources are helpful in such cases too. If the program doesn't contain errors ;-) and is shipped without sources, they could be removed.
3. Annotations. They are used mainly by third-party tools that statically analyze sources. They are rarely used at runtime.
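
To get a rough feel for how much of a code object the docstrings account for, one can compare the marshalled size at two optimization levels. This is only an illustration using the existing compile() and marshal APIs, not part of the proposal; the SOURCE string and the helper names are made up for the example, and optimize=2 (the equivalent of -OO) also drops assert statements, which this sample does not contain.

```python
import marshal

# Sample module source, invented for this illustration.
SOURCE = '''
def greet(name: str) -> str:
    """Return a greeting for *name*.

    This docstring is deliberately long so that its cost is visible
    in the size of the marshalled code object.
    """
    return "Hello, " + name
'''


def marshalled_size(optimize):
    # optimize=0 keeps docstrings; optimize=2 corresponds to -OO and drops them.
    code = compile(SOURCE, "<example>", "exec", optimize=optimize)
    return len(marshal.dumps(code))


def docstring_after_exec(optimize):
    # Run the compiled module and report what greet.__doc__ ends up as.
    namespace = {}
    exec(compile(SOURCE, "<example>", "exec", optimize=optimize), namespace)
    return namespace["greet"].__doc__


full = marshalled_size(0)
stripped = marshalled_size(2)
print(f"with docstrings: {full} bytes, without: {stripped} bytes")
```

The exact byte counts depend on the Python version, so none are quoted here.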
Docstrings will be read from the corresponding docstring file unless -OO is supplied. This would also make it possible to localize docstrings: depending on the locale or other settings, a different docstring file can be used.
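
A minimal sketch of that lookup, under stated assumptions: the per-locale "files" are plain dicts here, and attach_docstrings() and DOCSTRING_FILES are hypothetical names invented for the example, not an existing or proposed API. A missing file or entry leaves the docstring as None, as described above.

```python
import inspect
import sys

# Stand-ins for per-locale docstring files; the layout is invented for this sketch.
DOCSTRING_FILES = {
    "en": {"greet": "Return a greeting for name."},
    "fr": {"greet": "Renvoie une salutation pour name."},
}


def greet(name):
    return "Hello, " + name


def attach_docstrings(namespace, lang, strip=None):
    # strip=None means "follow the interpreter": -OO sets sys.flags.optimize to 2.
    if strip is None:
        strip = sys.flags.optimize >= 2
    table = None if strip else DOCSTRING_FILES.get(lang)
    for name, obj in list(namespace.items()):
        if inspect.isfunction(obj):
            # Missing file or missing entry: the reference becomes None.
            obj.__doc__ = table.get(name) if table else None


attach_docstrings({"greet": greet}, "fr", strip=False)
print(greet.__doc__)
```

In a real implementation the substitution would happen in the import machinery rather than by patching __doc__ after the fact; the sketch only shows the selection logic.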
An alternate proposal would be to have separate sections in a single marshal file. The main section (containing the loadable module) would have references to the other sections. This way it's easy for the loader to say "all references to the docstring section and/or to the annotation section are replaced with None", depending on how Python is started. It would also be possible to do it on disk with a strip-like utility.

I'm not volunteering to do all this, so just my 2 cents ;-)

Regards

Antoine.