[Python-ideas] Move optional data out of pyc files
Nick Coghlan
ncoghlan at gmail.com
Wed Apr 11 10:42:24 EDT 2018
On 11 April 2018 at 02:14, Serhiy Storchaka <storchaka at gmail.com> wrote:
> Currently pyc files contain data that is useful mostly for development and is
> not needed in most normal cases in a stable program. There is even an option
> that allows excluding part of this information from pyc files. It is
> expected that this saves memory, startup time, and disk space (or the time
> of loading from network). I propose to move this data from pyc files into
> a separate file or files. pyc files should contain only references to
> external files. If the corresponding external file is absent or a specific
> option suppresses them, references are replaced with None or NULL at import
> time, otherwise they are loaded from external files.
>
> 1. Docstrings. They are needed mainly for development.
>
> 2. Line numbers (lnotab). They are helpful for formatting tracebacks, for
> tracing, and debugging with the debugger. Sources are helpful in such cases
> too. If the program doesn't contain errors ;-) and is shipped without
> sources, they could be removed.
>
> 3. Annotations. They are used mainly by third party tools that statically
> analyze sources. They are rarely used at runtime.
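For readers less familiar with where these three kinds of data live at runtime, here is a small illustration. The function `area` is made up for the example, and `dis.findlinestarts` is used only as a portable way to read the line number table; none of this is part of the proposal itself.

```python
import dis


def area(width: float, height: float) -> float:
    """Return the area of a rectangle."""
    return width * height


# 1. Docstring: stored on the function object (dropped under -OO).
docstring = area.__doc__

# 2. Line numbers: stored in the code object's line table, read here
#    as (bytecode offset, line number) pairs.
line_starts = list(dis.findlinestarts(area.__code__))

# 3. Annotations: exposed as a mapping on the function object.
annotations = area.__annotations__

print(docstring)
print(line_starts)
print(annotations)
```

None of the three is needed to execute `area(2.0, 3.0)`, which is the observation the proposal rests on.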
While I don't think the default inline pyc format should change, in my
ideal world I'd like to see the optimized format change to a
side-loading model where these things are still emitted, but they're
placed in a separate metadata file that isn't loaded by default.
The metadata file would then be lazily loaded at runtime, such that
`-O` gave you the memory benefits of `-OO`, but
docstrings/annotations/source line references/etc could still be
loaded on demand if something actually needed them. This approach
would also mitigate the valid points Chris Angelico raises around hot
reloading support - we could just declare that it requires even more
care than usual to use hot reloading in combination with `-O`.
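A rough sketch of the lazy side-loading idea, purely for illustration. The class name `SideloadedMetadata`, the JSON layout, and the `spam.meta.json` file name are all hypothetical; nothing like this exists in CPython today, and a real design would presumably use a binary format alongside the pyc file.

```python
import json
import tempfile
from pathlib import Path


class SideloadedMetadata:
    """Hypothetical lazy view over a metadata file kept next to a pyc file."""

    def __init__(self, path):
        self._path = Path(path)
        self._data = None
        self.loaded = False  # nothing has been read from disk yet

    def _load(self):
        # Read the sidecar file only on first use.
        if not self.loaded:
            try:
                text = self._path.read_text(encoding="utf-8")
                self._data = json.loads(text)
            except FileNotFoundError:
                # Sidecar absent: behave as if the data had been stripped.
                self._data = {}
            self.loaded = True
        return self._data

    def docstring(self, qualname):
        return self._load().get("docstrings", {}).get(qualname)


with tempfile.TemporaryDirectory() as tmp:
    sidecar = Path(tmp) / "spam.meta.json"  # hypothetical file name
    payload = {"docstrings": {"spam.eggs": "Return eggs."}}
    sidecar.write_text(json.dumps(payload), encoding="utf-8")

    meta = SideloadedMetadata(sidecar)
    loaded_before = meta.loaded         # no I/O has happened yet
    doc = meta.docstring("spam.eggs")   # first access triggers the load
    loaded_after = meta.loaded

    absent = SideloadedMetadata(Path(tmp) / "absent.meta.json")
    missing = absent.docstring("spam.eggs")  # falls back to None
```

The point of the sketch is only that the import-time cost is paid when something asks for the metadata, and that a missing sidecar file degrades to the same `None` result that `-OO` gives today.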
Bonus points if the sideloaded metadata file could be designed in such
a way that an extension module compiler like Cython or an alternate
pyc compiler frontend like Hylang could use it to provide relevant
references back to the original source code (JavaScript's source maps
may provide inspiration on that front).
Cheers,
Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia