[Python-ideas] Move optional data out of pyc files
INADA Naoki
songofacandy at gmail.com
Thu Apr 12 06:48:07 EDT 2018
> Finally, loading docstrings and other optional components can be made lazy.
> This was not in my original idea, and this will significantly complicate the
> implementation, but in principle it is possible. This will require larger
> changes in the marshal format and bytecode.
I'm +1 on this idea.
* The new pyc format has a code section (same as the current format) and a
  text section. The text section stores UTF-8 strings and is not loaded at
  import time.
* Function annotations (only when PEP 563 is used) and docstrings are
  stored as integers that point to offsets in the text section.
* When type.__doc__, PyFunction.__doc__, or PyFunction.__annotations__ is an
  integer, the text is loaded lazily from the text section.
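To make the list above concrete, here is a minimal pure-Python sketch of the lookup. Everything in it is an assumption for illustration: TextSection, LazyText, and FakeFunction are hypothetical names, and the length-prefixed layout is just one possible encoding, not a proposed pyc format. The real change would live in marshal and in the C-level descriptors.

```python
import struct


class TextSection:
    """Hypothetical text section: length-prefixed UTF-8 strings addressed by offset."""

    def __init__(self):
        self._buf = bytearray()

    def add(self, text):
        """Append a string and return the offset that refers to it."""
        offset = len(self._buf)
        data = text.encode("utf-8")
        self._buf += struct.pack("<I", len(data)) + data
        return offset

    def load(self, offset):
        """Decode the string stored at the given offset."""
        (size,) = struct.unpack_from("<I", self._buf, offset)
        start = offset + 4
        return bytes(self._buf[start:start + size]).decode("utf-8")


class LazyText:
    """Descriptor: an int value is treated as an offset and resolved on first access."""

    def __set_name__(self, owner, name):
        self._slot = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = getattr(obj, self._slot)
        if isinstance(value, int):
            value = obj.section.load(value)
            setattr(obj, self._slot, value)  # cache the decoded text
        return value


class FakeFunction:
    """Stand-in for a function object; not the real PyFunction."""

    doc = LazyText()

    def __init__(self, section, doc_offset):
        self.section = section
        self._doc = doc_offset  # only the integer offset is stored at "import time"


section = TextSection()
section.add("Unrelated docstring.")
offset = section.add("Add two numbers.")
func = FakeFunction(section, offset)
print(func.doc)
```

Until func.doc is read, the object holds only an integer; the string is decoded and cached on first access.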
PEP 563 will reduce some startup time, but __annotations__ is still a
dict. Its memory overhead is not negligible:
In [1]: def foo(a: int, b: int) -> int:
   ...:     return a + b
   ...:
In [2]: import sys
In [3]: sys.getsizeof(foo)
Out[3]: 136
In [4]: sys.getsizeof(foo.__annotations__)
Out[4]: 240
When PEP 563 is used, building the annotations has no side effects,
so the annotations can be serialized as text, like
{"a":"int","b":"int","return":"int"}.
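As a rough sketch of that serialization: under PEP 563 the annotations are kept as strings, which is emulated here with explicit string literals so the snippet does not depend on a __future__ import. Using json is my assumption; the actual text encoding is not decided.

```python
import json


# String literals stand in for what PEP 563 would store for `a: int` etc.
def foo(a: "int", b: "int") -> "int":
    return a + b


# Compact separators give the same shape as the example above.
text = json.dumps(foo.__annotations__, separators=(",", ":"))
print(text)

# Loading the text back rebuilds an equal dict, so nothing is lost.
restored = json.loads(text)
print(restored == foo.__annotations__)
```

The dict would only need to be rebuilt from the text when __annotations__ is actually accessed.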
This change will require a new pyc format and descriptors for
PyFunction.__doc__, PyFunction.__annotations__,
and type.__doc__.
Regards,
--
INADA Naoki <songofacandy at gmail.com>