
On Wed, Jan 13, 2021 at 1:47 AM Larry Hastings <larry@hastings.org> wrote:
On 1/11/21 5:33 PM, Inada Naoki wrote:
Note that PEP 563 semantics allow a more efficient implementation: the annotations are just a single constant tuple, not a dict. We already have this efficient implementation in Python 3.10.
The efficient implementation in 3.10 can share tuples. If there are hundreds of methods with the same signature, their annotations are a single shared tuple, not hundreds of tuples. This is very efficient for auto-generated codebases. I think this PEP could share code objects for identical signatures too, by removing the co_firstlineno information.
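To illustrate the sharing argument, here is a minimal sketch, not CPython's actual code. It assumes a flat (name, value, name, value, ...) tuple of stringized annotations; the `_shared` cache and the `intern_annotations` helper are hypothetical stand-ins for the compiler's constant deduplication.

```python
import sys

# Hypothetical cache standing in for the compiler's constant deduplication.
_shared = {}


def intern_annotations(pairs):
    """Return one shared tuple object per distinct flat annotation tuple."""
    return _shared.setdefault(pairs, pairs)


def make_pairs():
    # Built at runtime so that every call produces a distinct tuple object.
    return tuple(["self", "Node", "value", "int", "return", "None"])


# Two "methods" with the same signature end up pointing at one tuple.
first = intern_annotations(make_pairs())
second = intern_annotations(make_pairs())

# The dict that would otherwise be built eagerly for each function.
as_dict = dict(zip(first[::2], first[1::2]))

print("shared:", first is second)
print("tuple bytes:", sys.getsizeof(first), "dict bytes:", sys.getsizeof(as_dict))
```

The exact byte counts depend on the Python version and platform, so treat them as indicative only.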
That's very clever! My co_annotations repo was branched from before this feature was added, and I haven't pulled and merged recently. So I hadn't seen it.
Please see this pull request too. It merges co_code and co_consts. It will save more RAM and import time for your implementation. https://github.com/python/cpython/pull/23056
Additionally, we should include the cost of loading annotations from PYC files, because most annotations are "load once, set once". Loading a "simple code object" from a pyc file is not so cheap. It may affect the import time and memory footprint of large annotated codebases.
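A rough way to compare the pyc payloads is to marshal both representations, since pyc files store marshalled objects. This is only a sketch: the function name `__co_annotations__` follows the PEP 649 draft, the flat tuple layout is an assumption, and serialized size is just a proxy for load cost.

```python
import builtins
import marshal
import types

# Assumed shape of a PEP 649 style annotation function for `def f(a: int) -> str`.
SOURCE = "def __co_annotations__():\n    return {'a': int, 'return': str}\n"

namespace = {}
exec(compile(SOURCE, "<annotations>", "exec"), namespace)
code = namespace["__co_annotations__"].__code__

# Assumed flat layout of the stringized annotations for the same signature.
stringized = ("a", "int", "return", "str")

code_bytes = marshal.dumps(code)
tuple_bytes = marshal.dumps(stringized)
print("code object:", len(code_bytes), "bytes; tuple:", len(tuple_bytes), "bytes")

# Loading the code object back still requires binding and calling it
# before any annotations exist.
restored = marshal.loads(code_bytes)
annotations = types.FunctionType(restored, {"__builtins__": builtins.__dict__})()
print(annotations)
```

The byte counts vary between Python versions, so no expected numbers are given here.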
I did some analysis in a separate message. The summary is, the code object for a single annotation costs us 232 bytes; that includes the code object itself, the bytestring for the bytecode, and the bytestring for the lnotab. This grows slowly as you add new parameters; the code object for ten parameters is 360 bytes.
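The measurement can be approximated like this. It is a sketch, not the script used for the figures above: `annotation_code` and `approx_size` are hypothetical helpers, and the totals differ between Python versions because the code object layout and the line table format have changed.

```python
import sys


def annotation_code(param_count):
    """Compile an assumed PEP 649 style annotation function with N parameters."""
    items = ", ".join(f"'p{i}': int" for i in range(param_count))
    source = f"def __co_annotations__():\n    return {{{items}}}\n"
    namespace = {}
    exec(compile(source, "<annotations>", "exec"), namespace)
    return namespace["__co_annotations__"].__code__


def approx_size(code):
    """Sum the code object, its bytecode string, and its line table."""
    # co_linetable replaced co_lnotab in Python 3.10.
    if hasattr(code, "co_linetable"):
        table = code.co_linetable
    else:
        table = code.co_lnotab
    return sys.getsizeof(code) + sys.getsizeof(code.co_code) + sys.getsizeof(table)


one = approx_size(annotation_code(1))
ten = approx_size(annotation_code(10))
print("1 parameter:", one, "bytes; 10 parameters:", ten, "bytes")
```

As in the analysis above, this counts only the code object, the bytecode, and the line table; the names and constants tuples are left out.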
It seems possible to create a hybrid of these two approaches! Here's my idea: instead of the compiler storing a code object as the annotations argument to MAKE_FUNCTION, store a tuple containing the fields you'd need to recreate the code object at runtime--bytecode, lnotab, names, consts, etc. func_get_annotations would create the code object from that, bind it to a function object, call it, and return the result. These code-object-tuples would then be automatically shared in the .pyc file and at runtime the same way that 3.10 shares the tuples of stringized annotations today.
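The hybrid idea can be sketched in pure Python as follows. This is an illustration under assumptions, not the proposed C implementation: `pack`, `evaluate`, `FIELDS`, and the template code object are hypothetical, and the set of fields that must be stored is a guess that works for simple annotation functions with no arguments, locals, or closures.

```python
import builtins
import types


def compile_annotations(expr):
    """Compile an assumed PEP 649 style annotation function returning `expr`."""
    source = f"def __co_annotations__():\n    return {expr}\n"
    namespace = {}
    exec(compile(source, "<annotations>", "exec"), namespace)
    return namespace["__co_annotations__"].__code__


# A trivial code object supplies every field that is not stored in the tuple.
_TEMPLATE = compile_annotations("None")

# co_linetable replaced co_lnotab in Python 3.10.
_LINETABLE = "co_linetable" if hasattr(_TEMPLATE, "co_linetable") else "co_lnotab"
FIELDS = ("co_code", "co_consts", "co_names", "co_stacksize", _LINETABLE)


def pack(code):
    """Reduce a code object to a tuple of the fields needed to rebuild it."""
    return tuple(getattr(code, name) for name in FIELDS)


def evaluate(fields, globals_dict):
    """Rebuild the code object, bind it to a function, call it, return the result."""
    code = _TEMPLATE.replace(**dict(zip(FIELDS, fields)))
    return types.FunctionType(code, globals_dict)()


module_globals = {"__builtins__": builtins.__dict__}
fields_a = pack(compile_annotations("{'x': int, 'return': str}"))
fields_b = pack(compile_annotations("{'x': int, 'return': str}"))

print("identical signatures pack equal:", fields_a == fields_b)
print(evaluate(fields_a, module_globals))
```

Because the packed tuples for identical signatures compare equal, they could be deduplicated the same way constant tuples are. The tuple also leaves out members such as the filename, the name, and co_firstlineno.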
It may be a good idea if we can strip most code object members, like argcount, kwonlyargcount, nlocals, flags, freevars, cellvars, filename, name, firstlineno, linetable. The result could be smaller than in Python 3.9.
That said, I suggest PEP 649's memory consumption isn't an urgent consideration in choosing to accept or reject it. PEP 649 is competitive in terms of startup time and memory usage with PEP 563, and PEP 563 was accepted and shipped with several versions of Python.
I still want a real-world application/library with heavy annotations. My goal is to use annotations in the stdlib without caring about resource usage or import time. But I agree with you if PEP 649 ends up smaller than Python 3.9.

Regards,
--
Inada Naoki <songofacandy@gmail.com>