
On 1/11/21 2:32 PM, Łukasz Langa wrote:
1. What do you anticipate the memory usage will look like for your solution compared to PEP 563?
It depends on the scenario. I talk about three runtime scenarios in PEP 649, but the most relevant one is "annotations are defined but never examined", because it's by far the most common for people using annotations. So for now let's just talk about that. In this scenario, I expect PEP 649 to be on par with PEP 563. PEP 563 will define a small dict that nobody looks at; PEP 649 will define a simple code object that nobody runs. These objects consume pretty similar amounts of memory.

A quick experiment: on my 64-bit Linux laptop, with a function that had three annotated parameters, sys.getsizeof() of the resulting annotations dict was 232 bytes. PEP 649 generated a 176-byte code object--but we should also factor in its bytecode (45 bytes) and lnotab (33 bytes), giving us 254 bytes. (The other fields of the code object are redundant references to stuff we already had lying around.) In that case PEP 649 is slightly bigger. But if we change it to twelve annotated parameters, PEP 649 becomes a big win: the dict is now 640 bytes (!), while the code object only totals 280 bytes. The crossover seems to be at about five parameters; below that the dict wins by a little, and above that the code object wins by more and more.
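The experiment above can be approximated in a few lines. This is a sketch, not the PEP 649 prototype itself: it writes the annotations as string literals to get the same dict shape PEP 563 produces, and compiles an equivalent dict-building expression as a stand-in for the PEP 649 code object. Exact byte counts vary by Python version and platform.

```python
import sys

# PEP 563 stores annotations as a dict of strings.  Writing the
# annotations as string literals gives us the same dict shape directly.
def f(a: "int", b: "str", c: "float") -> "None": ...

dict_size = sys.getsizeof(f.__annotations__)

# PEP 649 instead stores a code object that builds the dict on demand.
# As a stand-in, compile an equivalent expression and measure the code
# object plus its bytecode and line-number table.  (co_linetable replaced
# co_lnotab in CPython 3.10, hence the getattr fallback.)
code = compile("{'a': int, 'b': str, 'c': float, 'return': None}",
               "<annotations>", "eval")
line_table = getattr(code, "co_linetable", None) or code.co_lnotab
code_size = (sys.getsizeof(code)
             + sys.getsizeof(code.co_code)
             + sys.getsizeof(line_table))

print("dict:", dict_size, "bytes; code object:", code_size, "bytes")
```

Neither measurement counts objects shared with the rest of the interpreter, which matches the post's accounting: only the parts unique to each representation are tallied.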
2. What is your expected startup performance of an annotated Python application using co_annotations?
Again, the most relevant scenario is "annotations are defined but not referenced", so we'll stick with that. On balance it should be roughly equivalent to PEP 563 semantics, and perhaps a teeny-tiny bit faster.

With PEP 563 semantics, defining a function / class / module with annotations must build the annotations dict, then store it on the object. But all the keys and values are strings, so the bytecode isn't much--for functions, it's just a bunch of LOAD_CONSTs followed by a BUILD_CONST_KEY_MAP. For classes and modules it's a bit wordier, but if bytecode performance were important here, we could probably convert them to use BUILD_CONST_KEY_MAP too.

With my PEP, defining a function / class / module with annotations means you LOAD_CONST the code object, then store it on the object--and that's it. (Right now there's the whole __globals__ thing, but I expect to get rid of that eventually.) Of course, the code object isn't free; it has to be unmarshalled. But as code objects go, these are pretty simple ones: an annotations code object tends to have two custom bytestrings and one non-"small" int, and all its other attributes come for free.

"Stock" Python semantics are a bit slower than either, because they also evaluate all the annotations at the time the function / class / module is bound.

I'd love to hear real-world results from someone with a large annotated code base. Unfortunately, I'm pretty sure typing.py is broken in the prototype right now, so it's a bit early yet. (I honestly don't think it'll be that hard to get it working again; it was just one of a million things, and I didn't want to hold up releasing this stuff to the world any longer.)
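The PEP 563 definition-time behavior described above is easy to observe: compile a module with the future import and the annotations become plain string constants, with nothing evaluated at definition time. (The exact opcode sequence varies across CPython versions--BUILD_CONST_KEY_MAP appears in some, while newer releases constant-fold the strings further--so this sketch only asserts the stringification, not specific opcodes.)

```python
import dis

# Compile a small module under PEP 563 semantics ("from __future__
# import annotations") and inspect what gets stored.
src = (
    "from __future__ import annotations\n"
    "def f(a: int, b: str) -> None: ...\n"
)
ns = {}
exec(compile(src, "<demo>", "exec"), ns)

# The stored annotations are strings -- no names were evaluated
# when f was defined.
print(ns["f"].__annotations__)   # {'a': 'int', 'b': 'str', 'return': 'None'}

# The module bytecode shows the annotation strings being handled as
# ordinary constants at definition time.
dis.dis(compile(src, "<demo>", "exec"))
```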
The stringification process which your PEP describes as costly only happens during compilation of a .py file to .pyc. Since pip-installing pre-compiles modules for the user at installation time, there is very little runtime penalty for a fully annotated application.
I never intended to suggest that the stringification process /itself/ is costly at runtime--and I don't think I did. Because, as you point out, it isn't. PEP 563 semantics writ large are costly at runtime only when annotations are examined, because you have to call eval(), and calling eval() is expensive. If the PEP does say that stringification is itself expensive at runtime, please point it out, and I'll fix it.

Cheers,


//arry/
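The examination-time cost in question can be sketched directly: under PEP 563, each stored annotation string must be eval()'ed back into a real object, which is what typing.get_type_hints() does on the caller's behalf. Defining the function stays cheap; the eval() bill arrives only when the annotations are examined.

```python
import typing

# Define a function under PEP 563 semantics.
src = (
    "from __future__ import annotations\n"
    "def g(x: int) -> str: ...\n"
)
ns = {}
exec(src, ns)

# Cheap: the raw annotations are just strings.
print(ns["g"].__annotations__)            # {'x': 'int', 'return': 'str'}

# Costly step: get_type_hints() eval()s each string to recover
# the actual objects.
hints = typing.get_type_hints(ns["g"], globalns=ns)
print(hints)                              # {'x': <class 'int'>, 'return': <class 'str'>}
```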