<div dir="ltr">On Thu, Sep 20, 2018 at 11:20 PM, Stefan Behnel <span dir="ltr">&lt;<a href="mailto:stefan_ml@behnel.de" target="_blank">stefan_ml@behnel.de</a>&gt;</span> wrote:<br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">What about the small integers cache? The C serialisation generates several<br>

PyLong objects that would normally reside in the cache. Is this handled<br>

somewhere? I guess the cache could entirely be loaded from the data<br>

segment. And the same would have to be done for interned strings. Basically<br>

anything that CPython only wants to have one instance of.<br></blockquote><div><br></div><div>Un-marshaled immutable objects are tracked in a table to ensure their uniqueness.  Thanks for mentioning the small integer cache.  It is not part of the change, but it could be brought under this framework.  By doing so, we could store the small integer objects in the data segment, and other data segment objects could then reference those unique instances.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">That would severely limit the application of this optimisation to external<br>

modules, though. I don't see a way how they could load their data<br>

structures from the data segment without duplicating all sorts of "singletons".</blockquote><div><br></div><div>Yes, additional load-time work would have to be done to ensure the uniqueness of those objects.</div></div></div></div>
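<div>To illustrate the kind of load-time uniqueness table discussed above, here is a minimal sketch in Python rather than C. The class name <code>UniqueTable</code> and its <code>canonical</code> method are hypothetical and are not taken from the change itself; the sketch only assumes that each loaded immutable object is looked up by type and value, and that the first instance seen is the one everyone else ends up referencing.</div>

```python
class UniqueTable:
    """Hypothetical table mapping (type, value) to one canonical object."""

    def __init__(self):
        self._table = {}

    def canonical(self, obj):
        # Key on the type as well as the value, so that equal values of
        # different types (for example 1 and 1.0) are kept apart.
        key = (type(obj), obj)
        # The first object registered under a key is returned for every
        # later equal object, so duplicates can be dropped by the caller.
        return self._table.setdefault(key, obj)


table = UniqueTable()

# Two equal strings built at run time are distinct objects before lookup.
first = "".join(["sp", "am"])
second = "".join(["sp", "am"])

a = table.canonical(first)
b = table.canonical(second)
print(a is b)
```

<div>An external module loading its own data segment would have to pass each of its "singleton" candidates through such a lookup, which is the additional load-time work mentioned above.</div>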