Hi Greg,

You wrote:

> On 19/06/20 9:28 am, Steven D'Aprano wrote:
>> I know very little about how this works except a vague rule of thumb
>> that in the 21st century memory locality is king. If you want code to
>> be fast, keep it close together, not spread out.
>
> Python objects are already scattered all over memory, and a function
> already consists of several objects -- the function object itself, a
> dict, a code object, lists of argument and local variable names, etc.
> I doubt whether the suggested change would make locality noticeably
> worse.
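Greg's point is easy to check from the interpreter. A minimal sketch
(the function here is just illustrative):

```python
def f(a, b=1):
    x = a + b
    return x

# A "function" is really several separate heap objects:
print(type(f))                   # the function object itself
print(type(f.__code__))          # its code object
print(f.__code__.co_varnames)    # argument and local names: ('a', 'b', 'x')
print(f.__defaults__)            # default argument values: (1,)
print(f.__code__ is not f)       # allocated separately: True
```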
I like this response. I'd also add a link to
https://en.wikipedia.org/wiki/Non-uniform_memory_access
NUMA arises because fast memory is expensive: CPU registers are the
quickest of all, spinning hard disks (and then tape) sit at the slow
end, and RAM and SSDs fall somewhere in between.
In addition, modern CPUs are multicore, each core with its own
registers and caches. Optimising for such systems is difficult,
particularly as what's best depends on the data being processed.
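To illustrate the cache point, here's a rough sketch: walking the same
indices in sequential versus shuffled order touches exactly the same
data, but the shuffled walk is typically slower on large arrays because
it defeats the caches and prefetcher. (Interpreter overhead masks much
of the effect in pure Python, so the numbers are only indicative.)

```python
import random
import timeit

N = 1_000_000
data = list(range(N))
seq_idx = list(range(0, N, 8))   # predictable, cache-friendly walk
rand_idx = seq_idx[:]
random.shuffle(rand_idx)         # same indices, cache-hostile order

def walk(indices):
    total = 0
    for i in indices:
        total += data[i]
    return total

print(timeit.timeit(lambda: walk(seq_idx), number=5))
print(timeit.timeit(lambda: walk(rand_idx), number=5))  # usually larger
```

Both walks compute the same sum; only the access order differs.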
One part of testing would be to provide guidance on the situations in
which permanent code objects would bring some benefit. Even with UMA
(no caches etc), there could be a benefit from reduced memory use,
since we would no longer keep two almost identical copies of the same
object.
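To make the "two almost identical copies" point concrete: compiling the
same source twice today yields two distinct code objects with identical
bytecode, which is exactly the duplication a permanent, shared code
object could remove. A small sketch:

```python
# Compile the same one-line function twice, as separate compilations.
f = eval("lambda x: x + 1")
g = eval("lambda x: x + 1")

print(f.__code__ is g.__code__)                  # False: two copies today
print(f.__code__.co_code == g.__code__.co_code)  # True: same bytecode
```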
An aside: with pure UMA there is no benefit to locality. I'd say the
real gains come from getting as much of the busy code as you can into
the CPU caches.
Unix introduced shared objects, and on Linux C-coded extensions are
provided as shared objects. For example:
>>> import lxml.etree as etree
>>> etree
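For reference, the filename suffixes the import system accepts for such
compiled extension modules can be listed directly (the exact tags vary
by interpreter version and platform; the `.so` ending is specific to
Unix-like systems):

```python
import importlib.machinery

# Suffixes used when searching for compiled extension modules.
print(importlib.machinery.EXTENSION_SUFFIXES)
# e.g. ['.cpython-312-x86_64-linux-gnu.so', '.abi3.so', '.so'] on Linux
```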