On Fri, 6 Dec 2019 13:54:13 +0000 Rhodri James <rhodri@kynesim.co.uk> wrote:
> Apologies again for commenting in the wrong place.
>
> On 05/12/2019 16:38, Mark Shannon wrote:
>> Memory access is usually a limiting factor in the performance of modern CPUs. Better packing of data structures enhances locality and reduces memory bandwidth, at a modest increase in ALU usage (for shifting and masking).
> I don't think this assertion holds much water:
> 1. Caching makes memory access much less of a limit than you would expect.
> 2. Non-aligned memory accesses vary from inefficient to impossible depending on the processor.
> 3. Shifting and masking isn't free, and again on some processors can be very expensive.
I think your knowledge is outdated. Shifts and masks are extremely fast on modern CPUs, and unaligned loads are fast as well (when served from the CPU cache). Moreover, modern CPUs are superscalar, with many different execution units, so those instructions can be executed in parallel with other independent instructions. However, as soon as you load from main memory because of a cache miss, you take a hit of several hundred cycles. Basically, computations are almost free compared to the cost of memory accesses.

In any case, this will have to be judged on benchmark numbers, once Mark (or someone else) massages the interpreter to experiment with those runtime memory footprint reductions.

Regards

Antoine.
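P.S. For concreteness, here is a minimal sketch in C of the "shifting and masking" pattern under discussion. This is not actual CPython code; the field names and widths are made up purely for illustration:

    #include <stdint.h>

    /* Hypothetical packed layout: three small fields stored in a single
       32-bit word instead of three separate machine words.  Field names
       and widths are invented for this example. */
    #define OPCODE_BITS   8
    #define OPARG_BITS   16
    #define FLAGS_BITS    8

    #define OPARG_SHIFT   (OPCODE_BITS)                /* 8  */
    #define FLAGS_SHIFT   (OPCODE_BITS + OPARG_BITS)   /* 24 */

    static inline uint32_t
    pack(uint32_t opcode, uint32_t oparg, uint32_t flags)
    {
        /* Mask each field to its width, then shift it into place. */
        return (opcode & 0xffu)
             | ((oparg & 0xffffu) << OPARG_SHIFT)
             | ((flags & 0xffu) << FLAGS_SHIFT);
    }

    static inline uint32_t
    unpack_oparg(uint32_t word)
    {
        /* One shift and one mask: a couple of single-cycle ALU ops that
           the CPU can overlap with other independent instructions. */
        return (word >> OPARG_SHIFT) & 0xffffu;
    }

The trade-off is a few extra ALU operations per access in exchange for touching fewer cache lines; whether that pays off for the interpreter is exactly what the benchmarks will have to show.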