
Hi,

As you may have seen, AMD has recently announced CPUs with much larger L3 caches. Does anyone know of any work on researching or making critical Python code and data smaller, so that more of it fits in the CPU cache? I'm particularly interested in measured benefits.

This search:

https://www.google.com/search?q=python+performance+CPU+cache+size

turns up two relevant links:

https://www.oreilly.com/library/view/high-performance-python/9781449361747/c...
https://www.dlr.de/sc/Portaldata/15/Resources/dokumente/pyhpc2016/slides/PyH...

but not much else I found relevant.

AnandTech writes about the chips with triple the L3 cache:

https://www.anandtech.com/show/17323/amd-releases-milan-x-cpus-with-3d-vcach...

"As with other chips that incorporate larger caches, the greatest benefits are going to be found in workloads that spill out of contemporary-sized caches, but will neatly fit into the larger cache."

And also:

https://www.anandtech.com/show/17313/ryzen-7-5800x3d-launches-april-20th-plu...

"As detailed by the company back at CES 2022 and reiterated in today's announcement, AMD has found that the chip is 15% faster at gaming than their Ryzen 9 5900X."

I'm already aware that Non-Uniform Memory Access (NUMA) raises the difficult problem of cache coherence:

https://en.wikipedia.org/wiki/Non-uniform_memory_access
https://en.wikipedia.org/wiki/Cache_coherence

-- Jonathan
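
P.S. To make the "make data smaller" half of the question concrete: one easily measured Python-level lever is storing numeric data contiguously (e.g. `array.array`) instead of as lists of boxed int objects. A minimal sketch of measuring the difference, assuming CPython on a 64-bit build (the per-object byte counts in the comments are CPython implementation details, not language guarantees):

```python
import sys
from array import array

N = 100_000

# A list of Python ints: every element is a separate boxed object
# (28 bytes each for small ints on 64-bit CPython), plus an 8-byte
# pointer slot in the list itself -- scattered across the heap.
boxed = list(range(N))
boxed_bytes = sys.getsizeof(boxed) + sum(sys.getsizeof(i) for i in boxed)

# array.array('q') stores raw 8-byte machine integers in one
# contiguous buffer, so the same values take far less memory and
# sequential scans touch far fewer cache lines.
packed = array("q", range(N))
packed_bytes = sys.getsizeof(packed)

print(f"list of ints: {boxed_bytes:,} bytes")
print(f"array('q'):   {packed_bytes:,} bytes")
print(f"ratio:        {boxed_bytes / packed_bytes:.1f}x")
```

The same idea applies to objects: `__slots__` classes, NumPy structured arrays, etc. Whether the shrinkage translates into measured speedups on the big-L3 parts is exactly what I'd like to see numbers on.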