To add some detail about our approach, I've written up the internals of our shadow bytecode implementation and landed that in our Cinder repo here:
https://github.com/facebookincubator/cinder/blob/cinder/3.8/CinderDoc/shadowcode.rst
That might at least spark discussion or ideas about possible implementation details, or about things that could be different or more efficient in our implementation.
I've also had a version of it going against 3.10 for a while (internally we're still on 3.8), and I've updated it to a relatively recent merge of 3.11 main. I've pushed the latest version of that here:
https://github.com/DinoV/cpython/tree/shadowcode_rebase_2021_05_12
The 3.11 version obviously isn't as battle-tested as what we've been running in production for some time now, but it is pretty much the same. It is missing our improved global caching, which uses dictionary watches, though. It is also a rather large PR (almost 7k lines), but over a third of that is test cases.
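For anyone unfamiliar with the idea, global caching built on dictionary watches can be pictured roughly as below. This is only a toy Python model under my own assumptions, not the Cinder code (which is implemented in C inside the interpreter): `WatchedDict`, `GlobalCache`, and every method name here are hypothetical and exist purely for illustration.

```python
class WatchedDict(dict):
    """Toy dict that tells watchers when a key is rebound or deleted.

    Hypothetical stand-in for a globals dict with watch support. Only
    item assignment and deletion are intercepted; bulk methods such as
    update() bypass the hooks in this simplified model.
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._watchers = {}  # key -> list of zero-argument callbacks

    def watch(self, key, callback):
        """Register a one-shot callback fired on the next change to key."""
        self._watchers.setdefault(key, []).append(callback)

    def _notify(self, key):
        # Watchers are one-shot: remove them before calling them.
        for callback in self._watchers.pop(key, []):
            callback()

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._notify(key)

    def __delitem__(self, key):
        super().__delitem__(key)
        self._notify(key)


class GlobalCache:
    """Caches one name lookup until the watched dict reports a change."""

    def __init__(self, namespace, name):
        self.namespace = namespace
        self.name = name
        self.valid = False
        self.value = None
        self.misses = 0  # number of real dictionary lookups performed

    def _invalidate(self):
        self.valid = False

    def load(self):
        if not self.valid:
            # Slow path: do the real lookup, then ask to be told
            # when the binding changes.
            self.misses += 1
            self.value = self.namespace[self.name]
            self.namespace.watch(self.name, self._invalidate)
            self.valid = True
        # Fast path: no dictionary lookup at all.
        return self.value


ns = WatchedDict(x=1)
cache = GlobalCache(ns, "x")
cache.load()
cache.load()   # served from the cache
ns["x"] = 2    # fires the watcher and invalidates the cache
cache.load()   # performs a second real lookup
```

The point of the sketch is only that the cached value needs no per-load validity check against the dictionary: the dictionary itself pushes invalidations to whoever is watching.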
Also just to inform the discussion around potential performance benefits, here's how that alone is currently benchmarking versus the base commit:
cpython_310_opt_rig.json
========================
Performance version: 1.0.1
Report on Linux-5.2.9-229_fbk15_hardened_4185_g357f49b36602-x86_64-with-glibc2.28
Number of logical CPUs: 48
Start date: 2021-05-17 21:57:08.095822
End date: 2021-05-17 22:40:33.374232
cpython_ghdino_opt_rig.json
===========================
Performance version: 1.0.1
Report on Linux-5.2.9-229_fbk15_hardened_4185_g357f49b36602-x86_64-with-glibc2.28
Number of logical CPUs: 48
Start date: 2021-05-21 17:25:24.410644
End date: 2021-05-21 18:02:53.524314
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| Benchmark | cpython_310_opt_rig.json | cpython_ghdino_opt_rig.json | Change | Significance |
+=========================+==========================+=============================+==============+=======================+
| 2to3 | 498 ms | 459 ms | 1.09x faster | Significant (t=15.60) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| chameleon | 13.4 ms | 12.6 ms | 1.07x faster | Significant (t=11.10) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| chaos | 163 ms | 135 ms | 1.21x faster | Significant (t=33.07) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| crypto_pyaes | 171 ms | 147 ms | 1.16x faster | Significant (t=24.93) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| deltablue | 11.7 ms | 8.38 ms | 1.40x faster | Significant (t=70.51) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| django_template | 73.7 ms | 68.1 ms | 1.08x faster | Significant (t=13.12) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| dulwich_log | 108 ms | 98.6 ms | 1.10x faster | Significant (t=18.11) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| fannkuch | 734 ms | 731 ms | 1.00x faster | Not significant |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| float | 166 ms | 140 ms | 1.18x faster | Significant (t=29.38) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| go | 345 ms | 305 ms | 1.13x faster | Significant (t=31.29) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| hexiom | 14.4 ms | 13.1 ms | 1.10x faster | Significant (t=15.95) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| json_dumps | 19.6 ms | 18.1 ms | 1.09x faster | Significant (t=13.85) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| json_loads | 37.5 us | 34.8 us | 1.08x faster | Significant (t=16.23) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| logging_format | 14.5 us | 10.9 us | 1.33x faster | Significant (t=43.42) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| logging_silent | 274 ns | 238 ns | 1.15x faster | Significant (t=23.00) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| logging_simple | 13.4 us | 10.2 us | 1.31x faster | Significant (t=46.73) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| mako | 23.1 ms | 22.3 ms | 1.04x faster | Significant (t=5.78) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| meteor_contest | 151 ms | 152 ms | 1.01x slower | Not significant |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| nbody | 217 ms | 208 ms | 1.04x faster | Significant (t=6.52) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| nqueens | 153 ms | 145 ms | 1.06x faster | Significant (t=10.43) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| pathlib | 29.2 ms | 24.5 ms | 1.19x faster | Significant (t=27.86) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| pickle | 14.6 us | 14.6 us | 1.00x slower | Not significant |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| pickle_dict | 36.3 us | 35.4 us | 1.03x faster | Significant (t=6.24) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| pickle_list | 5.55 us | 5.44 us | 1.02x faster | Significant (t=3.42) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| pickle_pure_python | 708 us | 576 us | 1.23x faster | Significant (t=56.02) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| pidigits | 262 ms | 255 ms | 1.03x faster | Significant (t=6.37) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| pyflate | 1.02 sec | 919 ms | 1.11x faster | Significant (t=24.26) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| python_startup | 13.1 ms | 13.1 ms | 1.01x faster | Not significant |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| python_startup_no_site | 8.69 ms | 8.56 ms | 1.01x faster | Not significant |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| raytrace | 758 ms | 590 ms | 1.28x faster | Significant (t=62.09) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| regex_compile | 256 ms | 227 ms | 1.13x faster | Significant (t=29.88) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| regex_dna | 256 ms | 256 ms | 1.00x faster | Not significant |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| regex_effbot | 4.29 ms | 4.35 ms | 1.01x slower | Not significant |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| regex_v8 | 35.7 ms | 35.5 ms | 1.00x faster | Not significant |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| richards | 117 ms | 98.3 ms | 1.19x faster | Significant (t=31.70) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| scimark_fft | 559 ms | 573 ms | 1.02x slower | Significant (t=-6.02) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| scimark_lu | 254 ms | 249 ms | 1.02x faster | Not significant |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| scimark_monte_carlo | 162 ms | 126 ms | 1.29x faster | Significant (t=41.31) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| scimark_sor | 305 ms | 281 ms | 1.09x faster | Significant (t=19.82) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| scimark_sparse_mat_mult | 7.51 ms | 7.59 ms | 1.01x slower | Not significant |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| spectral_norm | 218 ms | 220 ms | 1.01x slower | Not significant |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| telco | 9.65 ms | 9.56 ms | 1.01x faster | Not significant |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| unpack_sequence | 82.4 ns | 75.5 ns | 1.09x faster | Significant (t=15.12) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| unpickle | 21.0 us | 19.9 us | 1.05x faster | Significant (t=8.02) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| unpickle_list | 6.49 us | 6.76 us | 1.04x slower | Significant (t=-7.46) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| unpickle_pure_python | 494 us | 419 us | 1.18x faster | Significant (t=26.60) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| xml_etree_generate | 144 ms | 140 ms | 1.03x faster | Significant (t=3.75) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| xml_etree_iterparse | 167 ms | 159 ms | 1.04x faster | Significant (t=7.17) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| xml_etree_parse | 212 ms | 209 ms | 1.02x faster | Not significant |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
| xml_etree_process | 114 ms | 102 ms | 1.11x faster | Significant (t=16.92) |
+-------------------------+--------------------------+-----------------------------+--------------+-----------------------+
Skipped 5 benchmarks only in cpython_310_opt_rig.json: sympy_expand, sympy_integrate, sympy_str, sympy_sum, tornado_http
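As a reading aid for the "Change" column: as far as I understand it, the figure is the ratio of the two mean timings, labelled "faster" or "slower" depending on direction. The helper below is a hypothetical sketch of that arithmetic, not the tool's actual code, and `describe_change` is a made-up name.

```python
def describe_change(base, changed):
    """Format a timing change as 'N.NNx faster' or 'N.NNx slower'.

    Both arguments are timings in the same unit; lower is better.
    """
    if changed <= base:
        return f"{base / changed:.2f}x faster"
    return f"{changed / base:.2f}x slower"


print(describe_change(11.7, 8.38))  # deltablue row
print(describe_change(758, 590))    # raytrace row
print(describe_change(6.49, 6.76))  # unpickle_list row
```

Because the table prints rounded means, recomputing a ratio from the displayed values can differ from the reported figure in the last digit for some rows.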
And here are the memory benchmarks, which are almost entirely non-significant:
cpython_310_mem.json
====================
Performance version: 1.0.1
Report on Linux-5.2.9-229_fbk15_hardened_4185_g357f49b36602-x86_64-with-glibc2.28
Number of logical CPUs: 48
Start date: 2021-05-18 13:09:32.100009
End date: 2021-05-18 13:46:54.655953
cpython_ghdino_mem.json
=======================
Performance version: 1.0.1
Report on Linux-5.2.9-229_fbk15_hardened_4185_g357f49b36602-x86_64-with-glibc2.28
Number of logical CPUs: 48
Start date: 2021-05-19 17:17:30.891269
End date: 2021-05-20 10:44:09.117795
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| Benchmark | cpython_310_mem.json | cpython_ghdino_mem.json | Change | Significance |
+=========================+======================+=========================+===============+=======================+
| 2to3 | 21.2 MB | 21.6 MB | 1.02x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| chameleon | 16.5 MB | 16.5 MB | 1.00x smaller | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| chaos | 8303.8 kB | 8170.0 kB | 1.02x smaller | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| crypto_pyaes | 7630.8 kB | 7549.6 kB | 1.01x smaller | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| deltablue | 9620.0 kB | 9839.4 kB | 1.02x larger | Significant (t=-8.20) |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| django_template | 22.3 MB | 22.6 MB | 1.01x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| dulwich_log | 11.6 MB | 11.7 MB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| fannkuch | 7174.6 kB | 7195.0 kB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| float | 16.7 MB | 18.3 MB | 1.10x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| go | 9132.4 kB | 9170.4 kB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| hexiom | 8311.8 kB | 8372.6 kB | 1.01x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| json_dumps | 9406.6 kB | 9413.0 kB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| json_loads | 7444.0 kB | 7453.0 kB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| logging_format | 11.0 MB | 10.1 MB | 1.08x smaller | Significant (t=17.51) |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| logging_silent | 7651.0 kB | 7706.2 kB | 1.01x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| logging_simple | 10.3 MB | 10.4 MB | 1.01x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| mako | 13.7 MB | 13.9 MB | 1.02x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| meteor_contest | 9474.6 kB | 9512.0 kB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| nbody | 7365.4 kB | 7461.4 kB | 1.01x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| nqueens | 7471.0 kB | 7487.4 kB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| pathlib | 8682.4 kB | 8732.0 kB | 1.01x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| pickle | 7935.2 kB | 7942.8 kB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| pickle_dict | 7930.6 kB | 7933.2 kB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| pickle_list | 7934.2 kB | 7956.6 kB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| pickle_pure_python | 7962.4 kB | 7971.2 kB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| pidigits | 7396.4 kB | 7435.0 kB | 1.01x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| pyflate | 36.9 MB | 37.2 MB | 1.01x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| python_startup | 9499.6 kB | 9624.0 kB | 1.01x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| python_startup_no_site | 9479.6 kB | 9630.8 kB | 1.02x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| raytrace | 8239.0 kB | 8273.0 kB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| regex_compile | 8602.2 kB | 8662.6 kB | 1.01x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| regex_dna | 15.0 MB | 15.1 MB | 1.01x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| regex_effbot | 8054.6 kB | 8094.8 kB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| regex_v8 | 13.0 MB | 13.0 MB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| richards | 7837.2 kB | 7841.2 kB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| scimark_fft | 8037.0 kB | 8118.8 kB | 1.01x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| scimark_lu | 8059.2 kB | 8107.2 kB | 1.01x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| scimark_monte_carlo | 7968.2 kB | 8020.2 kB | 1.01x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| scimark_sor | 7995.0 kB | 8065.0 kB | 1.01x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| scimark_sparse_mat_mult | 8512.2 kB | 8549.4 kB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| spectral_norm | 7184.4 kB | 7217.8 kB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| telco | 7857.2 kB | 7672.2 kB | 1.02x smaller | Significant (t=38.26) |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| unpack_sequence | 8809.6 kB | 8835.8 kB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| unpickle | 7943.4 kB | 7965.8 kB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| unpickle_list | 7948.6 kB | 7925.6 kB | 1.00x smaller | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| unpickle_pure_python | 7922.0 kB | 7955.8 kB | 1.00x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| xml_etree_generate | 11.5 MB | 11.7 MB | 1.02x larger | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| xml_etree_iterparse | 12.1 MB | 12.0 MB | 1.01x smaller | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| xml_etree_parse | 11.6 MB | 11.5 MB | 1.01x smaller | Not significant |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+
| xml_etree_process | 12.1 MB | 12.5 MB | 1.03x larger | Significant (t=-3.04) |
+-------------------------+----------------------+-------------------------+---------------+-----------------------+