Perhaps relevant for perspective: we did a review of the pyperformance benchmarks based on how noisy they are: https://github.com/faster-cpython/ideas/discussions/142 Note that pidigits is the noisiest -- its performance changes by up to 11% for no good reason. The regex benchmarks are also very noisy. On Mon, Jan 3, 2022 at 10:44 PM Gregory P. Smith <greg@krypto.org> wrote:
On Sun, Jan 2, 2022 at 2:37 AM Mark Dickinson <dickinsm@gmail.com> wrote:
On Sat, Jan 1, 2022 at 9:05 PM Antoine Pitrou <antoine@python.org> wrote:
Note that ARM is merely an architecture with very diverse implementations having quite differing performance characteristics. [...]
Understood. I'd be happy to see timings on a Raspberry Pi 3, say. I'm not too worried about things like the RPi Pico - that seems like it would be more of a target for MicroPython than CPython.
Wikipedia thinks, and the ARM architecture manuals seem to confirm, that most 32-bit ARM instruction sets _do_ support the UMULL 32-bit-by-32-bit-to-64-bit multiply instruction. (From https://en.wikipedia.org/wiki/ARM_architecture#Arithmetic_instructions: "ARM supports 32-bit × 32-bit multiplies with either a 32-bit result or 64-bit result, though Cortex-M0 / M0+ / M1 cores don't support 64-bit results.") Division may still be problematic.
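To make the connection to digit size concrete, here is a rough C sketch (mine, not CPython's actual code, though the inner loop in Objects/longobject.c has the same shape) of the single-digit multiply step: with 30-bit digits, the product of two digits needs a 64-bit intermediate ("twodigits"), which on 32-bit ARM is exactly the UMULL-style 32x32-to-64 multiply mentioned above. With 15-bit digits a 32-bit intermediate is enough, so no widening multiply is needed.

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative sketch only, assuming 30-bit digits.  CPython's real types
 * live in Include/longintrepr.h; with 15-bit digits, "digit" would be a
 * 16-bit type and "twodigits" a 32-bit type, so a plain 32-bit multiply
 * suffices and UMULL is never needed. */
typedef uint32_t digit;      /* holds one 30-bit digit */
typedef uint64_t twodigits;  /* holds a full digit*digit product plus carry */

#define PYLONG_SHIFT 30
#define PYLONG_MASK  ((digit)((1UL << PYLONG_SHIFT) - 1))

/* Multiply an n-digit number a (little-endian digits) by a single digit b,
 * writing n+1 digits into out.  Returns the final carry (== out[n]). */
static digit
mul1(digit *out, const digit *a, size_t n, digit b)
{
    twodigits carry = 0;
    for (size_t i = 0; i < n; i++) {
        carry += (twodigits)a[i] * b;       /* the 32x32 -> 64-bit multiply */
        out[i] = (digit)(carry & PYLONG_MASK);
        carry >>= PYLONG_SHIFT;
    }
    out[n] = (digit)carry;
    return (digit)carry;
}
```

On cores without a 64-bit multiply result (the Cortex-M0/M0+/M1 exception above), the compiler has to synthesize that product from several 32-bit multiplies, which is where a 30-bit-digit build could in principle pay a penalty; on anything Pi-class and up it is a single instruction.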
It's rather irrelevant anyway; the Pi Zero/One is the lowest-spec ARM that matters at all. Nobody is ever going to ship anything weaker than that that is still capable of running CPython.
Anyway, I ran actual benchmarks on a Pi 3. On 32-bit Raspbian I built CPython 3.10 with no configure flags and with --enable-big-digits (or however that's spelled) for 30-bit digits, and ran pyperformance 1.0.2 on both builds.
Caveat: this is not a good system to run benchmarks on. Performance is widely variable (it has a tiny heatsink which never meaningfully got hot), and the storage is a random microSD card. Each full pyperformance run took 6 hours. :P
The results basically say: no notable difference. Most benchmarks do not change, and the variability is quite high (look at the stddevs and how they overlap on the few things that produced a "significant" result at all). Things wholly unrelated to integers, such as the various regex benchmarks, showing up as faster demonstrates how unreliable the numbers are, and also how pointless it is to care about this fine a level of performance detail on this platform.
```
pi@pi3$ pyperf compare_to 15bit.json 30bit.json
2to3: Mean +- std dev: [15bit] 7.88 sec +- 0.39 sec -> [30bit] 8.02 sec +- 0.36 sec: 1.02x slower
crypto_pyaes: Mean +- std dev: [15bit] 3.22 sec +- 0.34 sec -> [30bit] 3.40 sec +- 0.22 sec: 1.06x slower
fannkuch: Mean +- std dev: [15bit] 13.4 sec +- 0.5 sec -> [30bit] 13.8 sec +- 0.5 sec: 1.03x slower
pickle_list: Mean +- std dev: [15bit] 74.7 us +- 22.1 us -> [30bit] 85.7 us +- 15.5 us: 1.15x slower
pyflate: Mean +- std dev: [15bit] 19.6 sec +- 0.6 sec -> [30bit] 19.9 sec +- 0.6 sec: 1.01x slower
regex_dna: Mean +- std dev: [15bit] 2.99 sec +- 0.24 sec -> [30bit] 2.81 sec +- 0.22 sec: 1.06x faster
regex_v8: Mean +- std dev: [15bit] 520 ms +- 71 ms -> [30bit] 442 ms +- 115 ms: 1.18x faster
scimark_monte_carlo: Mean +- std dev: [15bit] 3.31 sec +- 0.24 sec -> [30bit] 3.22 sec +- 0.24 sec: 1.03x faster
scimark_sor: Mean +- std dev: [15bit] 6.42 sec +- 0.34 sec -> [30bit] 6.27 sec +- 0.33 sec: 1.03x faster
spectral_norm: Mean +- std dev: [15bit] 4.85 sec +- 0.31 sec -> [30bit] 4.74 sec +- 0.20 sec: 1.02x faster
unpack_sequence: Mean +- std dev: [15bit] 1.42 us +- 0.42 us -> [30bit] 1.60 us +- 0.33 us: 1.13x slower

Benchmark hidden because not significant (47): chameleon, chaos, deltablue, django_template, dulwich_log, float, go, hexiom, json_dumps, json_loads, logging_format, logging_silent, logging_simple, mako, meteor_contest, nbody, nqueens, pathlib, pickle, pickle_dict, pickle_pure_python, pidigits, python_startup, python_startup_no_site, raytrace, regex_compile, regex_effbot, richards, scimark_fft, scimark_lu, scimark_sparse_mat_mult, sqlalchemy_declarative, sqlalchemy_imperative, sqlite_synth, sympy_expand, sympy_integrate, sympy_sum, sympy_str, telco, tornado_http, unpickle, unpickle_list, unpickle_pure_python, xml_etree_parse, xml_etree_iterparse, xml_etree_generate, xml_etree_process
```
Rerunning a few of those in --rigorous mode for more runs does not significantly improve the stddevs, so I'm not going to let that finish.
My recommendation: proceed with removing 15-bit bignum digit support. A 30-bit-only future with simpler, better code, here we come.
-gps
-- Mark
-- --Guido van Rossum (python.org/~guido) Pronouns: he/him (why is my pronoun here?) <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>