Re: [Speed] intel_pstate C0 bug on isolated CPUs with the performance governor

On Fri, 23 Sep 2016 01:19:54 +0200 Victor Stinner <victor.stinner@gmail.com> wrote:
> While analyzing a performance regression (http://bugs.python.org/issue28243), I had a major issue with my benchmark. Suddenly, for no reason, after 30 minutes of benchmarking, the benchmark became 2x FASTER...
Did the benchmark really become 2x faster, or did the clock become 2x slower?
If you found a magic knob in your CPU that suddenly makes it 2x faster, many people would probably like to hear about it ;-)
> If you have an Intel CPU with multiple physical cores, use Linux, and have 15 minutes to run a test, I would appreciate it if you could try to reproduce the bug!
Can you tell us how to "reproduce"?
Regards
Antoine.

2016-09-23 11:23 GMT+02:00 Antoine Pitrou <solipsis@pitrou.net>:
> Did the benchmark really become 2x faster, or did the clock become 2x slower?
> If you found a magic knob in your CPU that suddenly makes it 2x faster, many people would probably like to hear about it ;-)
He he. It's a matter of point of view :-) When I first hit the issue last Friday, it looked as if my CPU had become 2x faster: https://bugzilla.redhat.com/show_bug.cgi?id=1378529#c1
I guess that, for some reason, the CPU frequency was stuck at 1.6 GHz (the minimum frequency) even though I had set the CPU frequency governor to performance. Then, for an unknown reason, the governor suddenly noticed that my CPU should run at 3.4 GHz, and so the benchmark "became faster".
In fact, the benchmark started at half speed (1.6 GHz), and suddenly ran at the "normal speed" (3.4 GHz).
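One way to check for this state (performance governor selected, but the clock sitting near the minimum) is to read the cpufreq files in sysfs. The following is a minimal Python sketch, not the method used in the bug report. The function names `read_cpufreq` and `looks_stuck` and the 5% tolerance are made up for illustration, and the value the kernel reports in `scaling_cur_freq` may not match what "cpupower monitor" shows.

```python
from pathlib import Path

# Standard sysfs location for per-CPU information on Linux.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def read_cpufreq(cpu, root=SYSFS_CPU):
    """Return the governor and frequencies (in kHz) of one logical CPU."""
    base = Path(root) / f"cpu{cpu}" / "cpufreq"
    return {
        "governor": (base / "scaling_governor").read_text().strip(),
        "cur_khz": int((base / "scaling_cur_freq").read_text()),
        "min_khz": int((base / "cpuinfo_min_freq").read_text()),
        "max_khz": int((base / "cpuinfo_max_freq").read_text()),
    }


def looks_stuck(info, tolerance=0.05):
    """Heuristic (an assumption, not a kernel rule): the performance
    governor is selected but the reported frequency is within
    `tolerance` of the hardware minimum."""
    near_min = info["cur_khz"] <= info["min_khz"] * (1 + tolerance)
    return info["governor"] == "performance" and near_min
```

For the setup described in this thread, one would call `read_cpufreq(3)` and `read_cpufreq(7)` while the benchmark is running and pass each result to `looks_stuck`.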
> Can you tell us how to "reproduce"?
https://bugzilla.redhat.com/show_bug.cgi?id=1378529
- Disable Turbo Boost
- Enable HyperThreading
- Isolate at least one physical CPU core (so two logical cores using HyperThreading) -- you can use "lscpu -a -e" to find a pair of logical CPUs of a physical core
- Enable NOHZ_FULL on isolated CPUs
- Use performance governor, at least for isolated CPUs, or better for all CPUs
- Run "cpupower monitor" in one terminal (cpupower comes from the package kernel-tools)
- Run a benchmark in a different terminal, but pin it to one isolated CPU using "taskset -c <CPU number>"
- Wait a few seconds
- Watch the C0 state of the isolated CPUs climb to 100%, even though no process is running on these CPUs (the system is idle and the CPU usage is 0% on these CPUs)
- Then run the benchmark again on an isolated CPU
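For the step about finding a pair of logical CPUs that share a physical core, the same information that "lscpu -a -e" prints can also be read from sysfs. This is a rough Python sketch. The helper name `sibling_groups` is hypothetical, and it assumes the usual `topology/thread_siblings_list` files are present.

```python
from pathlib import Path

# Standard sysfs location for per-CPU information on Linux.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def sibling_groups(root=SYSFS_CPU):
    """Return the distinct HyperThreading sibling lists, as the kernel
    formats them (for example "3,7" or "0-1"), one per physical core."""
    groups = set()
    # cpu[0-9]* matches cpu0, cpu1, ... but not cpufreq or cpuidle.
    pattern = "cpu[0-9]*/topology/thread_siblings_list"
    for path in Path(root).glob(pattern):
        groups.add(path.read_text().strip())
    return sorted(groups)
```

Each returned string names the logical CPUs of one physical core, so any entry listing two CPUs is a candidate pair to isolate.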
For example, I'm using CPUs 3 and 7. I interrupted the boot process (GRUB) to edit the Linux command line ("linuxefi ... vmlinuz ...") and add these parameters: "... isolcpus=3,7 nohz_full=3,7" (then boot with CTRL-x). Once Linux has booted, I run the isolcpus.py script attached to the bug report to set the governor to performance (it also masks interrupts on these CPUs).
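To confirm after boot that the parameters were really applied, the kernel command line can be inspected (it is exposed in /proc/cmdline). Here is a small Python sketch of the parsing. The function names are hypothetical, and it assumes the plain CPU-list form used above ("3,7", or ranges such as "2-3"), not any flag prefixes.

```python
def parse_cpu_list(spec):
    """Expand a kernel CPU list such as "3,7" or "2-3,6-7" into
    a sorted list of integers."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def isolation_params(cmdline):
    """Extract isolcpus= and nohz_full= from a kernel command line.
    A parameter that is absent maps to an empty list."""
    result = {"isolcpus": [], "nohz_full": []}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in result:
            result[key] = parse_cpu_list(value)
    return result
```

On a running system the input would typically come from `open("/proc/cmdline").read()`. With the parameters quoted above, both entries should come back as CPUs 3 and 7.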
I run the benchmark on CPU 7 to trigger the "C0 bug" and then I run the benchmark on CPU 3. Sometimes I have to run the benchmark on CPU 3 to trigger the bug.
Sometimes the benchmark becomes slower on CPU 3, sometimes on both CPUs, sometimes only on CPU 7...
The exact behaviour is not really deterministic.
For a longer explanation of how to reproduce the bug, with "snapshots" of programs and an example benchmark (perf timeit), see: https://bugzilla.redhat.com/show_bug.cgi?id=1378529#c0
I don't think that the benchmark matters: you only have to find a way to increase the CPU usage to 100% on one logical CPU and then stop the program so that the CPU usage drops back to 0%.
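Since any load should do, a stand-in for the benchmark can be as simple as a busy loop pinned to one logical CPU. This is a minimal Python sketch of that idea, not the benchmark from the bug report. The function name `burn_cpu` is made up, and the pinning uses `os.sched_setaffinity`, which is only available on Linux.

```python
import os
import time


def burn_cpu(seconds, cpu=None):
    """Spin at 100% CPU for `seconds`, then return so that usage drops
    back to 0%. If `cpu` is given, pin this process to that logical CPU
    first (similar in effect to "taskset -c <CPU number>")."""
    if cpu is not None:
        os.sched_setaffinity(0, {cpu})  # 0 means the current process
    deadline = time.monotonic() + seconds
    iterations = 0
    while time.monotonic() < deadline:
        iterations += 1
    return iterations
```

For the setup in this thread one might call `burn_cpu(10, cpu=7)`, watch "cpupower monitor" after the call returns, and then repeat with `cpu=3`.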
Victor