Re: [Speed] intel_pstate C0 bug on isolated CPUs with the performance governor

Hi,
I ran further tests and now understand the issue better.
In short, the intel_pstate driver doesn't support NOHZ_FULL, and so the frequency of CPUs using NOHZ_FULL depends on the workload of other CPUs. This is especially true when using the powersave (default) CPU frequency governor. At least, this is what I observed on my CPU, which does not have HWP.
intel_pstate updates the P-state of each CPU by writing to MSR 199H. The purpose of NOHZ_FULL is to avoid any interrupt, whereas intel_pstate relies on interrupts to sample performance, pick the right P-state and write it to the MSR. To write to the MSR of CPU 7, the kernel must run on CPU 7. If the benchmark is CPU-bound and never calls into the kernel, there is no opportunity to run the intel_pstate driver.
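To see the effect from user space, one option is to compare the current and maximum frequency of each NOHZ_FULL CPU while a benchmark runs. The following is only a minimal sketch: it assumes the usual Linux sysfs files (nohz_full, cpufreq/scaling_cur_freq, cpufreq/cpuinfo_max_freq) are present, and the helper names (parse_cpu_list, read_khz, report) are made up for this example.

```python
from pathlib import Path

# Assumption: standard sysfs location of the per-CPU directories.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def parse_cpu_list(text):
    """Parse a kernel CPU list such as '2-3,7' into [2, 3, 7]."""
    cpus = []
    for part in text.strip().split(","):
        # Some kernels print an empty line or '(null)' when the list is empty.
        if not part or part == "(null)":
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_khz(cpu, name, root=SYSFS_CPU):
    """Read one cpufreq value (in kHz) for the given CPU."""
    return int((root / f"cpu{cpu}" / "cpufreq" / name).read_text())


def report(root=SYSFS_CPU):
    """Return one line per NOHZ_FULL CPU: current vs. maximum frequency."""
    nohz_path = root / "nohz_full"
    nohz = parse_cpu_list(nohz_path.read_text()) if nohz_path.exists() else []
    lines = []
    for cpu in nohz:
        cur = read_khz(cpu, "scaling_cur_freq", root)
        top = read_khz(cpu, "cpuinfo_max_freq", root)
        lines.append(f"cpu{cpu}: {cur / 1e6:.2f} GHz (max {top / 1e6:.2f} GHz)")
    return lines
```

Calling report() a few times while the benchmark runs on an isolated CPU, with and without load on the other CPUs, should show whether the reported frequency follows the other CPUs. Note that scaling_cur_freq is what the kernel reports, which may itself be stale on a tickless CPU.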
Antoine:
Ah, well, I don't have HyperThreading on my CPU, sorry.
The bug can be reproduced without HyperThreading.
A new, much simpler scenario to reproduce the bug (with my analysis of it): https://bugzilla.redhat.com/show_bug.cgi?id=1378529#c6
2016-09-24 8:11 GMT+02:00 Armin Rigo <arigo@tunes.org>:
IMHO this is not a very good solution. With the CPU running at, say, a fifth of its nominal performance, you can't expect that it will behave in a remotely similar way.
The nominal speed is 3.4 GHz. The minimum speed is 1.6 GHz. Timings are only about twice as long at the minimum speed as at the nominal speed.
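As a rough check of that factor, assuming a purely CPU-bound benchmark whose run time is inversely proportional to the clock frequency (a simplification, and slowdown is just a name picked for this example):

```python
def slowdown(nominal_ghz, actual_ghz):
    """Expected timing ratio if run time scales as 1 / frequency."""
    return nominal_ghz / actual_ghz


# Frequencies quoted above: 3.4 GHz nominal, 1.6 GHz minimum.
print(f"{slowdown(3.4, 1.6):.3f}")  # prints 2.125
```

So the worst case here is a factor of about 2.1, not 5.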
As a result, it is easy to introduce changes to the CPython core that appear beneficial, but are actually detrimental, or vice-versa. For example, replacing some computation by lookups in a table may look like a good idea, when it is not.
Yeah, maybe, I don't know.
Anyway, there are two solutions to run stable benchmarks at nominal speed:
- (Use NOHZ_FULL but) Force frequency to the maximum
- Don't use NOHZ_FULL
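For the first option, one possible way to force the frequency is to raise the minimum frequency of a CPU to its maximum through sysfs. This is only a sketch under assumptions: it relies on the standard cpufreq files cpuinfo_max_freq and scaling_min_freq, it must run as root, pin_to_max is a hypothetical helper name, and I am not claiming that the driver applies the new limit immediately on a tickless CPU.

```python
from pathlib import Path

# Assumption: standard sysfs location of the per-CPU directories.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def pin_to_max(cpu, root=SYSFS_CPU):
    """Set the minimum frequency of one CPU to its maximum; return it in kHz."""
    cpufreq = root / f"cpu{cpu}" / "cpufreq"
    max_khz = (cpufreq / "cpuinfo_max_freq").read_text().strip()
    # Writing the maximum into scaling_min_freq asks cpufreq to keep the
    # CPU at (or above) that frequency. Needs root on a real system.
    (cpufreq / "scaling_min_freq").write_text(max_khz + "\n")
    return int(max_khz)
```

For example, pin_to_max(7) would be called before starting the benchmark on CPU 7, and the previous scaling_min_freq value should be saved beforehand if it needs to be restored afterwards.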
Victor