[Python-checkins] BAD Benchmark Results for Python Default 2016-01-26

Stewart, David C david.c.stewart at intel.com
Tue Jan 26 13:48:54 EST 2016


Wow, what happened to Python default to cause such a regression?

On 1/26/16, 7:31 AM, "lp_benchmark_robot" <lp_benchmark_robot at intel.com> wrote:

>Results for project Python default, build date 2016-01-26 03:07:40 +0000
>commit:		cbd4a6a2657e
>previous commit:	f700bc0412bc
>revision date:	2016-01-26 02:54:37 +0000
>environment:	Haswell-EP
>	cpu:		Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz 2x18 cores, stepping 2, LLC 45 MB
>	mem:		128 GB
>	os:		CentOS 7.1
>	kernel:	Linux 3.10.0-229.4.2.el7.x86_64
>
>Baseline results were generated using release v3.4.3, with hash b4cbecbc0781
>from 2015-02-25 12:15:33+00:00
>
>----------------------------------------------------------------------------------
>              benchmark   relative   change since   change since   current rev run
>                          std_dev*       last run       baseline          with PGO
>----------------------------------------------------------------------------------
>:-)           django_v2      0.21%         -2.93%          8.95%            16.19%
>:-|             pybench      0.10%          0.05%         -1.87%             5.40%
>:-(            regex_v8      2.72%         -0.02%         -4.67%             4.57%
>:-|               nbody      0.13%         -0.92%         -1.33%             7.40%
>:-|        json_dump_v2      0.20%          0.87%         -1.59%            11.48%
>:-|      normal_startup      0.90%         -0.57%          0.10%             5.35%
>----------------------------------------------------------------------------------
>* Relative Standard Deviation (Standard Deviation/Average)
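>
>For concreteness, a minimal sketch of that formula in Python (assuming the
>population standard deviation; the report does not say which estimator the
>robot uses):
>
>    import statistics
>
>    def relative_std_dev(samples):
>        # Relative Standard Deviation = standard deviation / average,
>        # reported in the table above as a percentage.
>        return statistics.pstdev(samples) / statistics.mean(samples) * 100.0
>
>    # Example: hypothetical timings (seconds) from repeated runs of one
>    # benchmark; a stable run yields a fraction of a percent.
>    print(relative_std_dev([2.31, 2.29, 2.30, 2.32]))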
>
>If this is not displayed properly, please visit our results page here: http://languagesperformance.intel.com/bad-benchmark-results-for-python-default-2016-01-26/
>
>Note: The raw benchmark results are measured in seconds (lower is better);
>the table above reports relative changes, with positive values indicating
>improvement.
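>
>As a rough sketch of how such a relative change can be derived from raw
>timings (the sign convention here is an inference from the table above,
>not something stated in this report):
>
>    def percent_change(previous_seconds, current_seconds):
>        # Positive result = the current run is faster, i.e. an improvement,
>        # matching the sign convention suggested by the table above.
>        return (previous_seconds - current_seconds) / previous_seconds * 100.0
>
>    # Example: a run that slows from 2.00 s to 2.06 s is a -3% change,
>    # comparable in size to django_v2's -2.93% since the last run.
>    print(percent_change(2.00, 2.06))  # approximately -3.0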
>
>Subject Label Legend:
>Labels reflect how each workload's performance changed relative to the
>previous measurement iteration (a sketch of this rule appears below):
>NEUTRAL: no workload changed by more than 1%
>GOOD: at least one workload improved by more than 1%, and none regressed by
>more than 1%
>BAD: at least one workload regressed by more than 1%, and none improved by
>more than 1%
>UGLY: at least one workload improved by more than 1% and at least one
>regressed by more than 1%
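>
>A minimal sketch of that labeling rule, assuming each workload's "change
>since last run" is a signed percentage with positive meaning improvement:
>
>    def subject_label(changes_percent):
>        # changes_percent: per-workload "change since last run" values.
>        improved = any(c > 1.0 for c in changes_percent)
>        regressed = any(c < -1.0 for c in changes_percent)
>        if improved and regressed:
>            return "UGLY"
>        if improved:
>            return "GOOD"
>        if regressed:
>            return "BAD"
>        return "NEUTRAL"
>
>    # The "change since last run" column above yields BAD: django_v2
>    # dropped 2.93% while nothing improved by more than 1%.
>    print(subject_label([-2.93, 0.05, -0.02, -0.92, 0.87, -0.57]))  # BAD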
>
>
>Our lab does a nightly source pull and build of the Python project and measures
>performance changes against the previous stable version and the previous nightly
>measurement. This is provided as a service to the community so that quality
>issues can be identified quickly on current hardware.