Benchmark results across all major Python implementations
I gave the opening keynote at PyCon CA and then gave the same talk at PyData NYC on the various interpreters of Python (Jupyter notebook of my presentation can be found at bit.ly/pycon-ca-keynote; no video yet). I figured people here might find the benchmark numbers interesting so I'm sharing the link here. I'm still hoping someday speed.python.org becomes a thing so that I never have to spend so much time benchmarking so many Python implementations again, and this sort of thing just becomes part of what we do to keep the implementation ecosystem healthy.
Hi Brett,

Any thoughts on improving the benchmark set? (I think all of {cpython,pypy,pyston} introduced new benchmarks to the set.) "speed.python.org" becoming a thing is generally stopped on "no one cares enough to set it up".

Cheers,
fijal

On Mon, Nov 16, 2015 at 9:18 PM, Brett Cannon <brett@python.org> wrote:
I gave the opening keynote at PyCon CA and then gave the same talk at PyData NYC on the various interpreters of Python (Jupyter notebook of my presentation can be found at bit.ly/pycon-ca-keynote; no video yet). I figured people here might find the benchmark numbers interesting so I'm sharing the link here.
I'm still hoping someday speed.python.org becomes a thing so that I never have to spend so much time benchmarking so many Python implementations again, and this sort of thing just becomes part of what we do to keep the implementation ecosystem healthy.
On Mon, 16 Nov 2015 21:23:49 +0100, Maciej Fijalkowski <fijall@gmail.com> wrote:
Any thoughts on improving the benchmark set? (I think all of {cpython,pypy,pyston} introduced new benchmarks to the set.) "speed.python.org" becoming a thing is generally stopped on "no one cares enough to set it up".
Actually, with some help from Intel, it is getting there. You can see the 'benchmarks' entry in the buildbot console at http://buildbot.python.org/all/console, but it isn't quite working yet. We are also waiting on a review of the salt state: https://github.com/python/psf-salt/pull/74 (all work done by Zach Ware.)

--David
On Mon, 16 Nov 2015 at 12:24 Maciej Fijalkowski <fijall@gmail.com> wrote:
Hi Brett
Any thoughts on improving the benchmark set (I think all of {cpython,pypy,pyston} introduced new benchmarks to the set).
We should probably start a mailing list and finally hash out a common set of benchmarks that we all agree are reasonable for measuring performance. I think we all generally agree that high-level benchmarks are good and micro-benchmarks aren't that important for cross-implementation comparisons (obviously they have their uses when trying to work on a specific feature set, but that should be considered specific to an implementation and not part of some globally accepted set of benchmarks). So we should identify benchmarks which somewhat represent real-world workloads and try to have a balanced representation that doesn't lean one way or another (e.g., not all string manipulation or all scientific computing, but both should obviously be represented). (A rough illustrative sketch of such a cross-implementation comparison appears below, after the quoted text.)

"speed.python.org" becoming a thing is generally stopped on "no one cares enough to set it up".
Oh, I know. I don't claim it's anything more than wishful thinking at this point, since I know I have enough on my plate to prevent me from making it happen myself. -Brett
Cheers, fijal
On Mon, Nov 16, 2015 at 9:18 PM, Brett Cannon <brett@python.org> wrote:
I gave the opening keynote at PyCon CA and then gave the same talk at PyData NYC on the various interpreters of Python (Jupyter notebook of my presentation can be found at bit.ly/pycon-ca-keynote; no video yet). I figured people here might find the benchmark numbers interesting so I'm sharing the link here.
I'm still hoping someday speed.python.org becomes a thing so that I never have to spend so much time benchmarking so many Python implementations again, and this sort of thing just becomes part of what we do to keep the implementation ecosystem healthy.
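(As a rough illustration of the kind of cross-implementation comparison discussed above, a minimal timing harness might look something like the sketch below. The interpreter names and the inline JSON workload are placeholders rather than part of any agreed benchmark set; a real comparison would use the shared benchmarks repository instead.)

    # Minimal sketch of a cross-interpreter timing harness (illustrative only).
    import subprocess
    import sys

    # Interpreters to compare; assumed to be available on PATH.
    INTERPRETERS = ["python2.7", "python3.5", "pypy"]

    # A tiny self-contained workload passed to each interpreter via -c.
    WORKLOAD = (
        "import time, json\n"
        "start = time.time()\n"
        "for _ in range(200):\n"
        "    json.dumps({str(i): i for i in range(1000)})\n"
        "print(time.time() - start)\n"
    )

    def time_interpreter(exe):
        # Run the workload under one interpreter and return elapsed seconds.
        out = subprocess.check_output([exe, "-c", WORKLOAD])
        return float(out.decode().strip())

    if __name__ == "__main__":
        for exe in INTERPRETERS:
            try:
                print("%-12s %.3f s" % (exe, time_interpreter(exe)))
            except OSError:
                sys.stderr.write("%-12s not found\n" % exe)

(Timing the workload inside the child process measures only the loop itself, not interpreter start-up; a proper suite also repeats runs and reports variance, which the benchmarks repository already handles.)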
On Monday, November 16, 2015, Brett Cannon <brett@python.org> wrote:
On Mon, 16 Nov 2015 at 12:24 Maciej Fijalkowski <fijall@gmail.com> wrote:
Hi Brett
Any thoughts on improving the benchmark set (I think all of {cpython,pypy,pyston} introduced new benchmarks to the set).
We should probably start a mailing list
There is/was a speed@python.org list.

"speed.python.org" becoming a thing is generally stopped on "no one cares enough to set it up".
Oh, I know. I don't claim it's anything more than wishful thinking at this point, since I know I have enough on my plate to prevent me from making it happen myself.
There was a grant given years ago to improve some of this stuff but I don't believe the work ever saw the light of day.
On Mon, Nov 16, 2015 at 3:38 PM, Brian Curtin <brian@python.org> wrote:
On Monday, November 16, 2015, Brett Cannon <brett@python.org> wrote:
Hi Brett
Any thoughts on improving the benchmark set (I think all of {cpython,pypy,pyston} introduced new benchmarks to the set).
We should probably start a mailing list
There is/was a speed@python.org list.
There is, but it seems to be set to moderate everything while having no active moderator. I sent an offer to speed-owner this morning to help with moderation, but as yet no response. I know I have a couple of messages waiting in the queue :)

On Mon, 16 Nov 2015 at 12:24 Maciej Fijalkowski <fijall@gmail.com> wrote:

"speed.python.org" becoming a thing is generally stopped on "no one cares enough to set it up".
Setup is done in my dev environment; it's now blocking on people more qualified than me to review and merge, then final tweaking of the buildbot setup. For those interested:

The part in most need of review is the changes to the PSF Salt configuration to set up and run Codespeed on speed.python.org. The changes can be found in PR #74 on the psf-salt GitHub repo [0].

The Codespeed instance is housed in a GitHub repo owned by me [1] currently. There's one small patch to the Codespeed code (which was merged upstream this morning); the rest of the changes in my fork are adapting a copy of the sample_project to be our own instance. I've been told that the grant proposal from long ago expected the use of codespeed2 instead of codespeed. I chose codespeed over codespeed2 because it appeared to be easier to get set up and running (which may or may not have actually been true), but also because I've not seen codespeed2 in action anywhere.

The final piece that could use review is the changes to our benchmark repository, currently available in a sandbox repo on hg.python.org [2]. My original plan had been to use PyPy's benchmark suite, but as you can tell from the logs of the existing buildslave, CPython doesn't run that suite especially well, and the CPython suite has the advantage of working with Python 2 and 3 out of the box.

Please have a look, give it a try if you can, and let me know what needs improvement!

[0] https://github.com/python/psf-salt/pull/74
[1] https://github.com/zware/codespeed
[2] https://hg.python.org/sandbox/zware-benchmarks

-- Zach
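(For anyone who wants to try the suite locally before the Codespeed instance is live, the invocation is roughly along the following lines. The interpreter paths are examples, and the exact perf.py option names may differ between versions of the suite.)

    hg clone https://hg.python.org/sandbox/zware-benchmarks benchmarks
    cd benchmarks
    # -b selects a benchmark group and -r requests the more rigorous (slower)
    # run; the two interpreters are the baseline and the one being compared.
    python perf.py -r -b default /path/to/baseline/python /path/to/other/python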
On 16/11/2015 22:23, Zachary Ware wrote:
On Mon, Nov 16, 2015 at 3:38 PM, Brian Curtin <brian@python.org> wrote:
On Monday, November 16, 2015, Brett Cannon <brett@python.org> wrote:
Hi Brett
Any thoughts on improving the benchmark set (I think all of {cpython,pypy,pyston} introduced new benchmarks to the set).
We should probably start a mailing list
There is/was a speed@python.org list.
There is, but it seems to be set to moderate everything while having no active moderator. I sent an offer to speed-owner this morning to help with moderation, but as yet no response. I know I have a couple of messages waiting in the queue :)
Just to help things along, I've added myself as list owner and released a bunch of messages. TJG
On Monday, November 16, 2015, Brett Cannon <brett@python.org> wrote:
On Mon, 16 Nov 2015 at 12:24 Maciej Fijalkowski <fijall@gmail.com> wrote:
Hi Brett
Any thoughts on improving the benchmark set (I think all of {cpython,pypy,pyston} introduced new benchmarks to the set).
We should probably start a mailing list
There is/was a speed@python.org list.

"speed.python.org" becoming a thing is generally stopped on "no one cares enough to set it up".
Oh, I know. I don't claim it's anything more than wishful thinking at this point, since I know I have enough on my plate to prevent me from making it happen myself.
There was a grant given years ago to improve some of this stuff but I don't believe the work ever saw the light of day.
Brett,

Very cool; I'm glad to see that Jython's performance was competitive under most of these benchmarks. I would also be interested in joining the proposed mailing list.

Re elementtree: I assume the benchmarking is usually done with cElementTree. However, Jython currently lacks a Java equivalent, so importing cElementTree just uses the pure Python version (see the import-fallback sketch after the quoted message below). Hence the significant performance difference of approx. 40x for etree_parse and 16x for etree_iterparse.

- Jim

On Mon, Nov 16, 2015 at 1:18 PM, Brett Cannon <brett@python.org> wrote:
I gave the opening keynote at PyCon CA and then gave the same talk at PyData NYC on the various interpreters of Python (Jupyter notebook of my presentation can be found at bit.ly/pycon-ca-keynote; no video yet). I figured people here might find the benchmark numbers interesting so I'm sharing the link here.
I'm still hoping someday speed.python.org becomes a thing so that I never have to spend so much time benchmarking so many Python implementations again, and this sort of thing just becomes part of what we do to keep the implementation ecosystem healthy.
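(As a side note on the cElementTree point above: the common pattern in 2.x-era code for getting the fastest available parser is the import fallback below. On CPython this picks up the C accelerator; on Jython, as Jim notes, cElementTree currently ends up backed by the pure Python implementation either way, hence the gap. A minimal sketch:)

    # Common 2.x-era pattern: prefer cElementTree, fall back to the pure
    # Python ElementTree if it is unavailable. On CPython this yields the
    # C-accelerated parser; on Jython it is currently pure Python either
    # way, since there is no Java accelerator yet.
    try:
        import xml.etree.cElementTree as ET
    except ImportError:
        import xml.etree.ElementTree as ET

    root = ET.fromstring("<root><child>text</child></root>")
    print(root.find("child").text)   # -> text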
Last June we started publishing a daily performance report of the latest Python tip against the previous day's run and some established synch point. We mail these to the community to act as a "canary in the coal mine." I wrote about it at https://01.org/lp/blog/0-day-challenge-what-pulse-internet

You can see our manager-style dashboard of a couple of key workloads at http://languagesperformance.intel.com/ (I have this running constantly on a dedicated screen in my office).

Would love to get better workloads if we can.

Dave

From: Python-Dev on behalf of Brett Cannon
Date: Monday, November 16, 2015 at 12:18 PM
To: python-dev
Subject: [Python-Dev] Benchmark results across all major Python implementations

I gave the opening keynote at PyCon CA and then gave the same talk at PyData NYC on the various interpreters of Python (Jupyter notebook of my presentation can be found at bit.ly/pycon-ca-keynote; no video yet). I figured people here might find the benchmark numbers interesting so I'm sharing the link here. I'm still hoping someday speed.python.org becomes a thing so that I never have to spend so much time benchmarking so many Python implementations again, and this sort of thing just becomes part of what we do to keep the implementation ecosystem healthy.
On Mon, 16 Nov 2015 23:37:06 +0000, "Stewart, David C" <david.c.stewart@intel.com> wrote:
Last June we started publishing a daily performance report of the latest Python tip against the previous day's run and some established synch point. We mail these to the community to act as a "canary in the coal mine." I wrote about it at https://01.org/lp/blog/0-day-challenge-what-pulse-internet
You can see our manager-style dashboard of a couple of key workloads at http://languagesperformance.intel.com/ (I have this running constantly on a dedicated screen in my office).
Just took a look at this. Pretty cool. The web page is a bit confusing, though. It doesn't give any clue as to what is being measured by the numbers presented...it isn't obvious whether those downward sloping lines represent progress or regression. Also, the intro talks about historical data, but other than the older dates[*] in the graph there's no access to it. Do you have plans to provide access to the raw data? It also doesn't show all of the tests shown in the example email in your blog post or the emails to python-checkins...do you plan to make those graphs available in the future as well?

Also, in the emails, what is the PGO column percentage relative to?

I suppose that for this to have maximum effect someone would have to specifically be paying attention to performance and figuring out why every (real) regression happened. I don't suppose we have anyone in the community currently who is taking on that role, though we certainly do have people who are *interested* in Python performance :)

--David

[*] Personally I'd find it easier to read those dates in MM-DD form, but I suppose that's a US quirk, since in the US when using slashes the month comes first...
+Stefan (owner of the 0-day lab)

On 11/17/15, 10:40 AM, "Python-Dev on behalf of R. David Murray" <python-dev-bounces+david.c.stewart=intel.com@python.org on behalf of rdmurray@bitdance.com> wrote:
On Mon, 16 Nov 2015 23:37:06 +0000, "Stewart, David C" <david.c.stewart@intel.com> wrote:
Last June we started publishing a daily performance report of the latest Python tip against the previous day's run and some established synch point. We mail these to the community to act as a "canary in the coal mine." I wrote about it at https://01.org/lp/blog/0-day-challenge-what-pulse-internet
You can see our manager-style dashboard of a couple of key workloads at http://languagesperformance.intel.com/ (I have this running constantly on a dedicated screen in my office).
Just took a look at this. Pretty cool. The web page is a bit confusing, though. It doesn't give any clue as to what is being measured by the numbers presented...it isn't obvious whether those downward sloping lines represent progress or regression. Also, the intro talks about historical data, but other than the older dates[*] in the graph there's no access to it. Do you have plans to provide access to the raw data? It also doesn't show all of the tests shown in the example email in your blog post or the emails to python-checkins...do you plan to make those graphs available in the future as well?
The data on this website has been normalized so "up" is "good" so far as the slope of the line goes. The daily email has a lot more detail about the hardware and software configuration and the versions being compared. We run workloads multiple times and visually show the relative standard deviation on the graph.

No plans to show the raw data.

I think showing multiple workloads graphically sounds useful; we should look into that.
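(A small illustration of what such normalization typically means, under the assumption that each result is scaled against a fixed reference run so that faster plots higher; the numbers and the reference are made up, not taken from the dashboard.)

    # Made-up example: scale each day's measurement against a fixed reference
    # run so that a faster (smaller) time plots as a higher value.
    reference_time = 2.0           # seconds, fixed baseline run
    daily_times = [2.1, 2.0, 1.9]  # seconds measured on successive days
    normalized = [reference_time / t for t in daily_times]
    print(normalized)   # -> roughly [0.95, 1.0, 1.05]; an upward slope is good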
Also, in the emails, what is the PGO column percentage relative to?
It's the performance boost on the current rev from just using PGO. Another way to think about it is: this is the performance that you leave on the table by *not* building CPython with PGO. For example, from last night's run, we would see an 18.54% boost in django_v2 by building Python using PGO.

Note: PGO is not the default way to build Python because it is relatively slow to compile it that way. (I think it should be the default.)

Here are the instructions for using it (thanks to Peter Wang for the instructions):

hg clone https://hg.python.org/cpython cpython
cd cpython
hg update 2.7
./configure
make profile-opt
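(To make the arithmetic concrete, with made-up numbers and assuming the column is simply the relative speedup of the PGO build over the default build:)

    # Hypothetical timings; the real report is produced by the lab's tooling.
    time_default = 1.186   # seconds per run on the default (non-PGO) build
    time_pgo = 1.000       # seconds per run on the PGO build
    boost = (time_default / time_pgo - 1.0) * 100
    print("PGO boost: %.2f%%" % boost)   # -> PGO boost: 18.60%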
I suppose that for this to have maximum effect someone would have to specifically be paying attention to performance and figuring out why every (real) regression happened. I don't suppose we have anyone in the community currently who is taking on that role, though we certainly do have people who are *interested* in Python performance :)
We're trying to fill that role as much as we can. When there is a significant (and unexplained) regression that we see, I usually ask our engineers to bisect it to identify the offending patch and root-cause it.
--David
[*] Personally I'd find it easier to read those dates in MM-DD form, but I suppose that's a US quirk, since in the US when using slashes the month comes first...
You and me both. As you surmised, the site was developed by our friends in Europe. :-)
Hi Python community,

Thank you for your feedback! We will look into this and come up with an e-mail format proposal in the following days.

Best regards,

--
Stefan A. POPA
Software Engineering Manager
System Technologies and Optimization Division
Software Services Group, Intel Romania
On 17 Nov 2015, at 21:22, Stewart, David C <david.c.stewart@intel.com> wrote:
+Stefan (owner of the 0-day lab)
On 11/17/15, 10:40 AM, "Python-Dev on behalf of R. David Murray" <python-dev-bounces+david.c.stewart=intel.com@python.org on behalf of rdmurray@bitdance.com> wrote:
On Mon, 16 Nov 2015 23:37:06 +0000, "Stewart, David C" <david.c.stewart@intel.com> wrote:

Last June we started publishing a daily performance report of the latest Python tip against the previous day's run and some established synch point. We mail these to the community to act as a "canary in the coal mine." I wrote about it at https://01.org/lp/blog/0-day-challenge-what-pulse-internet
You can see our manager-style dashboard of a couple of key workloads at http://languagesperformance.intel.com/ (I have this running constantly on a dedicated screen in my office).
Just took a look at this. Pretty cool. The web page is a bit confusing, though. It doesn't give any clue as to what is being measured by the numbers presented...it isn't obvious whether those downward sloping lines represent progress or regression. Also, the intro talks about historical data, but other than the older dates[*] in the graph there's no access to it. Do you have plans to provide access to the raw data? It also doesn't show all of the tests shown in the example email in your blog post or the emails to python-checkins...do you plan to make those graphs available in the future as well?
The data on this website has been normalized so "up" is "good" so far as the slope of the line goes. The daily email has a lot more detail about the hardware and software configuration and the versions being compared. We run workloads multiple times and visually show the relative standard deviation on the graph.
No plans to show the raw data.
I think showing multiple workloads graphically sounds useful, we should look into that.
Also, in the emails, what is the PGO column percentage relative to?
It's the performance boost on the current rev from just using PGO. Another way to think about it is: this is the performance that you leave on the table by *not* building CPython with PGO. For example, from last night's run, we would see an 18.54% boost in django_v2 by building Python using PGO.
Note: PGO is not the default way to build Python because it is relatively slow to compile it that way. (I think it should be the default).
Here are the instructions for using it (thanks to Peter Wang for the instructions):
hg clone https://hg.python.org/cpython cpython
cd cpython
hg update 2.7
./configure
make profile-opt
I suppose that for this to have maximum effect someone would have to specifically be paying attention to performance and figuring out why every (real) regression happened. I don't suppose we have anyone in the community currently who is taking on that role, though we certainly do have people who are *interested* in Python performance :)
We're trying to fill that role as much as we can. When there is a significant (and unexplained) regression that we see, I usually ask our engineers to bisect it to identify the offending patch and root-cause it.
--David
[*] Personally I'd find it easier to read those dates in MM-DD form, but I suppose that's a US quirk, since in the US when using slashes the month comes first...
You and me both. As you surmised, the site was developed by our friends in Europe. :-)
On 17 Nov 2015, at 21:22, Stewart, David C <david.c.stewart@intel.com> wrote:
On 11/17/15, 10:40 AM, "Python-Dev on behalf of R. David Murray" <python-dev-bounces+david.c.stewart=intel.com@python.org on behalf of rdmurray@bitdance.com> wrote:
I suppose that for this to have maximum effect someone would have to specifically be paying attention to performance and figuring out why every (real) regression happened. I don't suppose we have anyone in the community currently who is taking on that role, though we certainly do have people who are *interested* in Python performance :)
We're trying to fill that role as much as we can. When there is a significant (and unexplained) regression that we see, I usually ask our engineers to bisect it to identify the offending patch and root-cause it.
That's great news. --David
Stewart, David C writes:
Note: PGO is not the default way to build Python because it is relatively slow to compile it that way. (I think it should be the default).
+1

Slow-build-fast-run should be the default if you're sure the optimization works. Only developers are likely to run a given build few enough times to save seconds, and most people are likely to turn to some other task as soon as they type "make".

It's a slightly different use case, but in XEmacs we have a --quick-build configure option which means that the "usual targets" don't rebuild a bunch of auxiliary targets (mostly documentation and development infrastructure such as xref caches). Never heard a complaint about that either from the developers (who learned to use --quick-build easily enough) or the beta testers (who do remark on long build times, but only once a week or so for most of them).
participants (10)
- Brett Cannon
- Brian Curtin
- Jim Baker
- Maciej Fijalkowski
- Popa, Stefan A
- R. David Murray
- Stephen J. Turnbull
- Stewart, David C
- Tim Golden
- Zachary Ware