[pypy-dev] [GSoC] Developing a benchmark suite (for Python 3.x)

DasIch dasdasich at googlemail.com
Wed Apr 6 18:52:24 CEST 2011


Hello Guys,
I would like to present my proposal for the Google Summer of Code,
concerning the idea of porting the benchmarks to Python 3.x for
speed.pypy.org. I think I have successfully integrated the feedback I
got from prior discussions on the topic and I would like to hear your
opinion.

Abstract
========

As of now there are several benchmark suites used by Python
implementations. PyPy[1] uses the benchmarks developed for the Unladen
Swallow[2] project as well as several other benchmarks the PyPy
developers implemented on their own. CPython[3] uses the Unladen
Swallow benchmarks and several "crap benchmarks used for historical
reasons"[4].

This makes comparisons unnecessarily hard and causes confusion. As a
solution to this problem I propose merging the existing benchmarks -
at least those considered worth having - into a single benchmark suite
which can be shared by all implementations and ported to Python 3.x.

Milestones
==========

The project can be divided into several milestones:

1. Definition of the benchmark suite. This will entail contacting
developers of Python implementations (CPython, PyPy, IronPython and
Jython), via discussion on the appropriate mailing lists. This might
be achievable as part of this proposal.

2. Implementing the benchmark suite. Based on the previously agreed-upon
definition, the suite will be implemented, which means that the
benchmarks will be merged into a single Mercurial repository on
Bitbucket[5].

3. Porting the suite to Python 3.x. The suite will be ported to 3.x
using 2to3[6], as far as possible. The usage of 2to3 will make it
easier to make changes to the repository, especially for those still
focusing on 2.x. It is to be expected that some benchmarks cannot be
ported due to dependencies which are not available on Python 3.x.
Those will be ignored by this project to be ported at a later time,
when the necessary requirements are met.
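
To make milestones 2 and 3 a little more concrete, here is a minimal
sketch of how a shared suite might register benchmarks and skip those
whose dependencies are not importable on the running interpreter. All
names in it (``BENCHMARKS``, ``benchmark``, ``missing_dependencies``,
``run_all`` and the two example benchmarks) are hypothetical and only
serve as illustration; the actual layout is still to be agreed upon in
milestone 1.

```python
import importlib
import timeit

# Hypothetical registry: benchmark name -> (callable, required module names).
BENCHMARKS = {}


def benchmark(name, requires=()):
    """Register a function as a benchmark under the given name."""
    def decorator(func):
        BENCHMARKS[name] = (func, tuple(requires))
        return func
    return decorator


def missing_dependencies(requires):
    """Return the required modules that cannot be imported here."""
    missing = []
    for module_name in requires:
        try:
            importlib.import_module(module_name)
        except ImportError:
            missing.append(module_name)
    return missing


def run_all(repeat=3):
    """Time every registered benchmark; skip those with missing dependencies."""
    results = {}
    for name, (func, requires) in sorted(BENCHMARKS.items()):
        missing = missing_dependencies(requires)
        if missing:
            results[name] = "skipped (missing: %s)" % ", ".join(missing)
            continue
        # Best of `repeat` single runs, in seconds.
        results[name] = min(timeit.repeat(func, number=1, repeat=repeat))
    return results


@benchmark("fib")
def bench_fib():
    def fib(n):
        return n if n < 2 else fib(n - 1) + fib(n - 2)
    fib(15)


# Stands in for a benchmark whose dependency is not yet available on 3.x;
# the module name is made up so that the import fails.
@benchmark("needs_missing_lib", requires=["not_a_real_module_xyz"])
def bench_unportable():
    pass


if __name__ == "__main__":
    for name, result in run_all().items():
        print(name, result)
```

Keeping the dependency list next to each benchmark would allow the
same repository to be run on every implementation, with unportable
benchmarks reported as skipped instead of failing the whole run.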

Start of Program (May 24)
=========================

Before the coding (milestones 2 and 3) can begin, it is necessary to
agree upon a set of benchmarks everyone is happy with, as described
in milestone 1.

Midterm Evaluation (July 12)
============================

In the period up to the midterm evaluation I want to finish the second
milestone and start on the third.

Final Evaluation (Aug 16)
=========================

In this period the benchmark suite will be ported. If everything works
out perfectly I will even have some time left; if there are problems,
this period serves as a buffer.

Probably Asked Questions
========================

Why not use one of the existing benchmark suites for porting?

The effort will be wasted if there is no good base to build upon.
Creating a new benchmark suite based upon the existing ones ensures
that such a base exists.

Why not use Git/Bazaar/...?

Mercurial is used by CPython and PyPy and is fairly well known and used
in the Python community. This ensures easy accessibility for everyone.

What will happen with the Repository after GSoC/How will access to the
repository be handled?

I propose to give administrative rights to one or two representatives
of each project. Those will provide other developers with write
access.

Communication
=============

Progress will be communicated via Twitter[7] and my blog[8]. If
desired, I can also send an email with the contents of each blog post
to the mailing lists of the implementations. Furthermore, I am usually
quick to answer via IRC (DasIch on freenode), Twitter or
e-mail (dasdasich at gmail.com) if anyone has any questions.

Contact with the mentor can be established via the means mentioned
above or via Skype.

About Me
========

My name is Daniel Neuhäuser, I am 19 years old and currently a student
at the Bergstadt-Gymnasium Lüdenscheid[9]. I started programming (with
Python) about 4 years ago and became a member of the Pocoo Team[10]
after successfully participating in the Google Summer of Code last
year. During that project I ported Sphinx[11] to Python 3.x and
implemented an algorithm to diff abstract syntax trees to preserve
comments and translated strings, which has been used by the other GSoC
projects targeting Sphinx.

.. [1]: https://bitbucket.org/pypy/benchmarks/src
.. [2]: http://code.google.com/p/unladen-swallow/
.. [3]: http://hg.python.org/benchmarks/file/tip/performance
.. [4]: http://hg.python.org/benchmarks/file/62e754c57a7f/performance/README
.. [5]: http://bitbucket.org/
.. [6]: http://docs.python.org/library/2to3.html
.. [7]: http://twitter.com/#!/DasIch
.. [8]: http://dasdasich.blogspot.com/
.. [9]: http://bergstadt-gymnasium.de/
.. [10]: http://www.pocoo.org/team/#daniel-neuhauser
.. [11]: http://sphinx.pocoo.org/

P.S.: I would like to get in touch with the IronPython developers as
well. Unfortunately I was not able to find a mailing list or IRC
channel. Is there anybody who can point me in the right direction?


