[GSoC] Developing a benchmark suite (for Python 3.x)
Hello Guys,

I would like to present my proposal for the Google Summer of Code, concerning the idea of porting the benchmarks to Python 3.x for speed.pypy.org. I think I have successfully integrated the feedback I got from prior discussions on the topic, and I would like to hear your opinion.

Abstract
========

As of now there are several benchmark suites used by Python implementations. PyPy[1] uses the benchmarks developed for the Unladen Swallow[2] project as well as several other benchmarks they implemented on their own. CPython[3] uses the Unladen Swallow benchmarks and several "crap benchmarks used for historical reasons"[4].

This makes comparisons unnecessarily hard and causes confusion. As a solution to this problem I propose merging the existing benchmarks, at least those considered worth having, into a single benchmark suite which can be shared by all implementations and ported to Python 3.x.

Milestones
==========

The project can be divided into several milestones:

1. Definition of the benchmark suite. This will entail contacting developers of Python implementations (CPython, PyPy, IronPython and Jython) via discussion on the appropriate mailing lists. This might be achievable as part of this proposal.

2. Implementing the benchmark suite. Based on the previously agreed-upon definition, the suite will be implemented, which means that the benchmarks will be merged into a single Mercurial repository on Bitbucket[5].

3. Porting the suite to Python 3.x. The suite will be ported to 3.x using 2to3[6], as far as possible. Using 2to3 will make it easier to make changes to the repository, especially for those still focusing on 2.x. It is to be expected that some benchmarks cannot be ported due to dependencies which are not available on Python 3.x. Those will be left out of this project and ported at a later time, when the necessary requirements are met.

Start of Program (May 24)
=========================

Before the coding (milestones 2 and 3) can begin, it is necessary to agree upon a set of benchmarks that everyone is happy with, as described in milestone 1.

Midterm Evaluation (July 12)
============================

In this period I want to finish the second milestone, and I want to start on the third before the evaluation.

Final Evaluation (Aug 16)
=========================

In this period the benchmark suite will be ported. If everything works out perfectly I will even have some time left; if there are problems, I have a buffer here.

Probably Asked Questions
========================

Why not use one of the existing benchmark suites for porting?

The effort will be wasted if there is no good base to build upon. Creating a new benchmark suite based upon the existing ones ensures that there is one.

Why not use Git/Bazaar/...?

Mercurial is used by CPython and PyPy, and is fairly well known and used in the Python community. This ensures easy accessibility for everyone.

What will happen with the repository after GSoC? How will access to the repository be handled?

I propose to give administrative rights to one or two representatives of each project. They will provide other developers with write access.

Communication
=============

I will report progress via Twitter[7] and my blog[8]. If desired, I can also send an email with the contents of each blog post to the mailing lists of the implementations. Furthermore, I am usually quick to answer via IRC (DasIch on freenode), Twitter or e-mail (dasdasich@gmail.com) if anyone has any questions. The mentor can reach me via the means mentioned above or via Skype.

About Me
========

My name is Daniel Neuhäuser. I am 19 years old and currently a student at the Bergstadt-Gymnasium Lüdenscheid[9]. I started programming (with Python) about 4 years ago and became a member of the Pocoo Team[10] after successfully participating in the Google Summer of Code last year. During that project I ported Sphinx[11] to Python 3.x and implemented an algorithm to diff abstract syntax trees in order to preserve comments and translated strings, which has been used by the other GSoC projects targeting Sphinx.

.. [1]: https://bitbucket.org/pypy/benchmarks/src
.. [2]: http://code.google.com/p/unladen-swallow/
.. [3]: http://hg.python.org/benchmarks/file/tip/performance
.. [4]: http://hg.python.org/benchmarks/file/62e754c57a7f/performance/README
.. [5]: http://bitbucket.org/
.. [6]: http://docs.python.org/library/2to3.html
.. [7]: http://twitter.com/#!/DasIch
.. [8]: http://dasdasich.blogspot.com/
.. [9]: http://bergstadt-gymnasium.de/
.. [10]: http://www.pocoo.org/team/#daniel-neuhauser
.. [11]: http://sphinx.pocoo.org/

P.S.: I would like to get in touch with the IronPython developers as well. Unfortunately I was not able to find a mailing list or IRC channel. Is there anybody who can point me in the right direction?

I talked to Fijal about my project last night. The result is that the project as it stands is not that interesting, because the means to execute the benchmarks on multiple interpreters are currently missing. Another point we talked about was that porting the benchmarks would not be very useful, as the interesting ones all have dependencies which have not (yet) been ported to Python 3.x.

The first point, execution on multiple interpreters, has to be solved or this project is pretty much pointless, so I have changed my proposal to include just that. The proposal still includes porting the benchmarks, but this is planned to happen after the development of an application able to run the benchmarks on multiple interpreters. The reason is that even though the portable benchmarks might not prove to be that interesting, the basic infrastructure for porting with 2to3 would be in place, making it easier to port benchmarks in the future as their dependencies become available under Python 3.x. Because the application has the higher priority, it is unlikely that anything but the porting will suffer should I not be able to complete all my goals, and the project would still produce useful results during the GSoC.

Anyway, here is the current, updated proposal:

Abstract
========

As of now there are several benchmark suites used by Python implementations. PyPy uses the benchmarks[1] developed for the Unladen Swallow[2] project as well as several other benchmarks they implemented on their own. CPython[3] uses the Unladen Swallow benchmarks and several "crap benchmarks used for historical reasons"[4].

This makes comparisons unnecessarily hard and causes confusion. As a solution to this problem I propose merging the existing benchmarks, at least those considered worth having, into a single benchmark suite which can be shared by all implementations and ported to Python 3.x.

Another problem, reported by Maciej Fijalkowski, is that the way benchmarks are currently executed by PyPy is more or less a hack. Work will have to be done to allow execution of the benchmarks on different interpreters and on their most recent versions (from their respective repositories). The application for this should also be able to upload the results to a codespeed instance such as http://speed.pypy.org.

Milestones
==========

The project can be divided into several milestones:

1. Definition of the benchmark suite. This will entail contacting developers of Python implementations (CPython, PyPy, IronPython and Jython) via discussion on the appropriate mailing lists. This might be achievable as part of this proposal.

2. Merging the benchmarks. Based on the previously agreed-upon definition, the benchmarks will be merged into a single suite.

3. Implementing a system to run the benchmarks. In order to execute the benchmarks it will be necessary to have a configurable application which downloads the interpreters from their repositories, builds them and executes the benchmarks with them.

4. Porting the suite to Python 3.x. The suite will be ported to 3.x using 2to3[5], as far as possible. Using 2to3 will make it easier to make changes to the repository, especially for those still focusing on 2.x. It is to be expected that some benchmarks cannot be ported due to dependencies which are not available on Python 3.x. Those will be left out of this project and ported at a later time, when the necessary requirements are met.

Start of Program (May 24)
=========================

Before the coding (milestones 2 to 4) can begin, it is necessary to agree upon a set of benchmarks that everyone is happy with, as described in milestone 1.

Midterm Evaluation (July 12)
============================

In this period I want to merge the benchmarks and implement a way to execute them.

Final Evaluation (Aug 16)
=========================

In this period the benchmark suite will be ported. If everything works out perfectly I will even have some time left; if there are problems, I have a buffer here.

Implementation of the Benchmark Runner
======================================

In order to run the benchmarks I propose a simple application which can be configured to download multiple interpreters, build them and execute the benchmarks. The configuration could be similar to that of tox[6], and downloads of the interpreters could be handled using anyvc[7]. For a site such as http://speed.pypy.org, a cronjob, a buildbot or whatever else is preferred could be set up to execute the application regularly.

Repository Handling
===================

The code for the project will be developed in a Mercurial[8] repository hosted on Bitbucket[9]. Both PyPy and CPython use Mercurial, and most people in the Python community should be able to use it.

Probably Asked Questions
========================

Why not use one of the existing benchmark suites for porting?

The effort will be wasted if there is no good base to build upon. Creating a new benchmark suite based upon the existing ones ensures that there is one.

Why not use Git/Bazaar/...?

Mercurial is used by CPython and PyPy, and is fairly well known and used in the Python community. This ensures easy accessibility for everyone.

What will happen with the repository after GSoC? How will access to the repository be handled?

I propose to give administrative rights to one or two representatives of each project. They will provide other developers with write access.

Communication
=============

I will report progress via Twitter[10] and my blog[11]. If desired, I can also send an email with the contents of each blog post to the mailing lists of the implementations. Furthermore, I am usually quick to answer via IRC (DasIch on freenode), Twitter or e-mail (dasdasich@gmail.com) if anyone has any questions. The mentor can reach me via the means mentioned above or via Skype.

About Me
========

My name is Daniel Neuhäuser. I am 19 years old and currently a student at the Bergstadt-Gymnasium Lüdenscheid[12]. I started programming (with Python) about 4 years ago and became a member of the Pocoo Team[13] after successfully participating in the Google Summer of Code last year. During that project I ported Sphinx[14] to Python 3.x and implemented an algorithm to diff abstract syntax trees in order to preserve comments and translated strings, which has been used by the other GSoC projects targeting Sphinx.

.. [1]: https://bitbucket.org/pypy/benchmarks/src
.. [2]: http://code.google.com/p/unladen-swallow/
.. [3]: http://hg.python.org/benchmarks/file/tip/performance
.. [4]: http://hg.python.org/benchmarks/file/62e754c57a7f/performance/README
.. [5]: http://docs.python.org/library/2to3.html
.. [6]: http://codespeak.net/tox/
.. [7]: http://anyvc.readthedocs.org/en/latest/?redir
.. [8]: http://mercurial.selenic.com/
.. [9]: https://bitbucket.org/
.. [10]: http://twitter.com/#!/DasIch
.. [11]: http://dasdasich.blogspot.com/
.. [12]: http://bergstadt-gymnasium.de/
.. [13]: http://www.pocoo.org/team/#daniel-neuhauser
.. [14]: http://sphinx.pocoo.org/
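
P.S.: To make the section "Implementation of the Benchmark Runner" a little more concrete, here is a minimal sketch of what the core loop of such an application might look like. It is only an illustration under several assumptions: the INI layout (sections ``interpreters`` and ``benchmarks``) is a hypothetical format loosely inspired by tox, the function names are made up for this example, and the interpreters are assumed to be already built and available as executables. Downloading and building them from their repositories, as well as uploading results to codespeed, is left out entirely.

```python
import configparser
import subprocess
import time


def load_config(text):
    """Parse hypothetical INI text into (interpreters, benchmarks) dicts.

    Both dicts map a short name to a path: the interpreter executable
    or the benchmark script, respectively.
    """
    # Interpolation is disabled so that '%' in paths is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return dict(parser["interpreters"]), dict(parser["benchmarks"])


def time_benchmark(executable, script, repeat=3):
    """Run `script` with `executable` several times.

    Returns the best wall-clock time in seconds. This includes
    interpreter start-up, which a real runner would want to factor out.
    """
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        subprocess.run(
            [executable, script],
            check=True,                 # fail loudly if the benchmark crashes
            stdout=subprocess.DEVNULL,  # benchmark output is not needed here
        )
        timings.append(time.perf_counter() - start)
    return min(timings)


def run_all(interpreters, benchmarks, repeat=3):
    """Time every benchmark on every interpreter.

    Returns a dict keyed by (interpreter name, benchmark name).
    """
    results = {}
    for interp_name, executable in interpreters.items():
        for bench_name, script in benchmarks.items():
            results[(interp_name, bench_name)] = time_benchmark(
                executable, script, repeat
            )
    return results
```

A configuration listing, say, a ``cpython`` and a ``pypy`` entry under ``[interpreters]`` and a few scripts under ``[benchmarks]`` would be passed through ``load_config`` and then ``run_all``. Taking the minimum over several runs is just one simple way to reduce noise; the actual suite may well settle on a different measurement strategy.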