[Speed] Are benchmarks and libraries mutable?

Sun Sep 2 09:39:27 CEST 2012

On Sun, Sep 2, 2012 at 12:10 AM, Brett Cannon <brett at python.org> wrote:
>
>
> On Sat, Sep 1, 2012 at 2:57 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>>
>> On Sat, 1 Sep 2012 13:21:36 -0400
>> Brett Cannon <brett at python.org> wrote:
>> >
>> > One is moving benchmarks from PyPy over to the unladen repo on
>> > hg.python.org/benchmarks. But I wanted to first make sure people don't
>> > view the benchmarks as immutable (e.g. as Octane does:
>> > https://developers.google.com/octane/faq). Since the benchmarks are
>> > always relative between two interpreters their immutability isn't
>> > critical compared to if we were to report some overall score. But it
>> > also means that any changes made would throw off historical
>> > comparisons. For instance, if I take PyPy's Mako benchmark (which does
>> > a lot more work), should it be named mako_v2, or should we just
>> > replace mako wholesale?
>>
>> mako_v2 sounds fine to me. Mutating benchmarks makes things confusing:
>> one person may report that interpreter A is faster than interpreter B
>> on a given benchmark, and another person retort that no, interpreter B
>> is faster than interpreter A.
>>
>> Besides, if you want to have useful timelines on speed.p.o, you
>> definitely need stable benchmarks.
>>
>> > And the second is the same question for libraries. For instance, the
>> > unladen benchmarks have Django 1.1a0 as the version which is rather
>> > ancient. And with 1.5 coming out with provisional Python 3 support I
>> > obviously would like to update it. But the same questions as with
>> > benchmarks crop up in reference to immutability.
>>
>> django_v2 sounds fine too :)
>
>
> True, but having to carry around multiple copies of libraries just
> becomes a pain.

You just kill django when you introduce django_v2 (alternatively you
remove the history and keep the name django). Outdated historical
benchmarks are not as interesting.

>
>>
>>
>> > (e.g. I will have to probably update the 2.7 code to use
>> > io.BytesIO instead of StringIO.StringIO to be on more equal footing).
>>
>> I disagree. If io.BytesIO is faster than StringIO.StringIO then it's
>> normal for the benchmark results to reflect that (ditto if it's slower).
>>
>> > If we can't find a reasonable way to handle all of this then what I
>> > will do is branch the unladen benchmarks for 2.x/3.x benchmarking, and
>> > then create another branch of the benchmark suite to just be for
>> > Python 3.x so that we can start fresh with a new set of benchmarks
>> > that will never change themselves for benchmarking Python 3 itself.
>>
>> Why not simply add Python 3-specific benchmarks to the mix?
>> You can then create a "py3" benchmark suite in perf.py (and perhaps
>> also a "py2" one).
>
>
> To avoid historical baggage and to start from a clean slate. I don't
> necessarily want to carry around Python 2 benchmarks forever. It's not a
> massive concern, just a nicety.

If you guys want to have any cooperation with us, you have to carry
Python 2 benchmarks for an indefinite amount of time.

Cheers,
fijal