Send pypy-dev mailing list submissions to
pypy-dev@codespeak.net
To subscribe or unsubscribe via the World Wide Web, visit
http://codespeak.net/mailman/listinfo/pypy-dev
or, via email, send a message with subject or body 'help' to
pypy-dev-request@codespeak.net
You can reach the person managing the list at
pypy-dev-owner@codespeak.net
When replying, please edit your Subject line so it is more specific
than "Re: Contents of pypy-dev digest..."
Today's Topics:
1. PyPy 1.3 released (Maciej Fijalkowski)
2. Re: New speed.pypy.org version (Miquel Torres)
3. Re: PyPy 1.3 released (Armin Rigo)
4. PyPy Master thesis sandboxing (Carl Friedrich Bolz)
5. Re: PyPy Master thesis sandboxing (Maciej Fijalkowski)
6. Re: PyPy Master thesis sandboxing (Carl Friedrich Bolz)
7. Re: PyPy Master thesis sandboxing (Maciej Fijalkowski)
8. Re: PyPy Master thesis sandboxing (Søren Laursen)
----------------------------------------------------------------------
Message: 1
Date: Fri, 25 Jun 2010 17:27:52 -0600
From: Maciej Fijalkowski <fijall@gmail.com>
Subject: [pypy-dev] PyPy 1.3 released
To: PyPy Dev <pypy-dev@codespeak.net>, "<python-dev@python.org>"
<python-dev@python.org>, python-announce@python.org
Message-ID:
<AANLkTikaN3p6BNFUfXL4RlWB28ZwjrajeUb34r8SGvdy@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
=======================
PyPy 1.3: Stabilization
=======================
Hello.
We're pleased to announce the release of PyPy 1.3. This release has two major
improvements. First of all, we have stabilized the JIT compiler since the 1.2
release, answered user issues, fixed bugs, and generally improved speed.
We're also pleased to announce alpha support for loading CPython extension
modules written in C. While the main purpose of this release is increased
stability, this feature is in the alpha stage and is not yet suited for
production environments.
Highlights of this release
==========================
* We introduced support for CPython extension modules written in C. As of
  now, this support is in alpha, and it is very unlikely that unaltered C
  extensions will work out of the box, due to missing functions or
  refcounting details. The support is disabled by default, so you have to
  do::

    import cpyext

  before trying to import any .so file. Also, libraries are source-compatible
  but not binary-compatible. That means you need to recompile binaries, using
  for example::

    python setup.py build

  Details may vary, depending on your build system. Make sure you include the
  import cpyext line at the beginning of setup.py or put it in your
  PYTHONSTARTUP. This is an alpha feature: it will likely segfault. You have
  been warned! (A short illustrative sketch follows this list.)
* JIT bugfixes. A lot of bugs reported for the JIT have been fixed, and its
  stability has greatly improved since the 1.2 release.
* Various small improvements have been added to the JIT code, as well as a
  great speedup of compilation time.
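For illustration, here is a minimal sketch of the import order described in
the first highlight, assuming a hypothetical extension module named _demo
that has already been rebuilt from source as shown above::

    import cpyext    # must come first: enables the CPython C-API layer
    import _demo     # hypothetical CPython extension, recompiled for PyPy
    print _demo.__doc__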
Cheers,
Maciej Fijalkowski, Armin Rigo, Alex Gaynor, Amaury Forgeot d'Arc and
the PyPy team
------------------------------
Message: 2
Date: Sat, 26 Jun 2010 09:16:52 +0200
From: Miquel Torres <tobami@googlemail.com>
Subject: Re: [pypy-dev] New speed.pypy.org version
To: Paolo Giarrusso <p.giarrusso@gmail.com>
Cc: pypy-dev <pypy-dev@codespeak.net>
Message-ID:
<AANLkTilgNm7a3o9pSUDgbHwxJ2bNqsYNajUK21JnFBT7@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"
Hi Paolo,
well, you are right of course. I had forgotten about the real problem, which
you actually demonstrate quite well with your CPython and pypy-c case:
depending on the normalization you can make any stacked series look faster
than the others.
I will have a look at the literature and modify normalized stacked plots
accordingly.
Thanks for taking the time to explain things in such detail.
Regards,
Miquel
2010/6/25 Paolo Giarrusso <p.giarrusso@gmail.com>
On Fri, Jun 25, 2010 at 19:08, Miquel Torres <tobami@googlemail.com> wrote:
Hi Paolo,
I am aware of the problem with calculating benchmark means, but let me
explain my point of view.
You are correct in that it would be preferable to have absolute times. Well,
you actually can, but see what happens:
http://speed.pypy.org/comparison/?hor=true&bas=none&chart=stacked+bars
Ahah! I didn't notice that I could skip normalization! This does not
fully invalidate my point, however.
Absolute values would only work if we had carefully chosen benchmark
runtimes to be very similar (for our cpython baseline). As it is, html5lib,
spitfire and spitfire_cstringio completely dominate the cummulative time.
I acknowledge that (btw, it should be cumulative time, with one 'm',
both here and in the website).
And not because the interpreter is faster or slower, but because the
benchmark was arbitrarily designed to run that long. Any improvement in the
long-running benchmarks will carry much more weight than in the short-running
ones.
What is more useful is to have comparable slices of time so that the
improvements can be seen relatively over time.
If you want to sum up times (but at this point, I see no reason for
it), you should rather have externally derived weights, as suggested
by the paper (in Rule 3).
As soon as you take weights from the data, much of the maths you need stops
working - that's generally true in many cases in statistics.
And the only sensible way to get external weights is to gather them from
real-world programs. Since that's not going to happen easily, just stick with
the geometric mean. Or set an arbitrarily low weight, manually, without any
math, so that the long-running benchmarks stop dominating the rest. It's no
fraud, since the current graph is less valid anyway.
Normalizing does that, I think.
Not really.
It just says: we have 21 tasks which take 1 second to run each on
interpreter X (cpython in the default case). Then we see how other
executables compare to that. What would the geometric mean achieve here,
exactly, for the end user?
You actually need the geomean to do that. Don't forget that the
geomean is still a mean: it's a mean performance ratio which averages
individual performance ratios.
If PyPy's geomean is 0.5, it means that PyPy is going to run that task in
10.5 seconds instead of 21. To me, this sounds exactly like what you want to
achieve. Moreover, it actually works, unlike what you use.
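To make that concrete, here is a small sketch with invented timings (not the
actual speed.pypy.org data), treating the geomean as a mean of per-benchmark
performance ratios:

    # Invented times, in seconds; cpython is the normalization baseline.
    cpython = [1.0, 1.0, 1.0]
    pypy    = [0.25, 0.5, 1.0]

    ratios = [p / c for p, c in zip(pypy, cpython)]   # per-benchmark ratios
    prod = 1.0
    for r in ratios:
        prod *= r
    geomean = prod ** (1.0 / len(ratios))

    print("geometric mean ratio: %.2f" % geomean)      # 0.50
    print("predicted total: %.1f s" % (geomean * 21))  # ~10.5 s for 21 x 1 s tasks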
For instance, ignore PyPy-JIT and look only at CPython and pypy-c (no
JIT). Then change the normalization between the two:
http://speed.pypy.org/comparison/?exe=2%2B35,3%2BL&ben=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21&env=1&hor=true&bas=2%2B35&chart=stacked+bars
http://speed.pypy.org/comparison/?exe=2%2B35,3%2BL&ben=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21&env=1&hor=true&bas=3%2BL&chart=stacked+bars
With the current data, you get that in one case cpython is faster, and in
the other pypy-c is faster.
It can't happen with the geomean. This is the point of the paper.
I could even construct a normalization baseline $base such that CPython
seems faster than PyPy-JIT. Such a base should be very fast on, say, ai
(where PyPy-JIT is slower), so that $cpython.ai/$base.ai becomes 100 and
$pypyjit.ai/$base.ai becomes 200, and be very slow on the other benchmarks
(so that they disappear in the sum).
So, the only difference I see is that geomean works, arithm. mean
doesn't. That's why Real Benchmarkers use geomean.
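Here is a compact sketch of that effect, again with invented numbers rather
than the real benchmark data:

    # Invented absolute times (seconds) for two benchmarks.
    times = {'cpython': {'ai': 10.0, 'html5lib': 1.0},
             'pypy-c':  {'ai':  5.0, 'html5lib': 2.0}}

    def norm_sum(exe, base):
        # what a normalized stacked-bar chart implicitly adds up
        return sum(times[exe][b] / times[base][b] for b in times[exe])

    def geomean(exe, base):
        prod = 1.0
        for b in times[exe]:
            prod *= times[exe][b] / times[base][b]
        return prod ** (1.0 / len(times[exe]))

    for base in ('cpython', 'pypy-c'):
        print("base=%-7s sum: cpython=%.2f pypy-c=%.2f | "
              "geomean: cpython=%.2f pypy-c=%.2f"
              % (base, norm_sum('cpython', base), norm_sum('pypy-c', base),
                 geomean('cpython', base), geomean('pypy-c', base)))

With these numbers the normalized sum declares whichever executable serves as
the baseline to be the faster one, while the geomean gives the same answer (a
tie here) regardless of the baseline.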
Moreover, you are making a mistake quite common among non-physicists.
What you say makes sense under the implicit assumption that dividing
two times gives something you can use as a time. When you say "Pypy's
runtime for a 1 second task", you actually want to talk about a
performance ratio, not about the time. In the same way as when you say
"this bird runs 3 meters long in one second", a physicist would sum
that up as "3 m/s" rather than "3 m".
I am not really calculating any mean. You can see that I carefully avoided
displaying any kind of total bar, which would indeed run into the problem you
mention. That a stacked chart implicitly displays a total is something you
cannot avoid, and for that kind of chart I still think normalized results are
visually the best option.
But on a stacked bars graph, I'm not going to look at individual bars
at all, just at the total: it's actually less convenient than in
"normal bars" to look at the result of a particular benchmark.
I hope I can find guidelines against stacked plots; I have a PhD colleague
reading up on how to make graphs.
Best regards
--
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/