[pypy-dev] speed.pypy.org quick update

René Dudfield renesd at gmail.com
Fri Mar 12 11:49:54 CET 2010


btw, for Python memory usage on Linux you can read /proc/PID/status

Here is some code for Linux...

wget http://rene.f0o.com/~rene/stuff/memory_usage.py

>>> import memory_usage
>>> bytes_of_resident_memory = memory_usage.resident()
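
The real file is at the URL above; I haven't pasted it here, but the core of it
is roughly along these lines (just a sketch, parsing VmRSS out of
/proc/self/status, which the kernel reports in kB):

def resident():
    """Return the resident set size of this process in bytes (Linux only)."""
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                # the line looks like: "VmRSS:     12345 kB"
                return int(line.split()[1]) * 1024
    return 0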


It should be easy enough to add that to the benchmarks at the start and end -
see the sketch below. Maybe calling it in the middle would be a little
harder... but not too hard.
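
Something along these lines (just a sketch; benchmark_func stands in for
whatever the harness already calls per benchmark):

import time
import memory_usage

def run_with_memory(benchmark_func, *args):
    """Run one benchmark, reporting elapsed time plus resident memory
    before and after (not the actual runner code, just an illustration)."""
    mem_before = memory_usage.resident()
    start = time.time()
    result = benchmark_func(*args)
    elapsed = time.time() - start
    mem_after = memory_usage.resident()
    return result, elapsed, mem_before, mem_after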


TODO: it would need to be updated for other platforms, plus support for
measuring child processes, tests, and some code cleanup :)

cu,



On Thu, Mar 11, 2010 at 12:32 AM, Maciej Fijalkowski <fijall at gmail.com> wrote:

> Hey.
>
> I'll answer the questions that are relevant to the benchmarks themselves
> and not to running them.
>
> On Wed, Mar 10, 2010 at 4:45 PM, Bengt Richter <bokr at oz.net> wrote:
> > On 03/10/2010 12:14 PM Miquel Torres wrote:
> >> Hi!
> >>
> >> I wanted to explain a couple of things about the speed website:
> >>
> >> - New feature: the Timeline view now defaults to a plot grid, showing
> >> all benchmarks at the same time. It was a feature request made more
> >> than once, so depending on personal tastes, you can bookmark either
> >> /overview/ or /timeline/. Thanks go to nsf for helping with the
> >> implementation.
> >> - The code has now moved to github as Codespeed, a benchmark
> >> visualization framework (http://github.com/tobami/codespeed)
> >> - I have updated speed.pypy.org with version 0.3. Much of the work has
> >> been under the hood to make it feasible for other projects to use
> >> codespeed as a framework.
> >>
> >> For those interested in further development you can go to the releases
> >> wiki (still a work in progress):
> >> http://wiki.github.com/tobami/codespeed/releases
> >>
> >> Next in line are some DB changes to be able to save standard
> >> deviation data and the like. Long term goals besides world domination
> >> are integration with buildbot and similarly unrealistic things.
> >> Feedback is always welcome.
> >
> > Nice looking stuff. But a couple comments:
> >
> > 1. IMO standard deviation is too often worse than useless, since it hides
> >    the true nature of the distribution. I think the assumption of normalcy
> >    is highly suspect for benchmark timings, and pruning may hide
> >    interesting clusters.
> >
> >    I prefer to look at scattergrams, where things like clustering and
> >    correlations are easily apparent to the eye, as well as the amount of
> >    data (assuming a good mapping of density to visuals).
>
> That's true. In general a benchmark run over time consists of a warmup
> period, during which the JIT compiles assembler, followed by a steady
> state that can be described by an average and a standard deviation.
> Personally I would like to have those 3 measures separated, but I haven't
> implemented that yet (it also involves some interesting statistical
> questions). The standard deviation is useful for telling whether the
> difference from a certain checkin was meaningful or just noise.
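
(For what it's worth, once the per-run timings are collected, splitting off
the warmup and summarising the rest is only a few lines. A rough sketch,
assuming a fixed number of warmup runs rather than detecting when the JIT
has settled:)

import math

def summarize(timings, warmup_runs=3):
    """Split per-run timings into warmup and steady state; describe the
    steady state by its mean and standard deviation (sketch only)."""
    warmup, steady = timings[:warmup_runs], timings[warmup_runs:]
    mean = sum(steady) / len(steady)
    variance = sum((t - mean) ** 2 for t in steady) / len(steady)
    return sum(warmup), mean, math.sqrt(variance)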
>
> >
> > 2. IMO benchmark timings are like travel times, comparing different
> >    vehicles. (pypy with jit being a vehicle capable of dynamic
> >    self-modification ;-)
> >    E.g., which part of travel from Stockholm to Paris would you
> >    concentrate on improving to improve the overall result? How about
> >    travel from Brussels to Paris? Or Paris to Sydney? ;-P Different
> >    things come into play in different benchmarks/trips.
> >    A Porsche Turbo and a 2CV will both have to wait for a ferry, if
> >    that's part of the trip.
> >
> >    IOW, it would be nice to see total time broken down somehow, to see
> >    what's really happening.
>
> I can't agree more with that. We already do split the time when we perform
> benchmarks by hand, but that's not yet integrated into the nightly run.
> Total time is what users see though, which is why our public site focuses
> on it. I want more information to be available, but we have only a limited
> amount of manpower and Miquel has already done quite an amazing job in my
> opinion :-) We'll probably go into more detail later.
>
> The part we want to focus on post-release is speeding up certain parts
> of tracing as well as limiting its GC pressure. As you can see, the
> split would be very useful for our development.
>
> >
> >    Don't get me wrong, the total times are certainly useful indicators
> >    of progress (which has been amazing).
> >
> > 3. Speed is ds/dt and you are showing the integral of dt/ds over the
> >    trip distance to get time.
> >    A 25% improvement in total time is not a 25% improvement in speed.
> >    I.e., (if you define improvement as a percentage change in a desired
> >    direction), for e.g. 25%: distance/(0.75*time) != 1.25*(distance/time).
> >
> >    IMO 'speed' (the implication to me in the name speed.pypy.org) would
> >    be benchmarks/time more appropriately than time/benchmark.
> >
> >    Both measures are useful, but time percentages are easy to
> >    mis{use,construe} ;-)
>
> That's correct.
>
> Benchmarks are in general very easy to lie about and they're by
> definition flawed. That's why I always include raw data when I publish
> stuff on the blog, so people can work on it themselves.
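
(Just to put a number on Bengt's point 3: a benchmark that drops from 4s to 3s
is a 25% reduction in time, but the speed goes up by a factor of 4/3, i.e.
about 33%, not 25%.)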
>
> >
> > 4. Is there any memory footprint data?
> >
>
> No. Memory measurement is hard, and it's even less useful without a
> breakdown. Those particular benchmarks are not a very good basis for
> memory measurement - in the case of pypy you would mostly observe the
> default allocated memory (which is roughly 10M for the interpreter +
> 16M for the semispace GC + the cache for the nursery).
>
> Also, our GC is of a kind that can run faster if you give it more
> memory (not that we use this feature, but it's possible).
>
> Cheers,
> fijal
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>