[pypy-dev] memory recording... Re: speed.pypy.org quick update

René Dudfield renesd at gmail.com
Fri Mar 12 14:52:02 CET 2010


Hi again,

I'm trying to do some research on ways to record memory usage in a
cross-platform way...

keeping my notes here:

http://renesd.blogspot.com/2010/03/memory-usage-of-processes-from-python.html

So far people have come up with these two useful projects:
    http://code.google.com/p/psutil/
    http://code.google.com/p/pympler/

I think psutil has most of the info needed to construct a decent memory
recording module for benchmarks.  However, it includes C code, so we will
probably have to rip the memory parts out and maybe reimplement them with
ctypes.
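
As a rough sketch of what that module could look like (psutil's method
names have changed between versions, so treat the exact call as an
assumption to double-check):

    import os
    import psutil

    def resident_memory(pid=None):
        """Resident set size (RSS) of a process, in bytes."""
        # psutil hides the per-platform details (/proc on Linux,
        # task_info() on OS X, GetProcessMemoryInfo() on Windows).
        proc = psutil.Process(pid if pid is not None else os.getpid())
        return proc.get_memory_info()[0]  # (rss, vms) -> rss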


cu,




On Fri, Mar 12, 2010 at 10:49 AM, René Dudfield <renesd at gmail.com> wrote:

> btw, for python memory usage on linux
> /proc/PID/status
>
> Here is some code for linux...
>
> wget http://rene.f0o.com/~rene/stuff/memory_usage.py
>
> >>> import memory_usage
> >>> bytes_of_resident_memory = memory_usage.resident()
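>
> Roughly, a minimal version of resident() just parses the VmRSS line out
> of /proc/PID/status (a sketch, not necessarily what the script does
> verbatim):
>
>     def resident(pid='self'):
>         """Resident memory of a process, in bytes. Linux only."""
>         for line in open('/proc/%s/status' % pid):
>             if line.startswith('VmRSS:'):
>                 # the line looks like 'VmRSS:     1234 kB'
>                 return int(line.split()[1]) * 1024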
>
>
> It should be easy enough to call that at the start and end of each
> benchmark. Calling it in the middle would be a little harder... but not
> too hard.
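>
> For example, something like this (run_with_memory is a hypothetical
> wrapper, just to show the shape of it):
>
>     def run_with_memory(benchmark):
>         """Run a benchmark callable, recording resident memory around it."""
>         before = memory_usage.resident()
>         result = benchmark()
>         after = memory_usage.resident()
>         return result, before, after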
>
>
> TODO: Would need to be updated for other platforms, and support measuring
> child processes, tests, and code cleanup :)
>
> cu,
>
>
>
>
> On Thu, Mar 11, 2010 at 12:32 AM, Maciej Fijalkowski <fijall at gmail.com> wrote:
>
>> Hey.
>>
>> I'll answer questions that are relevant to benchmarks themselves and
>> not running.
>>
>> On Wed, Mar 10, 2010 at 4:45 PM, Bengt Richter <bokr at oz.net> wrote:
>> > On 03/10/2010 12:14 PM Miquel Torres wrote:
>> >> Hi!
>> >>
>> >> I wanted to explain a couple of things about the speed website:
>> >>
>> >> - New feature: the Timeline view now defaults to a plot grid, showing
>> >> all benchmarks at the same time. It was a feature request made more
>> >> than once, so depending on personal tastes, you can bookmark either
>> >> /overview/ or /timeline/. Thanks go to nsf for helping with the
>> >> implementation.
>> >> - The code has now moved to github as Codespeed, a benchmark
>> >> visualization framework (http://github.com/tobami/codespeed)
>> >> - I have updated speed.pypy.org with version 0.3. Much of the work has
>> >> been under the hood to make it feasible for other projects to use
>> >> codespeed as a framework.
>> >>
>> >> For those interested in further development you can go to the releases
>> >> wiki (still a work in progress):
>> >> http://wiki.github.com/tobami/codespeed/releases
>> >>
>> >> Next in line are some DB changes to be able to save standard
>> >> deviation data and the like. Long term goals besides world domination
>> >> are integration with buildbot and similarly unrealistic things.
>> >> Feedback is always welcome.
>> >
>> > Nice looking stuff. But a couple comments:
>> >
>> > 1. IMO standard deviation is too often worse than useless, since it
>> >    hides the true nature of the distribution. I think the assumption
>> >    of normality is highly suspect for benchmark timings, and pruning
>> >    may hide interesting clusters.
>> >
>> >    I prefer to look at scattergrams, where things like clustering and
>> >    correlations are easily apparent to the eye, as well as the amount
>> >    of data (assuming a good mapping of density to visuals).
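>> >
>> >    (With matplotlib, for instance, eyeballing the raw runs takes only
>> >    a few lines -- `times` here is a hypothetical list of per-run
>> >    timings:
>> >
>> >        import matplotlib.pyplot as plt
>> >        plt.scatter(range(len(times)), times, s=8)
>> >        plt.xlabel('run number'); plt.ylabel('seconds')
>> >        plt.show()
>> >
>> >    and warmup tails and clusters jump right out.)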
>>
>> That's true. In general a benchmark run consists of a warmup period,
>> when the JIT compiles assembler, followed by a steady state that can be
>> described by an average and standard deviation. Personally I would like
>> to have those three measures separated, but haven't implemented that
>> yet (it also involves some interesting statistical questions). Standard
>> deviation is useful for telling whether a difference after a certain
>> checkin was meaningful or just noise.
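>>
>> A rough sketch of the kind of split I mean (a naive cutoff: call
>> everything up to the last run that is, say, 20% above the steady-state
>> estimate "warmup"; `timings` is a hypothetical list of per-run times):
>>
>>     def split_warmup(timings, tolerance=1.2):
>>         tail = sorted(timings[len(timings) // 2:])
>>         steady = tail[len(tail) // 2]          # median of second half
>>         cut = 0
>>         for i, t in enumerate(timings):
>>             if t > steady * tolerance:
>>                 cut = i + 1                    # last "slow" run seen
>>         return timings[:cut], timings[cut:]    # warmup, steady state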
>>
>> >
>> > 2. IMO benchmark timings are like travel times, comparing different
>> >    vehicles (pypy with jit being a vehicle capable of dynamic
>> >    self-modification ;-).
>> >    E.g., which part of travel from Stockholm to Paris would you
>> >    concentrate on improving to improve the overall result? How about
>> >    travel from Brussels to Paris? Or Paris to Sydney? ;-P Different
>> >    things come into play in different benchmarks/trips. A Porsche
>> >    Turbo and a 2CV will both have to wait for a ferry, if that's part
>> >    of the trip.
>> >
>> >    IOW, it would be nice to see total time broken down somehow, to
>> >    see what's really happening.
>>
>> I can't agree more with that. We already split out times when we run
>> benchmarks by hand, but that's not yet integrated into the nightly
>> run. Total time is what users see, though, which is why our public
>> site focuses on that. I want more information available, but we have
>> only a limited amount of manpower, and Miquel has already done an
>> amazing job in my opinion :-) We'll probably go into more detail.
>>
>> The part we want to focus on after the release is speeding up certain
>> parts of tracing as well as limiting its GC pressure. As you can see,
>> the split would be very useful for our development.
>>
>> >
>> >    Don't get me wrong, the total times are certainly useful
>> >    indicators of progress (which has been amazing).
>> >
>> > 3. Speed is ds/dt, and you are showing the integral of dt/ds over
>> >    the trip distance to get time. A 25% improvement in total time is
>> >    not a 25% improvement in speed. I.e. (if you define improvement
>> >    as a percentage change in a desired direction), for e.g. 25%:
>> >    distance/(0.75*time) != 1.25*(distance/time).
>> >
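>> >    To make that concrete: cutting the time by 25% multiplies speed
>> >    by 1/0.75, i.e. a ~33% speed increase, not 25%:
>> >
>> >    >>> time_improvement = 0.25
>> >    >>> 1 / (1 - time_improvement)
>> >    1.3333333333333333
>> >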
>> >    IMO 'speed' (the implication to me in the name speed.pypy.org)
>> >    would more appropriately be benchmarks/time than time/benchmark.
>> >
>> >    Both measures are useful, but time percentages are easy to
>> >    mis{use,construe} ;-)
>>
>> That's correct.
>>
>> Benchmarks are in general very easy to lie about, and they're by
>> definition flawed. That's why I always include the raw data when I
>> publish stuff on the blog, so people can work with it themselves.
>>
>> >
>> > 4. Is there any memory footprint data?
>> >
>>
>> No. Memory measurement is hard, and it's even less useful without a
>> breakdown. These particular benchmarks are not a very good basis for
>> memory measurement - in the case of pypy you would mostly observe the
>> default allocated memory (which is roughly 10M for the interpreter +
>> 16M for the semispace GC + a cache for the nursery).
>>
>> Also, our GC is of a kind that can run faster if you give it more
>> memory (not that we use this feature, but it's possible).
>>
>> Cheers,
>> fijal
>>
>
>