[pypy-dev] memory recording... Re: speed.pypy.org quick update

Miquel Torres tobami at googlemail.com
Sun Mar 14 14:14:25 CET 2010


Yeah, we really need to sit down and find an acceptable way to measure and
save memory consumption data.


2010/3/12 René Dudfield <renesd at gmail.com>

> Hi again,
>
> trying to do some research on ways to record memory usage in an X-platform
> way...
>
> keeping my notes here:
>
> http://renesd.blogspot.com/2010/03/memory-usage-of-processes-from-python.html
>
> So far people have come up with these two useful projects:
>     http://code.google.com/p/psutil/
>     http://code.google.com/p/pympler/
>
> I think psutil has most of the info needed to construct a decent memory
> recording module for benchmarks.  However, it includes C code, so we will
> probably have to rip some of the memory parts out, and maybe reimplement
> them with ctypes.
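>
> Just to sketch the kind of wrapper I have in mind (the method name follows
> psutil's current docs and may differ between releases, so treat it as an
> assumption rather than tested code):
>
>     import os
>     import psutil  # http://code.google.com/p/psutil/
>
>     def rss_bytes(pid=None):
>         """Resident set size of a process, in bytes."""
>         proc = psutil.Process(pid if pid is not None else os.getpid())
>         # memory_info() reports rss and vms; older releases call it get_memory_info()
>         return proc.memory_info().rss
>
>     print("current RSS: %d bytes" % rss_bytes())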
>
>
> cu,
>
>
>
>
> On Fri, Mar 12, 2010 at 10:49 AM, René Dudfield <renesd at gmail.com> wrote:
>
>> btw, for Python memory usage on Linux there is
>> /proc/PID/status
>>
>> Here is some code for Linux...
>>
>> wget http://rene.f0o.com/~rene/stuff/memory_usage.py
>>
>> >>> import memory_usage
>> >>> bytes_of_resident_memory = memory_usage.resident()
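>>
>> (Roughly, resident() just parses VmRSS out of /proc/self/status; this is a
>> sketch of the idea, not necessarily the exact contents of memory_usage.py:)
>>
>>     def _status_field(field, pid="self"):
>>         # /proc/<pid>/status has lines like "VmRSS:     1234 kB"
>>         with open("/proc/%s/status" % pid) as f:
>>             for line in f:
>>                 if line.startswith(field + ":"):
>>                     return int(line.split()[1]) * 1024  # value is given in kB
>>         raise ValueError("field %r not found" % field)
>>
>>     def resident(pid="self"):
>>         """Resident set size of a process in bytes (Linux only)."""
>>         return _status_field("VmRSS", pid)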
>>
>>
>> Should be easy enough to add that to benchmarks at the start and end?
>> Maybe calling it in the middle would be a little harder... but not too hard.
>>
>>
>> TODO: it would need updating for other platforms, support for measuring
>> child processes, tests, and code cleanup :)
>>
>> cu,
>>
>>
>>
>>
>> On Thu, Mar 11, 2010 at 12:32 AM, Maciej Fijalkowski <fijall at gmail.com> wrote:
>>
>>> Hey.
>>>
>>> I'll answer the questions that are relevant to the benchmarks themselves
>>> and not to running them.
>>>
>>> On Wed, Mar 10, 2010 at 4:45 PM, Bengt Richter <bokr at oz.net> wrote:
>>> > On 03/10/2010 12:14 PM Miquel Torres wrote:
>>> >> Hi!
>>> >>
>>> >> I wanted to explain a couple of things about the speed website:
>>> >>
>>> >> - New feature: the Timeline view now defaults to a plot grid, showing
>>> >> all benchmarks at the same time. It was a feature request made more
>>> >> than once, so depending on personal tastes, you can bookmark either
>>> >> /overview/ or /timeline/. Thanks go to nsf for helping with the
>>> >> implementation.
>>> >> - The code has now moved to github as Codespeed, a benchmark
>>> >> visualization framework (http://github.com/tobami/codespeed)
>>> >> - I have updated speed.pypy.org with version 0.3. Much of the work has
>>> >> been under the hood to make it feasible for other projects to use
>>> >> codespeed as a framework.
>>> >>
>>> >> For those interested in further development you can go to the releases
>>> >> wiki (still a work in progress):
>>> >> http://wiki.github.com/tobami/codespeed/releases
>>> >>
>>> >> Next in line are some DB changes to be able to save standard
>>> >> deviation data and the like. Long-term goals besides world domination
>>> >> are integration with buildbot and similarly unrealistic things.
>>> >> Feedback is always welcome.
>>> >
>>> > Nice-looking stuff. But a couple of comments:
>>> >
>>> > 1. IMO standard deviation is too often worse than useless, since it hides
>>> >    the true nature of the distribution. I think the assumption of normalcy
>>> >    is highly suspect for benchmark timings, and pruning may hide
>>> >    interesting clusters.
>>> >
>>> >    I prefer to look at scattergrams, where things like clustering and
>>> >    correlations are easily apparent to the eye, as well as the amount of
>>> >    data (assuming a good mapping of density to visuals).
>>>
>>> That's true. In general, a benchmark run over time is a period of warmup,
>>> during which the JIT compiles assembler, followed by a steady state that can
>>> be described by an average and a standard deviation. Personally I would like
>>> to have those three measures separated, but we haven't implemented that yet
>>> (it also involves some interesting statistical questions). The standard
>>> deviation is useful for telling whether a difference after a certain checkin
>>> was meaningful or just noise.
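>>>
>>> (As a back-of-the-envelope illustration of that split, assuming we simply
>>> declare the first few iterations to be warmup; just a sketch, nothing that
>>> is implemented in the runner yet:)
>>>
>>>     import math
>>>
>>>     def summarize(timings, warmup_runs=5):
>>>         # naive split: treat the first warmup_runs iterations as JIT warmup
>>>         warmup, steady = timings[:warmup_runs], timings[warmup_runs:]
>>>         mean = sum(steady) / len(steady)
>>>         # sample standard deviation of the steady-state iterations
>>>         var = sum((t - mean) ** 2 for t in steady) / (len(steady) - 1)
>>>         return sum(warmup), mean, math.sqrt(var)
>>>
>>>     warmup_time, average, std_dev = summarize([4.1, 2.3, 1.2, 1.1, 1.1, 1.0, 1.1])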
>>>
>>> >
>>> > 2. IMO benchmark timings are like travel times, comparing different vehicles.
>>> >    (pypy with jit being a vehicle capable of dynamic self-modification ;-)
>>> >    E.g., which part of travel from Stockholm to Paris would you concentrate
>>> >    on improving to improve the overall result? How about travel from
>>> >    Brussels to Paris? Or Paris to Sydney? ;-P Different things come into
>>> >    play in different benchmarks/trips. A Porsche Turbo and a 2CV will both
>>> >    have to wait for a ferry, if that's part of the trip.
>>> >
>>> >    IOW, it would be nice to see total time broken down somehow, to see
>>> >    what's really happening.
>>>
>>> I couldn't agree more with that. We already collect split times when we run
>>> benchmarks by hand, but they're not yet integrated into the whole nightly
>>> run. Total time is what users see, though, which is why our public site
>>> focuses on that. I want more information to be available, but we have only a
>>> limited amount of manpower, and Miquel has already done quite an amazing job
>>> in my opinion :-) We'll probably go into more detail.
>>>
>>> The part we want to focus on post-release is speeding up certain parts of
>>> tracing, as well as limiting its GC pressure. As you can see, the split
>>> would be very useful for our development.
>>>
>>> >
>>> >    Don't get me wrong, the total times are certainly useful indicators
>>> >    of progress (which has been amazing).
>>> >
>>> > 3. Speed is ds/dt and you are showing the integral of dt/ds over the trip
>>> >    distance to get time. A 25% improvement in total time is not a 25%
>>> >    improvement in speed. I.e. (if you define improvement as a percentage
>>> >    change in a desired direction), for e.g. 25%:
>>> >    distance/(0.75*time) != 1.25*(distance/time).
>>> >
>>> >    IMO 'speed' (the implication to me in the name speed.pypy.org) would be
>>> >    benchmarks/time more appropriately than time/benchmark.
>>> >
>>> >    Both measures are useful, but time percentages are easy to
>>> >    mis{use,construe} ;-)
>>>
>>> That's correct.
>>>
>>> Benchmarks are in general very easy to lie about and they're by
>>> definition flawed. That's why I always include raw data when I publish
>>> stuff on the blog, so people can work on it themselves.
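>>>
>>> (To make the arithmetic concrete: cutting the time by 25% is roughly a 33%
>>> gain in speed, not 25%:)
>>>
>>>     distance, time = 1.0, 4.0
>>>     old_speed = distance / time            # 0.25
>>>     new_speed = distance / (0.75 * time)   # total time reduced by 25%
>>>     print(new_speed / old_speed - 1.0)     # ~0.333, i.e. about 33% faster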
>>>
>>> >
>>> > 4. Is there any memory footprint data?
>>> >
>>>
>>> No. Memory measurement is hard, and it's even less useful without a
>>> breakdown. Those particular benchmarks are not a very good basis for memory
>>> measurement - in the case of PyPy you would mostly observe the default
>>> allocated memory (which is roughly 10M for the interpreter + 16M for the
>>> semispace GC + the cache for the nursery).
>>>
>>> Also, our GC is of a kind that can run faster if you give it more memory
>>> (not that we use this feature, but it's possible).
>>>
>>> Cheers,
>>> fijal
>>> _______________________________________________
>>> pypy-dev at codespeak.net
>>> http://codespeak.net/mailman/listinfo/pypy-dev
>>>
>>
>>
>
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>

