[Numpy-discussion] histogramdd memory needs

David Huard david.huard at gmail.com
Mon Feb 4 10:07:02 EST 2008


2008/2/4, Lars Friedrich <lfriedri at imtek.de>:
>
> Hi,
>
> >> 2) Is there a way to use another algorithm (at the cost of performance)
> >> that uses less memory during calculation so that I can generate bigger
> >> histograms?
> >
> >
> > You could work through your array block by block. Simply fix the range and
> > generate a histogram for each slice of 100k data and sum them up at the
> > end.
>
> Thank you for your answer.
>
> I sliced the (original) data into blocks. However, when I do this, I
> need at least twice the memory of the whole histogram (one array for the
> temporary result and one for accumulating the total result). Assuming my
> histogram has a size of (280**3)*8 bytes, about 176 megabytes, this does
> not help, I think.
>
> What I will try next is to compute smaller parts of the big histogram
> and combine them at the end (that is, slice the histogram into blocks).
> Is this what you were recommending?


Sorry, I explained that badly. The goal is to reduce the memory footprint,
so you are right that storing each intermediate result and adding them up at
the end does not help. Instead, update the accumulated histogram in place as
soon as each block has been processed, so that a second full-size array is
not needed. I am attaching a script that does this for 1D histograms. It
comes from the pymc code base; look at the histogram function in utils.py.
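
To make the idea concrete for the N-dimensional case, here is a rough sketch.
It is not the code from utils.py; the function name accumulate_histogramdd
and its arguments are made up for illustration, and it assumes equal-width
bins over a fixed range. Each block of samples is converted to flat bin
indices, and the counts are added directly into the one full-size array, so
the extra memory scales with the block size rather than with the histogram.

```python
import numpy as np

def accumulate_histogramdd(blocks, bins, ranges, dtype=np.int64):
    """Accumulate an N-D histogram in place, one block of samples at a time.

    Hypothetical helper: blocks is an iterable of (n_samples, n_dims) arrays,
    bins gives the number of equal-width bins per dimension, and ranges is a
    sequence of (low, high) pairs, one per dimension.
    """
    bins = np.asarray(bins, dtype=np.intp)
    lo = np.array([r[0] for r in ranges], dtype=float)
    hi = np.array([r[1] for r in ranges], dtype=float)

    hist = np.zeros(tuple(bins), dtype=dtype)  # the only full-size array
    flat = hist.reshape(-1)                    # a view on hist, not a copy

    for block in blocks:
        block = np.asarray(block, dtype=float)
        # Drop samples that fall outside the fixed range.
        inside = np.all((block >= lo) & (block <= hi), axis=1)
        block = block[inside]
        # Per-dimension bin index for equal-width bins.
        idx = np.floor((block - lo) / (hi - lo) * bins).astype(np.intp)
        # Samples exactly on the upper edge belong to the last bin.
        idx = np.minimum(idx, bins - 1)
        # Collapse the N-D indices to positions in the flattened histogram.
        flat_idx = np.ravel_multi_index(tuple(idx.T), tuple(bins))
        # Count repeats within the block, then update hist in place.
        uniq, counts = np.unique(flat_idx, return_counts=True)
        flat[uniq] += counts
    return hist
```

Calling it as accumulate_histogramdd(np.array_split(data, 10), bins=(280, 280,
280), ranges=[(xmin, xmax)] * 3) would process the data in ten blocks. Samples
that land extremely close to a bin edge may be assigned differently than by
numpy.histogramdd because of floating-point rounding.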

Cheers,

David


-------------- next part --------------
A non-text attachment was scrubbed...
Name: utils.py
Type: application/octet-stream
Size: 18972 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20080204/e06d3b68/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: histogram.f
Type: text/x-fortran
Size: 8479 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20080204/e06d3b68/attachment.bin>

