[Numpy-discussion] Rebinning numpy array
Johannes Bauer
dfnsonfsduifb at gmx.de
Sun Nov 13 11:04:07 EST 2011
Hi group,
I have a rather simple problem, or so it would seem. However I cannot
seem to find the right solution. Here's the problem:
A Geiger counter measures counts in distinct time intervals. The time
intervals are not of constant length. Imagine, for example, that the
counter would always create a table entry when the counts reach 10. Then
we would have the following bins (made-up data for illustration):
Seconds      Counts   Len   CPS
  0 -  44    10       44    0.23
 44 - 120    10       76    0.13
120 - 140    10       20    0.5
140 - 200    10       60    0.16
So we have n bins (in this example 4), but they're not equidistant. I
want to rebin samples to make them equidistant. For example, I would
like to rebin into 5 bins of 40 seconds each. The rebinned example would
then be (calculated by hand, so it might contain errors):
0-40 9.09
40-80 5.65
80-120 5.26
120-160 13.33
160-200 6.66
That means, if a destination bin completely covers a source bin, that
source bin's complete count is taken. If it overlaps only partially, the
count is taken in proportion to the overlapping fraction of the source
bin's length (linear interpolation over bin size).
It is very important that the total count stays the same (in this case
40; I checked that my numbers add up to that). In this example I
increased the bin size, but usually I will want to decrease the bin size
(even dramatically).
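The rule described above can be sketched with cumulative counts. This is a minimal illustration, not a tested production routine: it assumes the counts are spread uniformly inside each source bin, so the running total is piecewise linear between the source edges. The function name rebin_counts and its parameter names are hypothetical; numpy.cumsum, numpy.interp and numpy.diff are the only library calls used.

```python
import numpy as np

def rebin_counts(src_edges, src_counts, dst_edges):
    """Redistribute counts from source bins onto destination bins.

    Assumption: counts are uniformly distributed within each source bin.
    """
    src_edges = np.asarray(src_edges, dtype=float)
    src_counts = np.asarray(src_counts, dtype=float)
    dst_edges = np.asarray(dst_edges, dtype=float)
    # Running total of counts at every source edge, starting at zero.
    cumulative = np.concatenate(([0.0], np.cumsum(src_counts)))
    # Evaluate the piecewise-linear running total at the new edges.
    cumulative_at_dst = np.interp(dst_edges, src_edges, cumulative)
    # Counts in each destination bin are differences of the running total.
    return np.diff(cumulative_at_dst)

# The made-up example from the table above.
src_edges = [0, 44, 120, 140, 200]
src_counts = [10, 10, 10, 10]
dst_edges = np.linspace(0, 200, 6)  # five bins of 40 seconds each

rebinned = rebin_counts(src_edges, src_counts, dst_edges)
print(np.round(rebinned, 2))
print(rebinned.sum())
```

Because each destination bin is a difference of the same running total, the sum telescopes to the total count as long as the destination edges span the whole source range. The results agree with the hand-calculated values to within rounding. numpy.interp holds the end values constant outside the source range, so destination bins lying entirely outside it receive zero counts.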
Now my pathetic attempts look something like this:
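import datetime
import time
import numpy

interpolation_points = 4000
# Convert the datetime x values to timestamps in seconds.
xpts = [ time.mktime(x.timetuple()) for x in self.getx() ]
# Evaluate the linearly interpolated curve on an evenly spaced time grid.
interpolatedx = numpy.linspace(xpts[0], xpts[-1], interpolation_points)
interpolatedy = numpy.interp(interpolatedx, xpts, self.gety())
self._xreformatted = [ datetime.datetime.fromtimestamp(x) for x in interpolatedx ]
self._yreformatted = interpolatedy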
This works somewhat, however I see artifacts depending on the
destination sample size: for example when I have a spike in the sample
input and reduce the number of interpolation points (i.e. increase
destination bin size) slowly, the spike will get smaller and smaller
(expected behaviour). After increasing the bin size further, however,
the spike "magically" reappears. I believe this to be an interpolation
artifact.
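One plausible explanation, offered as an assumption since the real data are not shown: numpy.interp only evaluates the piecewise-linear curve at the grid points, so a narrow spike is seen only when a grid point happens to land on it. The sketch below uses made-up data (a flat signal with a single one-sample spike), and the helper name peak_after_resampling is hypothetical.

```python
import numpy as np

# Made-up data: a flat signal at 1-second spacing with a single
# one-sample spike at t = 50 s.
xpts = np.arange(0.0, 101.0)
ypts = np.zeros_like(xpts)
ypts[50] = 100.0

def peak_after_resampling(n_points):
    """Largest value seen after point-wise resampling onto n_points."""
    grid = np.linspace(xpts[0], xpts[-1], n_points)
    return np.interp(grid, xpts, ypts).max()

# Reduce the number of interpolation points step by step.
peaks = {n: peak_after_resampling(n) for n in (11, 8, 5, 4, 3)}
for n, peak in peaks.items():
    print(n, peak)
```

With this data the spike appears at full height for 11, 5 and 3 points, where a grid point falls exactly on t = 50, and vanishes for 8 and 4 points, where none does. The peak height therefore depends on grid alignment rather than decreasing steadily with the number of points.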
Is there some standard way to get from a non-uniformly distributed bin
distribution to a uniformly distributed bin distribution of arbitrary
bin width?
Best regards,
Joe