In 0.61.0, when plotting a simple array with error bars, the default color of the error bars is black instead of matching the line/marker color, e.g.:

```python
errorbar([1, 2, 3, 4, 5], [3, 4, 5, 6, 7], fmt='ro', yerr=[1, 1, 1, 1, 1])
```

I prefer them to be the same, especially since the default color for markers/lines is blue, and a beginner may be surprised to see the different color. This may be related to my last posting regarding the marker edge color.

JC
Any suggestions on an efficient means to bin a 2-d array? REBIN is the IDL function I'm trying to mimic. Binning allows one to combine sets of pixels from one array to form a new array that is smaller by a given factor along each dimension. To nxm bin a 2-dimensional array, one averages (or sums, or ?) each nxm block of entries from the input image to form the corresponding entry of the output image. For example, to 2x2 bin a two-dimensional image, one would:

- average (0,0), (0,1), (1,0), (1,1) to form (0,0)
- average (0,2), (0,3), (1,2), (1,3) to form (0,1)
- ...

In case it helps, in my immediate case I'm binning a boolean array (a mask) and thus can live with almost any kind of combination.

-- Russell
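For the boolean-mask case in particular, the block combination described above can be written directly with a reshape. A minimal sketch in modern NumPy (the thread itself predates it and used numarray), assuming the array shape is an exact multiple of the bin factors:

```python
import numpy as np

# 2x2-bin a boolean mask: an output pixel is True if *any* input pixel
# in the corresponding 2x2 block is True (use .all() to require every
# pixel, or .mean() to average). Assumes the shape divides evenly.
mask = np.zeros((4, 6), dtype=bool)
mask[1, 2] = True

m, n = 2, 2
M, N = mask.shape
binned = mask.reshape(M // m, m, N // n, n).any(axis=(1, 3))
# binned has shape (2, 3); only binned[0, 1] is True
```

The reshape splits each axis into (blocks, within-block) pairs, so reducing over axes 1 and 3 combines each block in one shot with no temporary full-size array.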
On Aug 27, 2004, at 8:34 PM, Russell E. Owen wrote:
Note that a boxcar smoothing costs no more than doing the above averaging. So in numarray, you could do the following:

```python
from numarray.convolve import boxcar

b = boxcar(a, (n, n))
rebinnedarray = b[::n, ::n].copy()
```

or something like this (I haven't tried to figure out the correct offset for the slice), where one wants to rebin by a factor of n in both dimensions. We should probably add a built-in function to do this.

Perry
At 9:14 AM -0400 2004-08-30, Perry Greenfield wrote:
Thanks! Great suggestion! I think the polished version (for nxm binning) is:

```python
from numarray.convolve import boxcar

b = boxcar(a, (n, m), mode='constant', cval=0)
rebinnedarray = b[n//2::n, m//2::m].copy()
```

A rebin function would be handy, since using boxcar is a bit tricky.

-- Russell
At 9:06 AM -0700 2004-08-30, Russell E Owen wrote:
I made several mistakes, one of them very serious: the convolve boxcar cannot do the job unless the array size is an exact multiple of the bin factor. The problem is that boxcar starts in the "wrong" place. Here's an example.

Problem: rebin [0, 1, 2, 3, 4] by 2 to yield [0.5, 2.5, 4.0], where the last point (value 4) is averaged with the next point off the end, which we approximate by extending the data. (Note: my proposed "polished" version messed that up; Perry had it right.)

Using boxcar almost works, but oops: the last point is missing! What is needed is some way to make the boxcar average start later, so that it finishes by averaging 4 plus the next value off the edge of the array, e.g. a hypothetical version of boxcar with a start argument.
nd_image.boxcar_filter has an origin argument that *might* be for this purpose. Unfortunately, it is not documented, and my attempt to use it as desired failed. I have no idea if this is a bug in nd_image or a misunderstanding on my part:

```
  line 339, in boxcar_filter
    output_type = output_type)
  File "/usr/local/lib/python2.3/site-packages/numarray/nd_image/filters.py", line 280, in boxcar_filter1d
    cval, origin, output_type)
RuntimeError: shift not within filter extent
```

So, yes, a rebin function that actually worked would be a real godsend! Meanwhile, any other suggestions? Fortunately, in our application we *can* call out to IDL, but it seems a shame to have to do that.

-- Russell
On Mon, 30 Aug 2004, Russell E Owen wrote:
Seems like you got close to the answer. This gives the answer you want:
```python
>>> boxcar_filter(a, (2,), output_type=num.Float32, origin=-1)
array([ 0.5,  1.5,  2.5,  3.5,  4. ], type=Float32)
```
And so this works for rebin:
```python
>>> boxcar_filter(a, (2,), output_type=num.Float32, origin=-1)[::2].copy()
array([ 0.5,  2.5,  4. ], type=Float32)
```
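For readers following along today: numarray's nd_image lives on as scipy.ndimage, where the boxcar is `uniform_filter1d` and the shift is still called `origin`. A sketch of the same trick under that assumption:

```python
import numpy as np
from scipy.ndimage import uniform_filter1d

a = np.array([0.0, 1.0, 2.0, 3.0, 4.0])

# origin=-1 shifts the length-2 window forward, so each output point
# averages a[i] with a[i+1]; mode='nearest' extends the edge value,
# so the last point is averaged with itself.
b = uniform_filter1d(a, size=2, origin=-1, mode='nearest')
# b == [0.5, 1.5, 2.5, 3.5, 4.0]
rebinned = b[::2]
# rebinned == [0.5, 2.5, 4.0]
```

Note that for a window of size 2, only origins in [-1, 0] are accepted; anything outside that range raises the "filter extent" error Russell hit.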
But I still agree with Perry that we ought to provide a built-in rebin function. It is particularly useful for large multi-dimensional arrays where it is wasteful (in both CPU and memory) to create a full-size copy of the array before resampling it down to the desired rebinned size. I appended the .copy() so that at least the big array is not still hanging around in memory (remember that the slice creates a view rather than a copy.) Rick
[SNIP]
A reasonable facsimile of this should be doable without dropping into C. Something like:

```python
def rebin_sum(a, (m, n)):
    M, N = a.shape
    a = na.reshape(a, (M/m, m, N/n, n))
    return na.sum(na.sum(a, 3), 1) / float(m*n)
```

This does create some temps, but they're smaller than in the boxcar case, and it doesn't do all the extra calculation. This doesn't handle the case where a.shape isn't an exact multiple of (m, n). However, I don't think that would be all that hard to implement, if there is a consensus on what should happen then. I can think of at least two different ways this might be done: tacking on values that match the last value, as already proposed, or tacking on zeros. There may be others as well. It should probably get a boundary-condition argument like convolve and friends.

Personally, I'd find rebin a little surprising if it resulted in an average, as all the implementations thus far have done, rather than a simple sum over the stencil. When I think of rebinning, I'm thinking of the number of occurrences per bin, and rebinning should keep the total occurrences the same, not change them by the inverse of the stencil size.

My 2.3 cents anyway,

-tim
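A modern-NumPy transcription of Tim's sketch might look like the following (hypothetical helper; the original used numarray, and Python 2 tuple parameters no longer parse). Note that, despite the name, the division by m*n makes it an average; drop the division to get the sum-over-stencil behavior Tim describes:

```python
import numpy as np

# Transcription of the reshape-and-reduce sketch above; assumes
# a.shape is an exact multiple of (m, n).
def rebin_sum(a, m, n):
    M, N = a.shape
    # Drop "/ float(m * n)" for a true sum over each stencil.
    return a.reshape(M // m, m, N // n, n).sum(axis=(1, 3)) / float(m * n)

rebin_sum(np.arange(16.0).reshape(4, 4), 2, 2)
# → [[ 2.5,  4.5], [10.5, 12.5]]
```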
At 10:56 AM -0700 2004-08-30, Tim Hochberg wrote:
I agree that it would be nice to avoid the extra calculation involved in convolution or boxcar averaging, and the extra temp storage. Your algorithm certainly looks promising, but I'm not sure there's any space saving when the array shape is not an exact multiple of the bin factor. Duplicating the last value is probably the most reasonable alternative for my own applications (imaging). To use your algorithm, I guess one has to enlarge the array first, creating a new temporary array that is the same as the original except expanded to an even multiple of the bin factor. In theory one could avoid the duplication, but I suspect that to do this efficiently one really needs to use C code.

I personally have no strong opinion on averaging vs. summing. Summing retains precision but risks overflow; averaging potentially has the opposite advantages, though avoiding overflow is tricky. Note that Nadav Horesh's suggested solution (convolution with a mask of 1s instead of boxcar averaging) computed the sum.

-- Russell
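The "expand to a multiple of the bin factor, duplicating the last value" step Russell describes can be sketched with np.pad in modern NumPy (a hypothetical rebin_mean helper, not anything from numarray):

```python
import numpy as np

def rebin_mean(a, m, n):
    """Bin 2-D array a by (m, n), averaging each block. Edge blocks
    that would fall short are filled by duplicating the last value."""
    M, N = a.shape
    # (-M) % m is the row padding needed to reach the next multiple of m.
    a = np.pad(a, ((0, (-M) % m), (0, (-N) % n)), mode='edge')
    return a.reshape(a.shape[0] // m, m, a.shape[1] // n, n).mean(axis=(1, 3))

# The 1-D example from earlier in the thread, [0,1,2,3,4] binned by 2:
rebin_mean(np.array([[0.0, 1, 2, 3, 4]]), 1, 2)
# → [[0.5, 2.5, 4.0]]
```

The padded copy is the temporary Russell mentions; avoiding it entirely would indeed take a compiled loop.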
Russell E Owen wrote:
I think you could probably do considerably better than the boxcar code, but it looks like it would get fairly messy once you start worrying about an odd number of bins. It might end up being simpler to implement it in C, so that's probably a better idea in the long run.

I really have no strong feelings, since I have no use for rebinning at the moment. Back when I did, it would have been for rebinning data from particle detectors. For instance, you would change the bin size so that you had enough data in each bin to do statistics on it, or plot it, or whatever. In that domain it would make no sense to average on rebinning. However, I can see how averaging makes sense for imaging applications. In the absence of any compelling reason to do otherwise, I imagine the thing to do is copy what everyone else is doing, as long as they're consistent. Do you know what Matlab and friends do?

-tim
On Mon, 2004-08-30 at 12:05, Tim Hochberg wrote:
In the IRAF image tools, one gets summing by setting "fluxconserve=yes". The name of this option nicely describes what an astronomer would mean by summing as opposed to averaging. Many of the images I work with are ratio images; for example, a solar contrast map. When I rebin a ratio image I want an average, not a sum. So I would have to say that I need both average and sum available, with an option to switch between them.

By the way, has anyone else read http://adsabs.harvard.edu/cgi-bin/nph-data_query?bibcode=2004SoPh..219....3D&db_key=AST&link_type=ABSTRACT&high=400734345713289 ? Craig has implemented the algorithm described therein in PDL (Perl Data Language, http://pdl.perl.org), and a Python-based implementation would be awfully nice.

Hope to see a few of you in Pasadena.

-- Stephen Walton <stephen.walton@csun.edu>
Dept. of Physics & Astronomy, Cal State Northridge
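The average-or-sum switch Stephen asks for is easy to sketch on top of the reshape trick (hypothetical helper in modern NumPy, assuming the shape divides evenly; the reducer argument stands in for something like IRAF's fluxconserve flag):

```python
import numpy as np

def rebin(a, m, n, func=np.mean):
    """Rebin 2-D array a by integer factors (m, n).
    func=np.mean averages (e.g. for ratio images);
    func=np.sum gives flux-conserving, IRAF-style summing."""
    M, N = a.shape
    return func(a.reshape(M // m, m, N // n, n), axis=(1, 3))

img = np.arange(16.0).reshape(4, 4)
rebin(img, 2, 2)               # averages: [[ 2.5,  4.5], [10.5, 12.5]]
rebin(img, 2, 2, func=np.sum)  # sums:     [[10., 18.], [42., 50.]]
```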
participants (7)
- Jin-chung Hsu
- Perry Greenfield
- Rick White
- Russell E Owen
- Russell E. Owen
- Stephen Walton
- Tim Hochberg