efficient 3d histogram creation
in my endless pursuit of perfomance, i'm searching for a quick way to create a 3d histogram from a 3d rgb image. Here is what I have so far for a (16,16,16) 3d histogram: def hist3d(imgarray): histarray = N.zeros((16, 16, 16)) temp = imgarray.copy() (i, j) = imgarray.shape[0:2] temp = (temp - temp % 16) / 16 for a in range(i): for b in range(j): (b1, b2, b3) = temp[a, b, :] histarray[b1, b2, b3] += 1 return histarray this works, but takes about 4 seconds for a 640x480 image. I tried doing the inverse of my previous post, namely replacing the nested for loop with: histarray[temp[:,:,0], temp[:,:,1], temp[:,:,2]] += 1 but that doesn't work for whatever reason. It gives me number, but they're incorrect. Any ideas? Chris
Stefan,
I'm not sure:
the docs say the input has to be:
sample : array_like
Data to histogram passed as a sequence of D arrays of length N, or
as an (N,D) array
i have an (N,M,D) array and not sure how to get it to conform to input
required, mainly because I don't understand what it's asking.
Chris
2009/5/3 Stéfan van der Walt
Hi Chris
2009/5/4 Chris Colbert
: in my endless pursuit of perfomance, i'm searching for a quick way to create a 3d histogram from a 3d rgb image.
Does histogramdd do what you want?
Regards Stéfan _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Hi Chris
2009/5/4 Chris Colbert
I'm not sure:
the docs say the input has to be: sample : array_like Data to histogram passed as a sequence of D arrays of length N, or as an (N,D) array
i have an (N,M,D) array and not sure how to get it to conform to input required, mainly because I don't understand what it's asking.
Try count, bins = np.histogramdd(x.reshape((-1,3)), bins=16) Regards Stéfan
On Sun, May 3, 2009 at 8:15 PM, Chris Colbert
in my endless pursuit of perfomance, i'm searching for a quick way to create a 3d histogram from a 3d rgb image.
Here is what I have so far for a (16,16,16) 3d histogram:
def hist3d(imgarray): histarray = N.zeros((16, 16, 16)) temp = imgarray.copy() (i, j) = imgarray.shape[0:2] temp = (temp - temp % 16) / 16 for a in range(i): for b in range(j): (b1, b2, b3) = temp[a, b, :] histarray[b1, b2, b3] += 1 return histarray
this works, but takes about 4 seconds for a 640x480 image.
I tried doing the inverse of my previous post, namely replacing the nested for loop with: histarray[temp[:,:,0], temp[:,:,1], temp[:,:,2]] += 1
but that doesn't work for whatever reason. It gives me number, but they're incorrect.
Any ideas?
I'm not sure what exactly you need, but did you look at np.histogramdd ? reading the help file, this might work numpy.histogramdd(temp[:,:,0].ravel(), temp[:,:,1].ravel(), temp[:,:,2].ravel(), bins=16) but I never used histogramdd. also looking at the source of numpy is often very instructive, lots of good tricks to find in there: np.source(np.histogramdd). Josef
this actually sort of worked. Thanks for putting me on the right track.
Here is what I ended up with.
this is what I ended up with:
def hist3d(imgarray):
histarray = N.zeros((16, 16, 16))
temp = imgarray.copy()
bins = N.arange(0, 257, 16)
histarray = N.histogramdd((temp[:,:,0].ravel(), temp[:,:,1].ravel(),
temp[:,:,2].ravel()), bins=(bins, bins, bins))[0]
return histarray
this creates a 3d histogram of rgb image values in the range 0,255 using 16
bins per component color.
on a 640x480 image, it executes in 0.3 seconds vs 4.5 seconds for a for
loop.
not quite framerate, but good enough for prototyping.
Thanks!
Chris
On Sun, May 3, 2009 at 8:36 PM,
in my endless pursuit of perfomance, i'm searching for a quick way to create a 3d histogram from a 3d rgb image.
Here is what I have so far for a (16,16,16) 3d histogram:
def hist3d(imgarray): histarray = N.zeros((16, 16, 16)) temp = imgarray.copy() (i, j) = imgarray.shape[0:2] temp = (temp - temp % 16) / 16 for a in range(i): for b in range(j): (b1, b2, b3) = temp[a, b, :] histarray[b1, b2, b3] += 1 return histarray
this works, but takes about 4 seconds for a 640x480 image.
I tried doing the inverse of my previous post, namely replacing the nested for loop with: histarray[temp[:,:,0], temp[:,:,1], temp[:,:,2]] += 1
but that doesn't work for whatever reason. It gives me number, but
On Sun, May 3, 2009 at 8:15 PM, Chris Colbert
wrote: they're incorrect.
Any ideas?
I'm not sure what exactly you need, but did you look at np.histogramdd ?
reading the help file, this might work
numpy.histogramdd(temp[:,:,0].ravel(), temp[:,:,1].ravel(), temp[:,:,2].ravel(), bins=16)
but I never used histogramdd.
also looking at the source of numpy is often very instructive, lots of good tricks to find in there: np.source(np.histogramdd).
Josef _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
On Mon, May 4, 2009 at 12:31 AM, Chris Colbert
this actually sort of worked. Thanks for putting me on the right track.
Here is what I ended up with.
this is what I ended up with:
def hist3d(imgarray): histarray = N.zeros((16, 16, 16)) temp = imgarray.copy() bins = N.arange(0, 257, 16) histarray = N.histogramdd((temp[:,:,0].ravel(), temp[:,:,1].ravel(), temp[:,:,2].ravel()), bins=(bins, bins, bins))[0] return histarray
this creates a 3d histogram of rgb image values in the range 0,255 using 16 bins per component color.
on a 640x480 image, it executes in 0.3 seconds vs 4.5 seconds for a for loop.
not quite framerate, but good enough for prototyping.
I don't think your copy to temp is necessary, and use reshape(-1,3) as in the example of Stefan, which will avoid copying the array 3 times. If you need to gain some more speed, then rewriting histogramdd and removing some of the unnecessary checks and calculations looks possible. Josef
On Mon, May 4, 2009 at 7:00 AM,
On Mon, May 4, 2009 at 12:31 AM, Chris Colbert
wrote: this actually sort of worked. Thanks for putting me on the right track.
Here is what I ended up with.
this is what I ended up with:
def hist3d(imgarray): histarray = N.zeros((16, 16, 16)) temp = imgarray.copy() bins = N.arange(0, 257, 16) histarray = N.histogramdd((temp[:,:,0].ravel(), temp[:,:,1].ravel(), temp[:,:,2].ravel()), bins=(bins, bins, bins))[0] return histarray
this creates a 3d histogram of rgb image values in the range 0,255 using 16 bins per component color.
on a 640x480 image, it executes in 0.3 seconds vs 4.5 seconds for a for loop.
not quite framerate, but good enough for prototyping.
I don't think your copy to temp is necessary, and use reshape(-1,3) as in the example of Stefan, which will avoid copying the array 3 times.
If you need to gain some more speed, then rewriting histogramdd and removing some of the unnecessary checks and calculations looks possible.
Indeed, the strategy used in the histogram function is faster than the one used in the histogramdd case, so porting one to the other should speed things up. David
Josef _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
i'll take a look at them over the next few days and see what i can hack out.
Chris
On Mon, May 4, 2009 at 3:18 PM, David Huard
On Mon, May 4, 2009 at 7:00 AM,
wrote: On Mon, May 4, 2009 at 12:31 AM, Chris Colbert
wrote: this actually sort of worked. Thanks for putting me on the right track.
Here is what I ended up with.
this is what I ended up with:
def hist3d(imgarray): histarray = N.zeros((16, 16, 16)) temp = imgarray.copy() bins = N.arange(0, 257, 16) histarray = N.histogramdd((temp[:,:,0].ravel(), temp[:,:,1].ravel(), temp[:,:,2].ravel()), bins=(bins, bins, bins))[0] return histarray
this creates a 3d histogram of rgb image values in the range 0,255 using 16 bins per component color.
on a 640x480 image, it executes in 0.3 seconds vs 4.5 seconds for a for loop.
not quite framerate, but good enough for prototyping.
I don't think your copy to temp is necessary, and use reshape(-1,3) as in the example of Stefan, which will avoid copying the array 3 times.
If you need to gain some more speed, then rewriting histogramdd and removing some of the unnecessary checks and calculations looks possible.
Indeed, the strategy used in the histogram function is faster than the one used in the histogramdd case, so porting one to the other should speed things up.
David
Josef _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
_______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
On Mon, May 4, 2009 at 4:00 PM, Chris Colbert
i'll take a look at them over the next few days and see what i can hack out.
Chris
On Mon, May 4, 2009 at 3:18 PM, David Huard
wrote: On Mon, May 4, 2009 at 7:00 AM,
wrote: On Mon, May 4, 2009 at 12:31 AM, Chris Colbert
wrote: this actually sort of worked. Thanks for putting me on the right track.
Here is what I ended up with.
this is what I ended up with:
def hist3d(imgarray): histarray = N.zeros((16, 16, 16)) temp = imgarray.copy() bins = N.arange(0, 257, 16) histarray = N.histogramdd((temp[:,:,0].ravel(), temp[:,:,1].ravel(), temp[:,:,2].ravel()), bins=(bins, bins, bins))[0] return histarray
this creates a 3d histogram of rgb image values in the range 0,255 using 16 bins per component color.
on a 640x480 image, it executes in 0.3 seconds vs 4.5 seconds for a for loop.
not quite framerate, but good enough for prototyping.
I don't think your copy to temp is necessary, and use reshape(-1,3) as in the example of Stefan, which will avoid copying the array 3 times.
If you need to gain some more speed, then rewriting histogramdd and removing some of the unnecessary checks and calculations looks possible.
Indeed, the strategy used in the histogram function is faster than the one used in the histogramdd case, so porting one to the other should speed things up.
David
is searchsorted faster than digitize and bincount ? Using the idea of histogramdd, I get a bit below a tenth of a second, my best for this problem is below. I was trying for a while what the fastest way is to convert a two dimensional array into a one dimensional index for bincount. I found that using the return index of unique1d is very slow compared to numeric index calculation. Josef example timed for: nobs = 307200 nbins = 16 factors = np.random.randint(256,size=(nobs,3)).copy() factors2 = factors.reshape(-1,480,3).copy() def hist3(factorsin, nbins): if factorsin.ndim != 2: factors = factorsin.reshape(-1,factorsin.shape[-1]) else: factors = factorsin N, D = factors.shape darr = np.empty(factors.T.shape, dtype=int) nele = np.max(factors)+1 bins = np.arange(0, nele, nele/nbins) bins[-1] += 1 for i in range(D): darr[i] = np.digitize(factors[:,i],bins) - 1 #add weighted rows darrind = darr[D-1] for i in range(D-1): darrind += darr[i]*nbins**(D-i-1) return np.bincount(darrind) # return flat not reshaped
On Mon, May 4, 2009 at 4:18 PM,
i'll take a look at them over the next few days and see what i can hack out.
Chris
On Mon, May 4, 2009 at 3:18 PM, David Huard
wrote: On Mon, May 4, 2009 at 7:00 AM,
wrote: On Mon, May 4, 2009 at 12:31 AM, Chris Colbert
wrote: this actually sort of worked. Thanks for putting me on the right
On Mon, May 4, 2009 at 4:00 PM, Chris Colbert
wrote: track. Here is what I ended up with.
this is what I ended up with:
def hist3d(imgarray): histarray = N.zeros((16, 16, 16)) temp = imgarray.copy() bins = N.arange(0, 257, 16) histarray = N.histogramdd((temp[:,:,0].ravel(), temp[:,:,1].ravel(), temp[:,:,2].ravel()), bins=(bins, bins, bins))[0] return histarray
this creates a 3d histogram of rgb image values in the range 0,255 using 16 bins per component color.
on a 640x480 image, it executes in 0.3 seconds vs 4.5 seconds for a
for
loop.
not quite framerate, but good enough for prototyping.
I don't think your copy to temp is necessary, and use reshape(-1,3) as in the example of Stefan, which will avoid copying the array 3 times.
If you need to gain some more speed, then rewriting histogramdd and removing some of the unnecessary checks and calculations looks possible.
Indeed, the strategy used in the histogram function is faster than the one used in the histogramdd case, so porting one to the other should speed things up.
David
is searchsorted faster than digitize and bincount ?
That depends on the number of bins and whether or not the bin width is uniform. A 1D benchmark I did a while ago showed that if the bin width is uniform, then the best strategy is to create a counter initialized to 0, loop through the data, compute i = (x-bin0) /binwidth and increment counter i by 1 (or by the weight of the data). If the bins are non uniform, then for nbin > 30 you'd better use searchsort, and digitize otherwise. For those interested in speeding up histogram code, I recommend reading a thread started by Cameron Walsh on the 12/12/06 named "Histograms of extremely large data sets" Code and benchmarks were posted. Chris, if your bins all have the same width, then you can certainly write an histogramdd routine that is way faster by using the indexing trick instead of digitize or searchsort. Cheers, David
Using the idea of histogramdd, I get a bit below a tenth of a second, my best for this problem is below. I was trying for a while what the fastest way is to convert a two dimensional array into a one dimensional index for bincount. I found that using the return index of unique1d is very slow compared to numeric index calculation.
Josef
example timed for: nobs = 307200 nbins = 16 factors = np.random.randint(256,size=(nobs,3)).copy() factors2 = factors.reshape(-1,480,3).copy()
def hist3(factorsin, nbins): if factorsin.ndim != 2: factors = factorsin.reshape(-1,factorsin.shape[-1]) else: factors = factorsin N, D = factors.shape darr = np.empty(factors.T.shape, dtype=int) nele = np.max(factors)+1 bins = np.arange(0, nele, nele/nbins) bins[-1] += 1 for i in range(D): darr[i] = np.digitize(factors[:,i],bins) - 1
#add weighted rows darrind = darr[D-1] for i in range(D-1): darrind += darr[i]*nbins**(D-i-1) return np.bincount(darrind) # return flat not reshaped
_______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
I decided to hold myself over until being able to take a hard look at the
numpy histogramdd code:
Here is a quick thing a put together in cython. It's a 40x speedup over
histogramdd on Vista 32 using the minGW32 compiler. For a (480, 630, 3)
array, this executed in 0.005 seconds on my machine.
This only works for arrays with uint8 data types having dimensions (x, y, 3)
(common image format). The return array is a (16, 16, 16) equal width bin
histogram of the input.
If anyone wants the cython C-output, let me know and I will email it to you.
If there is interest, I will extend this for different size bins and aliases
for different data types.
Chris
import numpy as np
cimport numpy as np
DTYPE = np.uint8
DTYPE32 = np.int
ctypedef np.uint8_t DTYPE_t
ctypedef np.int_t DTYPE_t32
def hist3d(np.ndarray[DTYPE_t, ndim=3] img):
cdef int x = img.shape[0]
cdef int y = img.shape[1]
cdef int z = img.shape[2]
cdef int addx
cdef int addy
cdef int addz
cdef np.ndarray[DTYPE_t32, ndim=3] out = np.zeros([16, 16, 16],
dtype=DTYPE32)
cdef int i, j, v0, v1, v2
for i in range(x):
for j in range(y):
v0 = img[i, j, 0]
v1 = img[i, j, 1]
v2 = img[i, j, 2]
addx = (v0 - (v0 % 16)) / 16
addy = (v1 - (v1 % 16)) / 16
addz = (v2 - (v2 % 16)) / 16
out[addx, addy, addz] += 1
return out
On Tue, May 5, 2009 at 9:46 AM, David Huard
On Mon, May 4, 2009 at 4:18 PM,
wrote: i'll take a look at them over the next few days and see what i can hack out.
Chris
On Mon, May 4, 2009 at 3:18 PM, David Huard
wrote: On Mon, May 4, 2009 at 7:00 AM,
wrote: On Mon, May 4, 2009 at 12:31 AM, Chris Colbert
wrote: this actually sort of worked. Thanks for putting me on the right
On Mon, May 4, 2009 at 4:00 PM, Chris Colbert
wrote: track. Here is what I ended up with.
this is what I ended up with:
def hist3d(imgarray): histarray = N.zeros((16, 16, 16)) temp = imgarray.copy() bins = N.arange(0, 257, 16) histarray = N.histogramdd((temp[:,:,0].ravel(), temp[:,:,1].ravel(), temp[:,:,2].ravel()), bins=(bins, bins, bins))[0] return histarray
this creates a 3d histogram of rgb image values in the range 0,255 using 16 bins per component color.
on a 640x480 image, it executes in 0.3 seconds vs 4.5 seconds for a
for
loop.
not quite framerate, but good enough for prototyping.
I don't think your copy to temp is necessary, and use reshape(-1,3) as in the example of Stefan, which will avoid copying the array 3 times.
If you need to gain some more speed, then rewriting histogramdd and removing some of the unnecessary checks and calculations looks possible.
Indeed, the strategy used in the histogram function is faster than the one used in the histogramdd case, so porting one to the other should speed things up.
David
is searchsorted faster than digitize and bincount ?
That depends on the number of bins and whether or not the bin width is uniform. A 1D benchmark I did a while ago showed that if the bin width is uniform, then the best strategy is to create a counter initialized to 0, loop through the data, compute i = (x-bin0) /binwidth and increment counter i by 1 (or by the weight of the data). If the bins are non uniform, then for nbin > 30 you'd better use searchsort, and digitize otherwise.
For those interested in speeding up histogram code, I recommend reading a thread started by Cameron Walsh on the 12/12/06 named "Histograms of extremely large data sets" Code and benchmarks were posted.
Chris, if your bins all have the same width, then you can certainly write an histogramdd routine that is way faster by using the indexing trick instead of digitize or searchsort.
Cheers,
David
Using the idea of histogramdd, I get a bit below a tenth of a second, my best for this problem is below. I was trying for a while what the fastest way is to convert a two dimensional array into a one dimensional index for bincount. I found that using the return index of unique1d is very slow compared to numeric index calculation.
Josef
example timed for: nobs = 307200 nbins = 16 factors = np.random.randint(256,size=(nobs,3)).copy() factors2 = factors.reshape(-1,480,3).copy()
def hist3(factorsin, nbins): if factorsin.ndim != 2: factors = factorsin.reshape(-1,factorsin.shape[-1]) else: factors = factorsin N, D = factors.shape darr = np.empty(factors.T.shape, dtype=int) nele = np.max(factors)+1 bins = np.arange(0, nele, nele/nbins) bins[-1] += 1 for i in range(D): darr[i] = np.digitize(factors[:,i],bins) - 1
#add weighted rows darrind = darr[D-1] for i in range(D-1): darrind += darr[i]*nbins**(D-i-1) return np.bincount(darrind) # return flat not reshaped
_______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
_______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
On Wed, May 6, 2009 at 6:06 PM, Chris Colbert
I decided to hold myself over until being able to take a hard look at the numpy histogramdd code:
Here is a quick thing a put together in cython. It's a 40x speedup over histogramdd on Vista 32 using the minGW32 compiler. For a (480, 630, 3) array, this executed in 0.005 seconds on my machine.
This only works for arrays with uint8 data types having dimensions (x, y, 3) (common image format). The return array is a (16, 16, 16) equal width bin histogram of the input.
If anyone wants the cython C-output, let me know and I will email it to you.
If there is interest, I will extend this for different size bins and aliases for different data types.
Chris
import numpy as np
cimport numpy as np
DTYPE = np.uint8 DTYPE32 = np.int
ctypedef np.uint8_t DTYPE_t ctypedef np.int_t DTYPE_t32
def hist3d(np.ndarray[DTYPE_t, ndim=3] img): cdef int x = img.shape[0] cdef int y = img.shape[1] cdef int z = img.shape[2] cdef int addx cdef int addy cdef int addz cdef np.ndarray[DTYPE_t32, ndim=3] out = np.zeros([16, 16, 16], dtype=DTYPE32) cdef int i, j, v0, v1, v2
for i in range(x): for j in range(y): v0 = img[i, j, 0] v1 = img[i, j, 1] v2 = img[i, j, 2] addx = (v0 - (v0 % 16)) / 16 addy = (v1 - (v1 % 16)) / 16 addz = (v2 - (v2 % 16)) / 16 out[addx, addy, addz] += 1
return out
Thanks for the example for using cython. Once I figure out what the types are, cython will look very convenient for loops, and pyximport takes care of the compiler. Josef import pyximport; pyximport.install() import hist_rgb #name of .pyx files import numpy as np factors = np.random.randint(256,size=(480, 630, 3)) h = hist_rgb.hist3d(factors.astype(np.uint8)) print h[:,:,0]
i just realized I don't need the line:
cdef int z = img.shape(2)
it's left over from tinkering. sorry. And i should probably convert the out
array to type float to handle large data sets.
Chris
On Wed, May 6, 2009 at 7:30 PM,
I decided to hold myself over until being able to take a hard look at the numpy histogramdd code:
Here is a quick thing a put together in cython. It's a 40x speedup over histogramdd on Vista 32 using the minGW32 compiler. For a (480, 630, 3) array, this executed in 0.005 seconds on my machine.
This only works for arrays with uint8 data types having dimensions (x, y,
On Wed, May 6, 2009 at 6:06 PM, Chris Colbert
wrote: 3) (common image format). The return array is a (16, 16, 16) equal width bin histogram of the input.
If anyone wants the cython C-output, let me know and I will email it to you.
If there is interest, I will extend this for different size bins and aliases for different data types.
Chris
import numpy as np
cimport numpy as np
DTYPE = np.uint8 DTYPE32 = np.int
ctypedef np.uint8_t DTYPE_t ctypedef np.int_t DTYPE_t32
def hist3d(np.ndarray[DTYPE_t, ndim=3] img): cdef int x = img.shape[0] cdef int y = img.shape[1] cdef int z = img.shape[2] cdef int addx cdef int addy cdef int addz cdef np.ndarray[DTYPE_t32, ndim=3] out = np.zeros([16, 16, 16], dtype=DTYPE32) cdef int i, j, v0, v1, v2
for i in range(x): for j in range(y): v0 = img[i, j, 0] v1 = img[i, j, 1] v2 = img[i, j, 2] addx = (v0 - (v0 % 16)) / 16 addy = (v1 - (v1 % 16)) / 16 addz = (v2 - (v2 % 16)) / 16 out[addx, addy, addz] += 1
return out
Thanks for the example for using cython. Once I figure out what the types are, cython will look very convenient for loops, and pyximport takes care of the compiler.
Josef
import pyximport; pyximport.install() import hist_rgb #name of .pyx files
import numpy as np factors = np.random.randint(256,size=(480, 630, 3)) h = hist_rgb.hist3d(factors.astype(np.uint8)) print h[:,:,0] _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
On Wed, May 6, 2009 at 7:39 PM, Chris Colbert
i just realized I don't need the line:
cdef int z = img.shape(2)
it's left over from tinkering. sorry. And i should probably convert the out array to type float to handle large data sets.
Chris
On Wed, May 6, 2009 at 7:30 PM,
wrote: On Wed, May 6, 2009 at 6:06 PM, Chris Colbert
wrote: I decided to hold myself over until being able to take a hard look at the numpy histogramdd code:
Here is a quick thing a put together in cython. It's a 40x speedup over histogramdd on Vista 32 using the minGW32 compiler. For a (480, 630, 3) array, this executed in 0.005 seconds on my machine.
This only works for arrays with uint8 data types having dimensions (x, y, 3) (common image format). The return array is a (16, 16, 16) equal width bin histogram of the input.
If anyone wants the cython C-output, let me know and I will email it to you.
If there is interest, I will extend this for different size bins and aliases for different data types.
Chris
import numpy as np
cimport numpy as np
DTYPE = np.uint8 DTYPE32 = np.int
ctypedef np.uint8_t DTYPE_t ctypedef np.int_t DTYPE_t32
def hist3d(np.ndarray[DTYPE_t, ndim=3] img): cdef int x = img.shape[0] cdef int y = img.shape[1] cdef int z = img.shape[2] cdef int addx cdef int addy cdef int addz cdef np.ndarray[DTYPE_t32, ndim=3] out = np.zeros([16, 16, 16], dtype=DTYPE32) cdef int i, j, v0, v1, v2
for i in range(x): for j in range(y): v0 = img[i, j, 0] v1 = img[i, j, 1] v2 = img[i, j, 2] addx = (v0 - (v0 % 16)) / 16 addy = (v1 - (v1 % 16)) / 16 addz = (v2 - (v2 % 16)) / 16 out[addx, addy, addz] += 1
return out
Thanks for the example for using cython. Once I figure out what the types are, cython will look very convenient for loops, and pyximport takes care of the compiler.
Josef
import pyximport; pyximport.install() import hist_rgb #name of .pyx files
import numpy as np factors = np.random.randint(256,size=(480, 630, 3)) h = hist_rgb.hist3d(factors.astype(np.uint8)) print h[:,:,0]
playing some more with cython: here is a baby on the fly code generator input type int, output type float64 a dispatch function by type is missing no segfaults, even though most of the time a call the function with the wrong type. Josef code = ''' import numpy as np cimport numpy as np __all__ = ["hist3d"] DTYPE = ${imgtype} DTYPE32 = ${outtype} ctypedef ${imgtype}_t DTYPE_t ctypedef ${outtype}_t DTYPE_t32 def hist3d(np.ndarray[DTYPE_t, ndim=3] img): cdef int x = img.shape[0] cdef int y = img.shape[1] #cdef int z = img.shape[2] cdef int addx cdef int addy cdef int addz cdef np.ndarray[DTYPE_t32, ndim=3] out = np.zeros([16, 16, 16], dtype=DTYPE32) cdef int i, j, v0, v1, v2 for i in range(x): for j in range(y): v0 = img[i, j, 0] v1 = img[i, j, 1] v2 = img[i, j, 2] addx = (v0 - (v0 % 16)) / 16 addy = (v1 - (v1 % 16)) / 16 addz = (v2 - (v2 % 16)) / 16 out[addx, addy, addz] += 1 return out ''' from string import Template s = Template(code) src = s.substitute({'imgtype': 'np.int', 'outtype': 'np.float64'}) open('histrgbintfl2.pyx','w').write(src) import pyximport; pyximport.install() import histrgbintfl2 import numpy as np factors = np.random.randint(256,size=(480, 630, 3)) h = histrgbintfl2.hist3d(factors)#.astype(np.uint8)) print h[:,:,0]
nice!
This was really my first attempt at doing anything constructive with Cython.
It was actually unbelievably easy to work with. I think i spent less time
working on this, than I did trying to find an optimized solution using pure
numpy and python.
Chris
On Wed, May 6, 2009 at 8:21 PM,
i just realized I don't need the line:
cdef int z = img.shape(2)
it's left over from tinkering. sorry. And i should probably convert the out array to type float to handle large data sets.
Chris
On Wed, May 6, 2009 at 7:30 PM,
wrote: On Wed, May 6, 2009 at 6:06 PM, Chris Colbert
wrote:
I decided to hold myself over until being able to take a hard look at the numpy histogramdd code:
Here is a quick thing a put together in cython. It's a 40x speedup over histogramdd on Vista 32 using the minGW32 compiler. For a (480, 630,
On Wed, May 6, 2009 at 7:39 PM, Chris Colbert
wrote: 3) array, this executed in 0.005 seconds on my machine.
This only works for arrays with uint8 data types having dimensions (x, y, 3) (common image format). The return array is a (16, 16, 16) equal width bin histogram of the input.
If anyone wants the cython C-output, let me know and I will email it to you.
If there is interest, I will extend this for different size bins and aliases for different data types.
Chris
import numpy as np
cimport numpy as np
DTYPE = np.uint8 DTYPE32 = np.int
ctypedef np.uint8_t DTYPE_t ctypedef np.int_t DTYPE_t32
def hist3d(np.ndarray[DTYPE_t, ndim=3] img): cdef int x = img.shape[0] cdef int y = img.shape[1] cdef int z = img.shape[2] cdef int addx cdef int addy cdef int addz cdef np.ndarray[DTYPE_t32, ndim=3] out = np.zeros([16, 16, 16], dtype=DTYPE32) cdef int i, j, v0, v1, v2
for i in range(x): for j in range(y): v0 = img[i, j, 0] v1 = img[i, j, 1] v2 = img[i, j, 2] addx = (v0 - (v0 % 16)) / 16 addy = (v1 - (v1 % 16)) / 16 addz = (v2 - (v2 % 16)) / 16 out[addx, addy, addz] += 1
return out
Thanks for the example for using cython. Once I figure out what the types are, cython will look very convenient for loops, and pyximport takes care of the compiler.
Josef
import pyximport; pyximport.install() import hist_rgb #name of .pyx files
import numpy as np factors = np.random.randint(256,size=(480, 630, 3)) h = hist_rgb.hist3d(factors.astype(np.uint8)) print h[:,:,0]
playing some more with cython: here is a baby on the fly code generator input type int, output type float64
a dispatch function by type is missing
no segfaults, even though most of the time a call the function with the wrong type.
Josef
code = ''' import numpy as np cimport numpy as np
__all__ = ["hist3d"] DTYPE = ${imgtype} DTYPE32 = ${outtype}
ctypedef ${imgtype}_t DTYPE_t ctypedef ${outtype}_t DTYPE_t32
def hist3d(np.ndarray[DTYPE_t, ndim=3] img): cdef int x = img.shape[0] cdef int y = img.shape[1] #cdef int z = img.shape[2] cdef int addx cdef int addy cdef int addz cdef np.ndarray[DTYPE_t32, ndim=3] out = np.zeros([16, 16, 16], dtype=DTYPE32) cdef int i, j, v0, v1, v2
for i in range(x): for j in range(y): v0 = img[i, j, 0] v1 = img[i, j, 1] v2 = img[i, j, 2] addx = (v0 - (v0 % 16)) / 16 addy = (v1 - (v1 % 16)) / 16 addz = (v2 - (v2 % 16)) / 16 out[addx, addy, addz] += 1
return out '''
from string import Template s = Template(code) src = s.substitute({'imgtype': 'np.int', 'outtype': 'np.float64'}) open('histrgbintfl2.pyx','w').write(src)
import pyximport; pyximport.install() import histrgbintfl2
import numpy as np factors = np.random.randint(256,size=(480, 630, 3)) h = histrgbintfl2.hist3d(factors)#.astype(np.uint8)) print h[:,:,0] _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
2009/5/7 Chris Colbert
This was really my first attempt at doing anything constructive with Cython. It was actually unbelievably easy to work with. I think i spent less time working on this, than I did trying to find an optimized solution using pure numpy and python.
One aspect we often overlook is how easy it is to write a for-loop in comparison to vectorisation. Besides, for-loops are sometimes easier to read as well! I think the Cython guys are planning some sort of templating, but I'll CC Dag so that he can tell us more. Regards Stéfan
Stéfan van der Walt wrote:
2009/5/7 Chris Colbert
: This was really my first attempt at doing anything constructive with Cython. It was actually unbelievably easy to work with. I think i spent less time working on this, than I did trying to find an optimized solution using pure numpy and python.
One aspect we often overlook is how easy it is to write a for-loop in comparison to vectorisation. Besides, for-loops are sometimes easier to read as well!
I think the Cython guys are planning some sort of templating, but I'll CC Dag so that he can tell us more.
We were discussing how it would/should look like, but noone's committed to implementing it so it's pretty much up in the blue I think -- someone might jump in and do it next week, or it might go another year, I can't tell. While I'm here, also note in that code Chris wrote that you want to pay attention to the change of default division semantics on Cython 0.12 (especially for speed). http://wiki.cython.org/enhancements/division -- Dag Sverre
Dag Sverre Seljebotn wrote:
Stéfan van der Walt wrote:
2009/5/7 Chris Colbert
: This was really my first attempt at doing anything constructive with Cython. It was actually unbelievably easy to work with. I think i spent less time working on this, than I did trying to find an optimized solution using pure numpy and python. One aspect we often overlook is how easy it is to write a for-loop in comparison to vectorisation. Besides, for-loops are sometimes easier to read as well!
I think the Cython guys are planning some sort of templating, but I'll CC Dag so that he can tell us more.
We were discussing how it would/should look like, but noone's committed to implementing it so it's pretty much up in the blue I think -- someone might jump in and do it next week, or it might go another year, I can't tell.
BTW the consensus pretty much ended on: cdef class MyClass[T](Ancestor): cdef T evaluate(T x): ... And then instantiate with cdef MyClass[int] obj = MyClass[int]() ... Only class templates would be targeted at first. -- Dag Sverre
2009/5/7 Dag Sverre Seljebotn
2009/5/7 Chris Colbert
: This was really my first attempt at doing anything constructive with Cython. It was actually unbelievably easy to work with. I think i spent less time working on this, than I did trying to find an optimized solution using
Stéfan van der Walt wrote: pure
numpy and python.
One aspect we often overlook is how easy it is to write a for-loop in comparison to vectorisation. Besides, for-loops are sometimes easier to read as well!
I think the Cython guys are planning some sort of templating, but I'll CC Dag so that he can tell us more.
We were discussing how it would/should look like, but noone's committed to implementing it so it's pretty much up in the blue I think -- someone might jump in and do it next week, or it might go another year, I can't tell.
While I'm here, also note in that code Chris wrote that you want to pay attention to the change of default division semantics on Cython 0.12 (especially for speed).
Hi Dag, Numpy can now do separate compilations with controlled export of symbols when the object files are linked together to make a module. Does Cython have anyway of controlling the visibility of symbols or should we just include the right files in Numpy to get the needed macros? Chuck
after looking at it for a while, I don't see a way to easily speed it up using pure numpy. As a matter of fact, the behavior shown below is a little confusing. Using fancy indexing, multiples of the same index are interpreted as a single call to that index, probably this a for a reason that I dont currently understand. I would think multiple calls to the same index would cause multiple increments in the example below. For the life of me, I can't think of how to do this 3d histogram in numpy without a for loop. Chris ########## example code #############
a = np.arange(0,8,1).reshape((2,2,2))
a
array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]])
indx = np.array([[0,0,0],[0,0,1],[0,1,0],[0,1,1],[1,0,0],[1,0,1],[1,1,0],[1,1,1]]) indx = indx.reshape(4,2,3)
a[indx[:,:,0], indx[:,:,1], indx[:,:,2]]+=1
a
array([[[2, 3], [4, 5]], [[6, 7], [8, 9]]])
indx2 = np.zeros((4,2,3)).astype(np.uint8)
a[indx2[:,:,0], indx2[:,:,1], indx2[:,:,2]]+=1
a
array([[[3, 3], [4, 5]], [[6, 7], [8, 9]]])
On Thu, May 7, 2009 at 5:35 PM, Charles R Harris
2009/5/7 Dag Sverre Seljebotn
2009/5/7 Chris Colbert
: This was really my first attempt at doing anything constructive with Cython. It was actually unbelievably easy to work with. I think i spent less time working on this, than I did trying to find an optimized solution using
Stéfan van der Walt wrote: pure
numpy and python.
One aspect we often overlook is how easy it is to write a for-loop in comparison to vectorisation. Besides, for-loops are sometimes easier to read as well!
I think the Cython guys are planning some sort of templating, but I'll CC Dag so that he can tell us more.
We were discussing how it would/should look like, but noone's committed to implementing it so it's pretty much up in the blue I think -- someone might jump in and do it next week, or it might go another year, I can't tell.
While I'm here, also note in that code Chris wrote that you want to pay attention to the change of default division semantics on Cython 0.12 (especially for speed).
Hi Dag,
Numpy can now do separate compilations with controlled export of symbols when the object files are linked together to make a module. Does Cython have anyway of controlling the visibility of symbols or should we just include the right files in Numpy to get the needed macros?
Chuck
_______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Charles R Harris wrote:
Hi Dag,
Numpy can now do separate compilations with controlled export of symbols when the object files are linked together to make a module. Does Cython have anyway of controlling the visibility of symbols or should we just include the right files in Numpy to get the needed macros?
I'll try an answer but in general it's better to ask these kind of questions on the Cython list; I'm not an expert on this part of Cython. If you refer to functions you create in Cython, they are static by default and not exported. If you declare them "public" then they are not made static. Finally, if you declare them "api" then they will be made static but a symbol table for the module in which they can be look up is exported (as a Python variable in the module; __pyx_c_api). Dag Sverre
participants (6)
-
Charles R Harris
-
Chris Colbert
-
Dag Sverre Seljebotn
-
David Huard
-
josef.pktd@gmail.com
-
Stéfan van der Walt