Hi everyone, I was hoping this list could point me in the direction of a more efficient solution to a problem I have. I have 4 vectors: x, y, z, and t that are about 1 million in length that describe the positions of photons. As my simulation progresses it updates the positions so x, y, z, and t change by an unknown (and unknowable) amount every update. This worked very well for its original purpose but now I need to calculate the photon density change over time. Currently after each update, I iterate over time slices, x slices, and y slices and then make an histogram of z which I then stitch together to create a density. However, this becomes very slow as the photons spread out in space and time. Does anyone know how to take such a large vector set and return a density efficiently?
Hi Chris! you could try a grid of unit cells that cover your phase space (x,y,z,t). Count the number of photons per unit cell of your initial configuration and track photons leaving and entering a particular cell. A dictionary with a tuple of x,y,z,t coordinates obtained from integer division of the x,y,z,t coordinates could serve as keys. Example for 2-D: from numpy import * # phase space in x,y x = arange(-100,100.1,.1) y = arange(-100,100.1,.1) # cell dimension in both dimensions the same GRID_WIDTH=7.5 # computes the grid key from x,y coordinates def gridKey(x,y): '''return the a tuple of x,y integer divided by GRID_WIDHT''' return (int(x // GRID_WIDTH), int(y // GRID_WIDTH)) # setup your grid dictionary gridLowX, gridHighX = gridKey(min(x), max(x)) gridLowY, gridHighY = gridKey(min(y), max(y)) keys = [(i,j) for i in xrange(gridLowX, gridHighX + 1) \ for j in xrange(gridLowY, gridHighY + 1)] grid = dict().fromkeys(keys, 0) # random photons photons = random.uniform(-100.,100., (100000,2)) # count photons in each grid cell for p in photons: grid[gridKey(*p)] += 1 ######################################### # in your simulation you have to keep track of where your photons # are going to... # (the code below won't run, it's just an example) ######################################### oldKey = gridKey(photon) propagate(photon) # changes x,y coordinates of photon newKey = gridKey(photon) if oldKey != newKey: grid[oldKey] -= 1 grid[newKey] += 1 I hope this helps! Bernhard On 6/15/07, Chris Lee <c.j.lee@tnw.utwente.nl> wrote:
Hi everyone,
I was hoping this list could point me in the direction of a more efficient solution to a problem I have.
I have 4 vectors: x, y, z, and t that are about 1 million in length that describe the positions of photons. As my simulation progresses it updates the positions so x, y, z, and t change by an unknown (and unknowable) amount every update.
This worked very well for its original purpose but now I need to calculate the photon density change over time. Currently after each update, I iterate over time slices, x slices, and y slices and then make an histogram of z which I then stitch together to create a density. However, this becomes very slow as the photons spread out in space and time.
Does anyone know how to take such a large vector set and return a density efficiently?
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user
Hi Chris, Have you tried numpy.histogramdd ? If its still too slow, I have a fortran implementation on the back burner. I could try to finish it quickly and send you a preliminary version. Other thought: the kernel density estimator scipy.stats.gaussian_kde David 2007/6/17, Bernhard Voigt <bernhard.voigt@gmail.com>:
Hi Chris!
you could try a grid of unit cells that cover your phase space (x,y,z,t). Count the number of photons per unit cell of your initial configuration and track photons leaving and entering a particular cell. A dictionary with a tuple of x,y,z,t coordinates obtained from integer division of the x,y,z,t coordinates could serve as keys.
Example for 2-D:
from numpy import * # phase space in x,y x = arange(-100,100.1,.1) y = arange(-100,100.1,.1) # cell dimension in both dimensions the same GRID_WIDTH=7.5
# computes the grid key from x,y coordinates def gridKey(x,y): '''return the a tuple of x,y integer divided by GRID_WIDHT''' return (int(x // GRID_WIDTH), int(y // GRID_WIDTH))
# setup your grid dictionary gridLowX, gridHighX = gridKey(min(x), max(x)) gridLowY, gridHighY = gridKey(min(y), max(y)) keys = [(i,j) for i in xrange(gridLowX, gridHighX + 1) \ for j in xrange(gridLowY, gridHighY + 1)] grid = dict().fromkeys(keys, 0)
# random photons photons = random.uniform(-100.,100., (100000,2))
# count photons in each grid cell for p in photons: grid[gridKey(*p)] += 1
######################################### # in your simulation you have to keep track of where your photons # are going to... # (the code below won't run, it's just an example) ######################################### oldKey = gridKey(photon) propagate(photon) # changes x,y coordinates of photon newKey = gridKey(photon) if oldKey != newKey: grid[oldKey] -= 1 grid[newKey] += 1
I hope this helps! Bernhard
On 6/15/07, Chris Lee < c.j.lee@tnw.utwente.nl> wrote:
Hi everyone,
I was hoping this list could point me in the direction of a more efficient solution to a problem I have.
I have 4 vectors: x, y, z, and t that are about 1 million in length that describe the positions of photons. As my simulation progresses it updates the positions so x, y, z, and t change by an unknown (and unknowable) amount every update.
This worked very well for its original purpose but now I need to calculate the photon density change over time. Currently after each update, I iterate over time slices, x slices, and y slices and then make an histogram of z which I then stitch together to create a density. However, this becomes very slow as the photons spread out in space and time.
Does anyone know how to take such a large vector set and return a density efficiently?
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user
Thank you both for your reply. Sorry for my lack of response I was at a conference and the webmail interface only gave me read only access :( histogramdd looks like it might do it. I had thought about consecutive 1D histograms but realised that it would not work. I had thought that I couldn't use a fixed grid because I require a large total volume and also to be able to resolve small distances. However, the David's idea looks pretty good so I will try that as well. Thank you both for your help. Cheers Chris David Huard wrote:
Hi Chris,
Have you tried numpy.histogramdd ? If its still too slow, I have a fortran implementation on the back burner. I could try to finish it quickly and send you a preliminary version.
Other thought: the kernel density estimator scipy.stats.gaussian_kde
David
2007/6/17, Bernhard Voigt <bernhard.voigt@gmail.com <mailto:bernhard.voigt@gmail.com>>:
Hi Chris!
you could try a grid of unit cells that cover your phase space (x,y,z,t). Count the number of photons per unit cell of your initial configuration and track photons leaving and entering a particular cell. A dictionary with a tuple of x,y,z,t coordinates obtained from integer division of the x,y,z,t coordinates could serve as keys.
Example for 2-D:
from numpy import * # phase space in x,y x = arange(-100,100.1,.1) y = arange(-100,100.1,.1) # cell dimension in both dimensions the same GRID_WIDTH=7.5
# computes the grid key from x,y coordinates def gridKey(x,y): '''return the a tuple of x,y integer divided by GRID_WIDHT''' return (int(x // GRID_WIDTH), int(y // GRID_WIDTH))
# setup your grid dictionary gridLowX, gridHighX = gridKey(min(x), max(x)) gridLowY, gridHighY = gridKey(min(y), max(y)) keys = [(i,j) for i in xrange(gridLowX, gridHighX + 1) \ for j in xrange(gridLowY, gridHighY + 1)] grid = dict().fromkeys(keys, 0)
# random photons photons = random.uniform(-100.,100., (100000,2))
# count photons in each grid cell for p in photons: grid[gridKey(*p)] += 1
######################################### # in your simulation you have to keep track of where your photons # are going to... # (the code below won't run, it's just an example) ######################################### oldKey = gridKey(photon) propagate(photon) # changes x,y coordinates of photon newKey = gridKey(photon) if oldKey != newKey: grid[oldKey] -= 1 grid[newKey] += 1
I hope this helps! Bernhard
On 6/15/07, * Chris Lee* < c.j.lee@tnw.utwente.nl <mailto:c.j.lee@tnw.utwente.nl>> wrote:
Hi everyone,
I was hoping this list could point me in the direction of a more efficient solution to a problem I have.
I have 4 vectors: x, y, z, and t that are about 1 million in length that describe the positions of photons. As my simulation progresses it updates the positions so x, y, z, and t change by an unknown (and unknowable) amount every update.
This worked very well for its original purpose but now I need to calculate the photon density change over time. Currently after each update, I iterate over time slices, x slices, and y slices and then make an histogram of z which I then stitch together to create a density. However, this becomes very slow as the photons spread out in space and time.
Does anyone know how to take such a large vector set and return a density efficiently?
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org <mailto:SciPy-user@scipy.org> http://projects.scipy.org/mailman/listinfo/scipy-user
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org <mailto:SciPy-user@scipy.org> http://projects.scipy.org/mailman/listinfo/scipy-user
------------------------------------------------------------------------
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user
Hi David, histrogramdd does exactly what I want and seems to be fast enough. However, I think it may have an irregular bug if I have some data with a shape (x,4) where x varies from call to call and I specify 4 bin numbers bins=(binx, biny, binz, bint) and then call hist, edges = histrogramdd(data, bins=(binx, biny, binz, bint))) then most of the time hist will have the shape (binx, biny, binz, bint) but sometimes it will return a different shape e.g., (bint, binz, biny, binx) and the edges array order will be different again e.g., (x, y, t, z). Here is the chunk of code that generates this (with appropriate an appropriate non null data of course) densityHistogram, edges = numpy.histogramdd(data, bins=(xBins,yBins,zBins,tBins)) print densityHistogram.shape SHGPhotons = n.asarray(densityHistogram*0.01, n.int32) totalPhotons = SHGPhotons.sum() print SHGPhotons.shape if totalPhotons>0: idxSHG = n.nonzero(SHGPhotons) #print idxSHG xEdges = edges[0] yEdges = edges[1] zEdges = edges[2] tEdges = edges[3] #print zEdges print xEdges.shape, yEdges.shape, zEdges.shape, tEdges.shape Here is a typical (non error generating output) 20 20 1 1 <- number of bins in order (20, 20, 1, 1) <- shape of histogram (20, 20, 1, 1) <- shape of an array generated from the histrogram (21,) (21,) (2,) (2,) <- shape of the inidividual edge arrays in order here is the version that puts out an error 21 20 2 3 <- bins in order (3, 2, 21, 20) <- histogram is backwards (3, 2, 21, 20) <- as is the array generated from it (22,) (21,) (3,) (4,) <- edge arrays are in order though So, is this an error, or am I assuming too much about how histogramdd operates? I will start checking the order in the program, but this will never be perfect since the number of bins could be the same for multiple axis. Thanks for any advice Cheers Chris David Huard wrote:
Hi Chris,
Have you tried numpy.histogramdd ? If its still too slow, I have a fortran implementation on the back burner. I could try to finish it quickly and send you a preliminary version.
Other thought: the kernel density estimator scipy.stats.gaussian_kde
David
2007/6/17, Bernhard Voigt <bernhard.voigt@gmail.com <mailto:bernhard.voigt@gmail.com>>:
Hi Chris!
you could try a grid of unit cells that cover your phase space (x,y,z,t). Count the number of photons per unit cell of your initial configuration and track photons leaving and entering a particular cell. A dictionary with a tuple of x,y,z,t coordinates obtained from integer division of the x,y,z,t coordinates could serve as keys.
Example for 2-D:
from numpy import * # phase space in x,y x = arange(-100,100.1,.1) y = arange(-100,100.1,.1) # cell dimension in both dimensions the same GRID_WIDTH=7.5
# computes the grid key from x,y coordinates def gridKey(x,y): '''return the a tuple of x,y integer divided by GRID_WIDHT''' return (int(x // GRID_WIDTH), int(y // GRID_WIDTH))
# setup your grid dictionary gridLowX, gridHighX = gridKey(min(x), max(x)) gridLowY, gridHighY = gridKey(min(y), max(y)) keys = [(i,j) for i in xrange(gridLowX, gridHighX + 1) \ for j in xrange(gridLowY, gridHighY + 1)] grid = dict().fromkeys(keys, 0)
# random photons photons = random.uniform(-100.,100., (100000,2))
# count photons in each grid cell for p in photons: grid[gridKey(*p)] += 1
######################################### # in your simulation you have to keep track of where your photons # are going to... # (the code below won't run, it's just an example) ######################################### oldKey = gridKey(photon) propagate(photon) # changes x,y coordinates of photon newKey = gridKey(photon) if oldKey != newKey: grid[oldKey] -= 1 grid[newKey] += 1
I hope this helps! Bernhard
On 6/15/07, * Chris Lee* < c.j.lee@tnw.utwente.nl <mailto:c.j.lee@tnw.utwente.nl>> wrote:
Hi everyone,
I was hoping this list could point me in the direction of a more efficient solution to a problem I have.
I have 4 vectors: x, y, z, and t that are about 1 million in length that describe the positions of photons. As my simulation progresses it updates the positions so x, y, z, and t change by an unknown (and unknowable) amount every update.
This worked very well for its original purpose but now I need to calculate the photon density change over time. Currently after each update, I iterate over time slices, x slices, and y slices and then make an histogram of z which I then stitch together to create a density. However, this becomes very slow as the photons spread out in space and time.
Does anyone know how to take such a large vector set and return a density efficiently?
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org <mailto:SciPy-user@scipy.org> http://projects.scipy.org/mailman/listinfo/scipy-user
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org <mailto:SciPy-user@scipy.org> http://projects.scipy.org/mailman/listinfo/scipy-user
------------------------------------------------------------------------
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user
Hi Chris, It's a bug alright, but I believe it has been fixed. What version of numpy are you using? If you're not using the latest release, please try it out and see if it still bugs. The bug had to do with the reordering of the flattened array into a N-D array, which is bin length dependent. Cheers, David 2007/6/28, Chris Lee <c.j.lee@tnw.utwente.nl>:
Hi David,
histrogramdd does exactly what I want and seems to be fast enough. However, I think it may have an irregular bug
if I have some data with a shape (x,4) where x varies from call to call and I specify 4 bin numbers bins=(binx, biny, binz, bint) and then call
hist, edges = histrogramdd(data, bins=(binx, biny, binz, bint)))
then most of the time hist will have the shape (binx, biny, binz, bint) but sometimes it will return a different shape e.g., (bint, binz, biny, binx) and the edges array order will be different again e.g., (x, y, t, z).
Here is the chunk of code that generates this (with appropriate an appropriate non null data of course)
densityHistogram, edges = numpy.histogramdd(data, bins=(xBins,yBins,zBins,tBins)) print densityHistogram.shape SHGPhotons = n.asarray(densityHistogram*0.01, n.int32) totalPhotons = SHGPhotons.sum() print SHGPhotons.shape if totalPhotons>0: idxSHG = n.nonzero(SHGPhotons) #print idxSHG xEdges = edges[0] yEdges = edges[1] zEdges = edges[2] tEdges = edges[3] #print zEdges print xEdges.shape, yEdges.shape, zEdges.shape, tEdges.shape
Here is a typical (non error generating output) 20 20 1 1 <- number of bins in order (20, 20, 1, 1) <- shape of histogram (20, 20, 1, 1) <- shape of an array generated from the histrogram (21,) (21,) (2,) (2,) <- shape of the inidividual edge arrays in order
here is the version that puts out an error 21 20 2 3 <- bins in order (3, 2, 21, 20) <- histogram is backwards (3, 2, 21, 20) <- as is the array generated from it (22,) (21,) (3,) (4,) <- edge arrays are in order though
So, is this an error, or am I assuming too much about how histogramdd operates?
I will start checking the order in the program, but this will never be perfect since the number of bins could be the same for multiple axis.
Thanks for any advice Cheers Chris
David Huard wrote:
Hi Chris,
Have you tried numpy.histogramdd ? If its still too slow, I have a fortran implementation on the back burner. I could try to finish it quickly and send you a preliminary version.
Other thought: the kernel density estimator scipy.stats.gaussian_kde
David
2007/6/17, Bernhard Voigt <bernhard.voigt@gmail.com <mailto:bernhard.voigt@gmail.com>>:
Hi Chris!
you could try a grid of unit cells that cover your phase space (x,y,z,t). Count the number of photons per unit cell of your initial configuration and track photons leaving and entering a particular cell. A dictionary with a tuple of x,y,z,t coordinates obtained from integer division of the x,y,z,t coordinates could serve as keys.
Example for 2-D:
from numpy import * # phase space in x,y x = arange(-100,100.1,.1) y = arange(-100,100.1,.1) # cell dimension in both dimensions the same GRID_WIDTH=7.5
# computes the grid key from x,y coordinates def gridKey(x,y): '''return the a tuple of x,y integer divided by GRID_WIDHT''' return (int(x // GRID_WIDTH), int(y // GRID_WIDTH))
# setup your grid dictionary gridLowX, gridHighX = gridKey(min(x), max(x)) gridLowY, gridHighY = gridKey(min(y), max(y)) keys = [(i,j) for i in xrange(gridLowX, gridHighX + 1) \ for j in xrange(gridLowY, gridHighY + 1)] grid = dict().fromkeys(keys, 0)
# random photons photons = random.uniform(-100.,100., (100000,2))
# count photons in each grid cell for p in photons: grid[gridKey(*p)] += 1
######################################### # in your simulation you have to keep track of where your photons # are going to... # (the code below won't run, it's just an example) ######################################### oldKey = gridKey(photon) propagate(photon) # changes x,y coordinates of photon newKey = gridKey(photon) if oldKey != newKey: grid[oldKey] -= 1 grid[newKey] += 1
I hope this helps! Bernhard
On 6/15/07, * Chris Lee* < c.j.lee@tnw.utwente.nl <mailto:c.j.lee@tnw.utwente.nl>> wrote:
Hi everyone,
I was hoping this list could point me in the direction of a more efficient solution to a problem I have.
I have 4 vectors: x, y, z, and t that are about 1 million in length that describe the positions of photons. As my simulation progresses it updates the positions so x, y, z, and t change by an unknown (and unknowable) amount every update.
This worked very well for its original purpose but now I need to calculate the photon density change over time. Currently after each update, I iterate over time slices, x slices, and y slices and then make an histogram of z which I then stitch together to create a density. However, this becomes very slow as the photons spread out in space and time.
Does anyone know how to take such a large vector set and return a density efficiently?
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org <mailto:SciPy-user@scipy.org> http://projects.scipy.org/mailman/listinfo/scipy-user
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org <mailto:SciPy-user@scipy.org> http://projects.scipy.org/mailman/listinfo/scipy-user
------------------------------------------------------------------------
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user
That would be the problem. I just tested it on a machine with an up to date version of numpy and the problem went away. Out of curiosity (and perhaps practically since I have no power to upgrade the other machine I was using) is the buggy output always a transpose of the correct output? Cheers Chris On Jun 28, 2007, at 5:31 PM, David Huard wrote:
Hi Chris,
It's a bug alright, but I believe it has been fixed. What version of numpy are you using? If you're not using the latest release, please try it out and see if it still bugs. The bug had to do with the reordering of the flattened array into a N-D array, which is bin length dependent.
Cheers, David
2007/6/28, Chris Lee <c.j.lee@tnw.utwente.nl>: Hi David,
histrogramdd does exactly what I want and seems to be fast enough. However, I think it may have an irregular bug
if I have some data with a shape (x,4) where x varies from call to call and I specify 4 bin numbers bins=(binx, biny, binz, bint) and then call
hist, edges = histrogramdd(data, bins=(binx, biny, binz, bint)))
then most of the time hist will have the shape (binx, biny, binz, bint) but sometimes it will return a different shape e.g., (bint, binz, biny, binx) and the edges array order will be different again e.g., (x, y, t, z).
Here is the chunk of code that generates this (with appropriate an appropriate non null data of course)
densityHistogram, edges = numpy.histogramdd(data, bins=(xBins,yBins,zBins,tBins)) print densityHistogram.shape SHGPhotons = n.asarray(densityHistogram*0.01, n.int32) totalPhotons = SHGPhotons.sum() print SHGPhotons.shape if totalPhotons>0: idxSHG = n.nonzero(SHGPhotons) #print idxSHG xEdges = edges[0] yEdges = edges[1] zEdges = edges[2] tEdges = edges[3] #print zEdges print xEdges.shape, yEdges.shape, zEdges.shape, tEdges.shape
Here is a typical (non error generating output) 20 20 1 1 <- number of bins in order (20, 20, 1, 1) <- shape of histogram (20, 20, 1, 1) <- shape of an array generated from the histrogram (21,) (21,) (2,) (2,) <- shape of the inidividual edge arrays in order
here is the version that puts out an error 21 20 2 3 <- bins in order (3, 2, 21, 20) <- histogram is backwards (3, 2, 21, 20) <- as is the array generated from it (22,) (21,) (3,) (4,) <- edge arrays are in order though
So, is this an error, or am I assuming too much about how histogramdd operates?
I will start checking the order in the program, but this will never be perfect since the number of bins could be the same for multiple axis.
Thanks for any advice Cheers Chris
Hi Chris,
Have you tried numpy.histogramdd ? If its still too slow, I have a fortran implementation on the back burner. I could try to finish it quickly and send you a preliminary version.
Other thought: the kernel density estimator scipy.stats.gaussian_kde
David
2007/6/17, Bernhard Voigt <bernhard.voigt@gmail.com <mailto: bernhard.voigt@gmail.com>>:
Hi Chris!
you could try a grid of unit cells that cover your phase space (x,y,z,t). Count the number of photons per unit cell of your initial configuration and track photons leaving and entering a particular cell. A dictionary with a tuple of x,y,z,t coordinates obtained from integer division of the x,y,z,t coordinates could serve as keys.
Example for 2-D:
from numpy import * # phase space in x,y x = arange(-100,100.1,.1) y = arange(-100,100.1,.1) # cell dimension in both dimensions the same GRID_WIDTH=7.5
# computes the grid key from x,y coordinates def gridKey(x,y): '''return the a tuple of x,y integer divided by GRID_WIDHT''' return (int(x // GRID_WIDTH), int(y // GRID_WIDTH))
# setup your grid dictionary gridLowX, gridHighX = gridKey(min(x), max(x)) gridLowY, gridHighY = gridKey(min(y), max(y)) keys = [(i,j) for i in xrange(gridLowX, gridHighX + 1) \ for j in xrange(gridLowY, gridHighY + 1)] grid = dict().fromkeys(keys, 0)
# random photons photons = random.uniform(-100.,100., (100000,2))
# count photons in each grid cell for p in photons: grid[gridKey(*p)] += 1
######################################### # in your simulation you have to keep track of where your
David Huard wrote: photons
# are going to... # (the code below won't run, it's just an example) ######################################### oldKey = gridKey(photon) propagate(photon) # changes x,y coordinates of photon newKey = gridKey(photon) if oldKey != newKey: grid[oldKey] -= 1 grid[newKey] += 1
I hope this helps! Bernhard
On 6/15/07, * Chris Lee* < c.j.lee@tnw.utwente.nl <mailto: c.j.lee@tnw.utwente.nl>> wrote:
Hi everyone,
I was hoping this list could point me in the direction of
a more
efficient solution to a problem I have.
I have 4 vectors: x, y, z, and t that are about 1 million in length that describe the positions of photons. As my simulation progresses it updates the positions so x, y, z, and t change by an unknown (and unknowable) amount every update.
This worked very well for its original purpose but now I
need to
calculate the photon density change over time. Currently after each update, I iterate over time slices, x slices, and y
slices and
then make an histogram of z which I then stitch together to
create a
density. However, this becomes very slow as the photons spread out in space and time.
Does anyone know how to take such a large vector set and
return a
density efficiently?
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org <mailto:SciPy-user@scipy.org> http://projects.scipy.org/mailman/listinfo/scipy-user
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org <mailto:SciPy-user@scipy.org> http://projects.scipy.org/mailman/listinfo/scipy-user
---------------------------------------------------------------------- --
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user
2007/6/28, Chris Lee <c.j.lee@tnw.utwente.nl>:
That would be the problem. I just tested it on a machine with an up to date version of numpy and the problem went away.
Good.
Out of curiosity (and perhaps practically since I have no power to upgrade the other machine I was using) is the buggy output always a transpose of the correct output?
No, it's much worse than that. It's like ordering [1,2,3,4,5,6] to a (2x3) matrix or a (3x2) matrix [[1,2,3] [4,5,6]] [[1,2], [3,4], [5,6]] One is not the transpose of the other. Sorry about that. You could cut and paste the fixed code and import histogramdd from a local module instead of from numpy. David Cheers
Chris On Jun 28, 2007, at 5:31 PM, David Huard wrote:
Hi Chris,
It's a bug alright, but I believe it has been fixed. What version of numpy are you using? If you're not using the latest release, please try it out and see if it still bugs. The bug had to do with the reordering of the flattened array into a N-D array, which is bin length dependent.
Cheers, David
2007/6/28, Chris Lee <c.j.lee@tnw.utwente.nl>:
Hi David,
histrogramdd does exactly what I want and seems to be fast enough. However, I think it may have an irregular bug
if I have some data with a shape (x,4) where x varies from call to call and I specify 4 bin numbers bins=(binx, biny, binz, bint) and then call
hist, edges = histrogramdd(data, bins=(binx, biny, binz, bint)))
then most of the time hist will have the shape (binx, biny, binz, bint) but sometimes it will return a different shape e.g., (bint, binz, biny, binx) and the edges array order will be different again e.g., (x, y, t, z).
Here is the chunk of code that generates this (with appropriate an appropriate non null data of course)
densityHistogram, edges = numpy.histogramdd(data, bins=(xBins,yBins,zBins,tBins)) print densityHistogram.shape SHGPhotons = n.asarray(densityHistogram*0.01, n.int32) totalPhotons = SHGPhotons.sum() print SHGPhotons.shape if totalPhotons>0: idxSHG = n.nonzero(SHGPhotons) #print idxSHG xEdges = edges[0] yEdges = edges[1] zEdges = edges[2] tEdges = edges[3] #print zEdges print xEdges.shape, yEdges.shape, zEdges.shape, tEdges.shape
Here is a typical (non error generating output) 20 20 1 1 <- number of bins in order (20, 20, 1, 1) <- shape of histogram (20, 20, 1, 1) <- shape of an array generated from the histrogram (21,) (21,) (2,) (2,) <- shape of the inidividual edge arrays in order
here is the version that puts out an error 21 20 2 3 <- bins in order (3, 2, 21, 20) <- histogram is backwards (3, 2, 21, 20) <- as is the array generated from it (22,) (21,) (3,) (4,) <- edge arrays are in order though
So, is this an error, or am I assuming too much about how histogramdd operates?
I will start checking the order in the program, but this will never be perfect since the number of bins could be the same for multiple axis.
Thanks for any advice Cheers Chris
David Huard wrote:
Hi Chris,
Have you tried numpy.histogramdd ? If its still too slow, I have a fortran implementation on the back burner. I could try to finish it quickly and send you a preliminary version.
Other thought: the kernel density estimator scipy.stats.gaussian_kde
David
2007/6/17, Bernhard Voigt <bernhard.voigt@gmail.com <mailto: bernhard.voigt@gmail.com>>:
Hi Chris!
you could try a grid of unit cells that cover your phase space (x,y,z,t). Count the number of photons per unit cell of your initial configuration and track photons leaving and entering a particular cell. A dictionary with a tuple of x,y,z,t coordinates obtained from integer division of the x,y,z,t coordinates could serve as keys.
Example for 2-D:
from numpy import * # phase space in x,y x = arange(-100,100.1,.1) y = arange(-100,100.1,.1) # cell dimension in both dimensions the same GRID_WIDTH=7.5
# computes the grid key from x,y coordinates def gridKey(x,y): '''return the a tuple of x,y integer divided by GRID_WIDHT''' return (int(x // GRID_WIDTH), int(y // GRID_WIDTH))
# setup your grid dictionary gridLowX, gridHighX = gridKey(min(x), max(x)) gridLowY, gridHighY = gridKey(min(y), max(y)) keys = [(i,j) for i in xrange(gridLowX, gridHighX + 1) \ for j in xrange(gridLowY, gridHighY + 1)] grid = dict().fromkeys(keys, 0)
# random photons photons = random.uniform(-100.,100., (100000,2))
# count photons in each grid cell for p in photons: grid[gridKey(*p)] += 1
######################################### # in your simulation you have to keep track of where your photons # are going to... # (the code below won't run, it's just an example) ######################################### oldKey = gridKey(photon) propagate(photon) # changes x,y coordinates of photon newKey = gridKey(photon) if oldKey != newKey: grid[oldKey] -= 1 grid[newKey] += 1
I hope this helps! Bernhard
On 6/15/07, * Chris Lee* < c.j.lee@tnw.utwente.nl <mailto: c.j.lee@tnw.utwente.nl>> wrote:
Hi everyone,
I was hoping this list could point me in the direction of a more efficient solution to a problem I have.
I have 4 vectors: x, y, z, and t that are about 1 million in length that describe the positions of photons. As my simulation progresses it updates the positions so x, y, z, and t change by an unknown (and unknowable) amount every update.
This worked very well for its original purpose but now I need to calculate the photon density change over time. Currently after each update, I iterate over time slices, x slices, and y slices and then make an histogram of z which I then stitch together to create a density. However, this becomes very slow as the photons spread out in space and time.
Does anyone know how to take such a large vector set and return a density efficiently?
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org <mailto:SciPy-user@scipy.org> http://projects.scipy.org/mailman/listinfo/scipy-user
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org <mailto:SciPy-user@scipy.org> http://projects.scipy.org/mailman/listinfo/scipy-user
------------------------------------------------------------------------
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user
participants (3)
-
Bernhard Voigt -
Chris Lee -
David Huard