Iterative Matrix Multiplication
Hi, I have a list of vec3 lists (e.g. [[1,2,3],[4,5,6],[7,8,9],...]). To every single one of the vec3 sublists, I am currently applying transformations. I need to optimize this with numpy. To get proper results, as far as I can tell, the vec3 lists must be expressed as vec4s: [[1,2,3],[4,5,6],[7,8,9],...] -> [[1,2,3,1],[4,5,6,1],[7,8,9,1],...]. Each of these needs to be multiplied by either a translation matrix, or a rotation and translation matrix. I don't really know how to approach the problem . . . Thanks, Ian
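A minimal sketch of the homogeneous approach Ian describes (the data and the translation matrix are illustrative, not from his code):

```python
import numpy as np

# (N, 3) vertices -> (N, 4) homogeneous vectors by appending a 1
v = np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])
v4 = np.column_stack((v, np.ones(len(v))))

# a 4x4 translation matrix moving everything by (5, 3, 2)
T = np.eye(4)
T[:3, 3] = [5., 3., 2.]

# transform every vector at once, then drop the homogeneous coordinate
out = np.dot(v4, T.T)[:, :3]
```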
On Sun, Feb 28, 2010 at 7:53 PM, Ian Mallett <geometrian@gmail.com> wrote:
I'm not sure what exactly you need, but it sounds similar to "Row-wise dot product" in numpy-discussion, Sept 2009. There are several threads on rotation matrices, which might be useful depending on the structure of your arrays. Josef
On Sun, Feb 28, 2010 at 5:53 PM, Ian Mallett <geometrian@gmail.com> wrote:
As I understand it, you want *different* matrices applied to each vector? There are generalized ufuncs, which I haven't tried, but for small vectors there is a trick. Let's see... heck, gmane looks to be dead at the moment. Anyway, I posted the method on the list here a couple of years ago and I'll put up a link if I can find it when gmane comes back. Chuck
On Sun, Feb 28, 2010 at 6:31 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
As I understand it, you want *different* matrices applied to each vector?
Nope--I need the same matrix applied to each vector. Because 3D translation matrices must, if I understand correctly, be 4x4, the vectors must first be extended to length 4 (appending a 1 as the last term). Then the matrix would be applied. Then the resulting n*4 array would be converted back into an n*3 array (simply by dropping the last term). Ian
On Sun, Feb 28, 2010 at 7:35 PM, Ian Mallett <geometrian@gmail.com> wrote:
Why not just add a vector to get the translation? There is no need to go to the homogeneous form. Or you can just leave the vectors at length 4 and use a slice to access the first three components. That way you can leave the ones in place. Chuck
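Chuck's point as a one-line sketch, with made-up data: broadcasting adds the same translation vector to every row, so no homogeneous form is needed.

```python
import numpy as np

v = np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])
t = np.array([5., 3., 2.])

# (N, 3) + (3,) broadcasts t across every row
translated = v + t
```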
On Sun, Feb 28, 2010 at 6:54 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
Why not just add a vector to get translation? There is no need to go the homogeneous form. Or you can just leave the vectors at length 4 and use a slice to access the first three components. That way you can leave the ones in place.
Oh . . . duh :D Excellent--and a 3D rotation matrix is 3x3--so the list can remain n*3. Now the question is how to apply a rotation matrix to the array of vec3?
On Sun, Feb 28, 2010 at 7:58 PM, Ian Mallett <geometrian@gmail.com> wrote:
It looks like you want something like

    res = dot(vec, rot) + tran

You can avoid an extra copy being made by separating the parts:

    res = dot(vec, rot)
    res += tran

where I've used arrays, not matrices. Note that the rotation matrix multiplies every vector in the array. Chuck
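Chuck's recipe written out as a runnable sketch; the rotation here is an illustrative 90-degree turn about the z axis, applied to every row at once.

```python
import numpy as np

vec = np.array([[1., 0., 0.], [0., 1., 0.]])

# 90-degree rotation about z, laid out for the dot(vec, rot) convention
rot = np.array([[0., 1., 0.],
                [-1., 0., 0.],
                [0., 0., 1.]])
tran = np.array([10., 0., 0.])

res = np.dot(vec, rot)   # rotate all vectors at once
res += tran              # in-place add avoids an extra temporary array
```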
2010/3/1 Charles R Harris <charlesr.harris@gmail.com>:
When you want to rotate an ndarray "list" of vectors:

    >>> a.shape
    (N, 3)
    >>> a
    [[1., 2., 3.],
     [4., 5., 6.]]

by some rotation matrix:

    >>> rotation_matrix.shape
    (3, 3)

where each row of the rotation_matrix represents one vector of the rotation target basis, expressed in the basis of the original system, you can do this by writing:

    numpy.dot(a, rotation_matrix)

as Chuck pointed out. This gives you the rotated vectors in an ndarray "list" again:

    >>> numpy.dot(a, rotation_matrix).shape
    (N, 3)

This is just somewhat more in detail what Chuck already stated: the rotation matrix multiplies every vector in the array. my 2 cents, Friedrich
This is how I always do it:

    In [1]: import numpy as np

    In [3]: tmat = np.array([[0., 1., 0., 5.], [0., 0., 1., 3.], [1., 0., 0., 2.]])

    In [4]: tmat
    Out[4]:
    array([[ 0.,  1.,  0.,  5.],
           [ 0.,  0.,  1.,  3.],
           [ 1.,  0.,  0.,  2.]])

    In [5]: points = np.random.random((5, 3))

    In [7]: hpoints = np.column_stack((points, np.ones(len(points))))

    In [9]: hpoints
    Out[9]:
    array([[ 0.17437059,  0.38693627,  0.201047  ,  1.        ],
           [ 0.99712373,  0.16958721,  0.03050696,  1.        ],
           [ 0.30653326,  0.62037744,  0.35785282,  1.        ],
           [ 0.78936771,  0.93692711,  0.58138493,  1.        ],
           [ 0.29914065,  0.08808239,  0.72032172,  1.        ]])

    In [10]: np.dot(tmat, hpoints.T).T
    Out[10]:
    array([[ 5.38693627,  3.201047  ,  2.17437059],
           [ 5.16958721,  3.03050696,  2.99712373],
           [ 5.62037744,  3.35785282,  2.30653326],
           [ 5.93692711,  3.58138493,  2.78936771],
           [ 5.08808239,  3.72032172,  2.29914065]])

On Mon, Mar 1, 2010 at 6:12 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
Excellent--this setup works perfectly! In the areas I was concentrating on, the speed increased an order of magnitude. However, the overall speed seems to have dropped. I believe this may be because the heavy indexing that follows on the result is slower in numpy. Is this a correct analysis?

What would be amazing is if I could port the other half of the algorithm to numpy too . . . The algorithm is for light volume extrusion of geometry. Pseudocode of the rest of the algorithm:

    1  v_array #array of beautifully transformed vertices from last step
    2  n_array #array of beautifully transformed normals from last step
    3  f_list  #list of vec4s, where every vec4 is three vertex indices and a
    4          #normal index. These indices together describe a triangle--
    5          #three vertex indices into "v_array", and one normal from "n_array".
    6  edges = []
    7  for v1index,v2index,v3index,nindex in f_list:
    8      v1,v2,v3 = [v_array[i] for i in [v1index,v2index,v3index]]
    9      if angle_between(n_array[nindex],v1-a_constant_point)<90deg:
    10         for edge in [[v1index,v2index],[v2index,v3index],[v3index,v1index]]:
    11             #add "edge" to "edges"
    12 #remove pairs of identical edges (also note that things like
    13 #[831,326] are the same as [326,831])
    14 edges2 = []
    15 for edge in edges:
    16     edges2.append([v_array[edge[0]],v_array[edge[1]]])
    17 return edges2

If that needs clarification, let me know. The goal with this is obviously to remove as many looping operations as possible. I think the slowdown may be in lines 8, 9, and 16, as these are the lines that index into v_array or n_array.

In line 9, the normal "n_array[nindex]" is tested against a vector from a vertex (in this case "v1") to a constant point (here, the light's position) to see if the angle between them is less than 90deg--that is, that the triangle's front is towards the light. I thought this might be a particularly good candidate for a boolean array?

The edge pair removal is something that I have worked fairly extensively on. By testing and removing pairs as they are added (line 11), a bit more performance can be gained. I have put it here in lines 12-13 because I'm not sure this can be done in numpy. My current algorithm works using Python's standard sets, with the edge as a key. If an edge is already in the set of edges (in the same or reversed form), then the edge is removed from the set.

I may be able to do something with lines 14-17 myself--but I'm not sure. If my code above can't be simplified using numpy, is there a way to efficiently convert numpy arrays back to standard Python lists? As mentioned before, I'm certainly no pro with numpy--but I am learning! Thanks! Ian
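The set/dict-based pair removal Ian describes (lines 12-13 of the pseudocode) might look like this sketch; sorting each edge's indices makes [831,326] and [326,831] collide. The function name and data are made up for illustration.

```python
def remove_edge_pairs(edges):
    """Keep only edges that occur exactly once, treating the
    reversed forms [a, b] and [b, a] as the same edge."""
    counts = {}
    for a, b in edges:
        key = (a, b) if a < b else (b, a)
        counts[key] = counts.get(key, 0) + 1
    return [list(key) for key, n in counts.items() if n == 1]

# [1, 0] cancels [0, 1]; the two remaining edges form the silhouette
silhouette = remove_edge_pairs([[0, 1], [1, 2], [2, 0], [1, 0]])
```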
I'm learning too by answering your questions. 2010/3/2 Ian Mallett <geometrian@gmail.com>:
The loop I can replace by numpy operations:

    >>> v_array
    array([[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]])
    >>> n_array
    array([[ 0.1,  0.2,  0.3],
           [ 0.4,  0.5,  0.6]])
    >>> f_list
    array([[0, 1, 2, 0],
           [2, 1, 0, 1]])

Retrieving the v1 vectors:

    >>> v1s = v_array[f_list[:, 0]]
    >>> v1s
    array([[1, 2, 3],
           [7, 8, 9]])

Retrieving the normal vectors:

    >>> ns = n_array[f_list[:, 3]]
    >>> ns
    array([[ 0.1,  0.2,  0.3],
           [ 0.4,  0.5,  0.6]])

Now how to calculate the pairwise dot product (I suppress the difference of v1 to some_point for now):

    >>> inner = numpy.inner(ns, v1s)
    >>> inner
    array([[  1.4,   5. ],
           [  3.2,  12.2]])

This calculates *all* pairwise dot products; we have to select the diagonal of this square ndarray:

    >>> dotprods = inner[[numpy.arange(0, 2), numpy.arange(0, 2)]]
    >>> dotprods
    array([  1.4,  12.2])

Now we can create a boolean array saying where the dotprod is > 0 (i.e., angle < 90°), and select those triangles:

    >>> select = dotprods > 0
    >>> select
    array([ True,  True], dtype=bool)
    >>> selected = f_list[select]
    >>> selected
    array([[0, 1, 2, 0],
           [2, 1, 0, 1]])

In this case it's the full list. Now build the triangle corners array:

    >>> corners = v_array[selected[:, :3]]
    >>> corners
    array([[[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]],
           [[7, 8, 9],
            [4, 5, 6],
            [1, 2, 3]]])

This has indices [triangle, vertex number (0, 1, or 2), xyz]. And compute the edges (I think you can make use of them):

    >>> edges_dist = numpy.asarray([corners[:, 1] - corners[:, 0],
    ...                             corners[:, 2] - corners[:, 0],
    ...                             corners[:, 2] - corners[:, 1]])
    >>> edges_dist
    array([[[ 3,  3,  3],
            [-3, -3, -3]],
           [[ 6,  6,  6],
            [-6, -6, -6]],
           [[ 3,  3,  3],
            [-3, -3, -3]]])

This has indices [edge number, triangle, xyz]. I think it's easier to compare the "reversed" edges then, because then edge[i, j] == -edge[k, l]? But of course:

    >>> edges = numpy.asarray([[corners[:, 0], corners[:, 1]],
    ...                        [corners[:, 1], corners[:, 2]],
    ...                        [corners[:, 2], corners[:, 0]]])

This has indices [edge number (0, 1, or 2), corner number in edge (0 or 1), triangle, xyz]. But this may not be what you want (not flattened in triangle number). Therefore:

    >>> edges0 = numpy.asarray([corners[:, 0], corners[:, 1]])
    >>> edges1 = numpy.asarray([corners[:, 1], corners[:, 2]])
    >>> edges2 = numpy.asarray([corners[:, 2], corners[:, 0]])
    >>> edges = numpy.concatenate((edges0, edges1, edges2), axis = 0)
    >>> edges
    array([[[1, 2, 3],
            [7, 8, 9]],
           [[4, 5, 6],
            [4, 5, 6]],
           [[4, 5, 6],
            [4, 5, 6]],
           [[7, 8, 9],
            [1, 2, 3]],
           [[7, 8, 9],
            [1, 2, 3]],
           [[1, 2, 3],
            [7, 8, 9]]])

This should be as intended. The indices are [flat edge number, edge side (left or right), xyz].

Now I guess you have to iterate over all pairs of them; I don't know a numpy-accelerated method. Maybe it's even faster to draw the edges twice than to invest O(N_edges ** 2) complexity for comparing?

    I may be able to do something with lines 14-17 myself--but I'm not sure.

Ok, then that's fine, and let us all know about your solution.

It may seem a bit complicated, but I hope this impression is mainly because of the many outputs ... I double-checked everything, and *hope* everything is correct. So far from me, Friedrich
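Friedrich's steps, collected into one runnable sketch on the same toy data (note: the edge array below is ordered triangle-by-triangle, which differs slightly from the concatenation order in the walkthrough):

```python
import numpy as np

v_array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
n_array = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
f_list = np.array([[0, 1, 2, 0], [2, 1, 0, 1]])

v1s = v_array[f_list[:, 0]]          # first vertex of each triangle
ns = n_array[f_list[:, 3]]           # normal of each triangle

# row-wise dot products as the diagonal of the full inner-product matrix
n = len(ns)
dotprods = np.inner(ns, v1s)[np.arange(n), np.arange(n)]

selected = f_list[dotprods > 0]      # keep triangles facing the point
corners = v_array[selected[:, :3]]   # (n_tri, 3 corners, xyz)

# the three edges of each triangle, flattened over triangles
edges = np.concatenate((corners[:, [0, 1]],
                        corners[:, [1, 2]],
                        corners[:, [2, 0]]), axis=0)
```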
Firstly, I want to thank you for all the time and attention you've obviously put into this code. On Tue, Mar 2, 2010 at 12:27 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
Retrieving the v1 vectors: v1s = v_array[f_list[:, 0]]
Retrieving the normal vectors: ns = n_array[f_list[:, 3]]
I don't think you understand quite what I'm looking for here. Every vec4 in f_list describes a triangle. The first, second, and third elements are indices of vertices in v_array. The fourth is an index into n_array. From your code, I've learned that I can do this, which is more what I want:

    v1s = v_array[f_list[:,0:3]]
    ns = n_array[f_list[:,3]]

Obviously, this changes the arrays quite a bit. With these changes, the code doesn't actually crash until the corners = line. Do the following lines that do the dot product and comparison still behave correctly? They should be finding, for each triangle, whether or not the associated normal is at less than 90 degrees to the vertex-to-light vector. The triangle's edges should then be added to an array. See end.
Now we can create a boolean array saying where the dotprod is > 0 (i.e., angle < 90°), and select those triangles: selected = f_list[dotprods > 0]
This seems like a clever idea.
Now I guess you have to iterate over all pairs of them; I don't know a numpy-accelerated method. Maybe it's even faster to draw the edges twice than to invest O(N_edges ** 2) complexity for comparing?
Unfortunately, no. The whole point of the algorithm is to extrude back-facing triangles (those with normals facing away from the light) backward, leaving polygons behind in a light volume. Although a shadow volume tutorial can explain this in more detail, basically: for every triangle, if it is back-facing, add its edges to a list, then remove the duplicate edges. So, the edge between two back-facing triangles that meet along an edge is not kept. However, if a back-facing triangle and a front-facing triangle meet, only the back-facing triangle's edge is present in the list, and so it is not removed. Thus, only the border edges between the front-facing triangles and the back-facing triangles remain in the list. As front-facing triangles face the light and back-facing triangles don't, a silhouette edge is built up. When these edges are extruded, they're extremely useful.

The following image shows the process: http://www.gamedev.net/reference/articles/1873/image010.gif The four triangles are all back-facing. If duplicate edges are removed, only edges I0-I2, I2-I4, I4-I3, and I3-I0 remain--the silhouette edges.

Still to do: remove the duplicate edges (actually where a good deal of the optimization lies too). So, for every back-facing triangle [v1,v2,v3,n] (where vn is a vertex index), the edges [v1,v2], [v2,v3], and [v3,v1] need to be added to a list. I.e., f_list needs to be converted into a list of edges in this way. Then, duplicate edge pairs need to be removed, noting that [v1,v2] and [v2,v1] are still a pair (in my Python code, I simply sorted the edges before removing duplicates: [123,4] -> [4,123] and [56,968] -> [56,968]). The final edge list then should be converted back into actual vertices by indexing into v_array (I think I understand how to do this now!):

    [ [1,2], [4,6], ... ] -> [ [[0.3,1.6,4.5],[9.1,4.7,7.7]], [[0.4,5.5,8.3],[9.6,8.1,0.3]], ... ]
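The index-level dedup Ian describes (sort each edge's indices so reversed pairs match, then drop anything occurring twice) can be sketched with a modern numpy's np.unique; the triangle data is made up:

```python
import numpy as np

# back-facing triangles as vertex-index triples (illustrative data):
# two triangles sharing the edge (1, 2)
tris = np.array([[0, 1, 2], [2, 1, 3]])

# each triangle contributes its three edges as index pairs
edges = np.concatenate((tris[:, [0, 1]], tris[:, [1, 2]], tris[:, [2, 0]]))

# sort within each pair so [v2, v1] matches [v1, v2]
edges = np.sort(edges, axis=1)

# silhouette edges are exactly the ones that occur once
uniq, counts = np.unique(edges, axis=0, return_counts=True)
silhouette = uniq[counts == 1]
```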
Once again, thanks so much, Ian
Do you have doublets in the v_array? In case not, you owe me a donut. See attachment. Friedrich

P.S.: You misunderstood too: the line you wanted to change was in the context of detecting back-facing triangles, and there one vertex is sufficient.
Cool--this works perfectly now :-) Unfortunately, it's actually slower :P Most of the slowest part is in the removing-doubles section. Some of the costliest calls:

    #takes 0.04 seconds
    inner = np.inner(ns, v1s - some_point)
    #0.0840001106262
    sum_1 = sum.reshape((len(sum), 1)).repeat(len(sum), axis = 1)
    #0.0329999923706
    sum_2 = sum.reshape((1, len(sum))).repeat(len(sum), axis = 0)
    #0.0269999504089
    comparison_sum = (sum_1 == sum_2)
    #0.0909998416901
    diff_1 = diff.reshape((len(diff), 1)).repeat(len(diff), axis = 1)
    #0.0340001583099
    diff_2 = diff.reshape((1, len(diff))).repeat(len(diff), axis = 0)
    #0.0269999504089
    comparison_diff = (diff_1 == diff_2)
    #0.0230000019073
    same_edges = comparison_sum * comparison_diff
    #0.128999948502
    doublet_count = same_edges.sum(axis = 0)

Ian
2010/3/5 Ian Mallett <geometrian@gmail.com>:
Cool--this works perfectly now :-)
:-)
Unfortunately, it's actually slower :P Most of the slowest part is in the removing doubles section.
Hmm. Let's see ... Can you tell me how I can measure the time that calls in a script take? I have no idea.
#takes 0.04 seconds inner = np.inner(ns, v1s - some_point)
I think I can do nothing about that at the moment.
#0.0840001106262 sum_1 = sum.reshape((len(sum), 1)).repeat(len(sum), axis = 1)
#0.0329999923706 sum_2 = sum.reshape((1, len(sum))).repeat(len(sum), axis = 0)
#0.0269999504089 comparison_sum = (sum_1 == sum_2)
We can leave out the repeat() calls and leave only the reshape() calls there. Numpy will substitute dim == 1 dimensions with stride == 0, i.e., it will effectively repeat those dimensions, just as we did explicitly.
#0.0909998416901 diff_1 = diff.reshape((len(diff), 1)).repeat(len(diff), axis = 1)
#0.0340001583099 diff_2 = diff.reshape((1, len(diff))).repeat(len(diff), axis = 0)
#0.0269999504089 comparison_diff = (diff_1 == diff_2)
Same here. Delete the repeat() calls, but not the reshape() calls.
#0.0230000019073 same_edges = comparison_sum * comparison_diff
Hmm, maybe use numpy.logical_and(comparison_sum, comparison_diff)? I don't know, but I guess it is in some way optimised for such things.
#0.128999948502 doublet_count = same_edges.sum(axis = 0)
Maybe try axis = 1 instead. I wonder why this is so slow. Or maybe it's because it does the conversion to ints on the fly, so maybe try same_edges.astype(numpy.int8).sum(axis = 0). Hope this gives some improvement. I attach the modified version. Ah, one thing to mention: have you not accidentally timed the printout functions as well? They should be pretty slow. Friedrich
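Friedrich's stride trick in one comparison: reshaping to (N, 1) and (1, N) lets the == broadcast, producing the same boolean matrix without materializing the repeated copies.

```python
import numpy as np

s = np.array([3, 7, 3, 9])

# explicit version: two (N, N) temporaries built with repeat()
a = s.reshape((len(s), 1)).repeat(len(s), axis=1)
b = s.reshape((1, len(s))).repeat(len(s), axis=0)
explicit = (a == b)

# broadcast version: the length-1 axes act as stride-0 virtual repeats
implicit = (s.reshape((len(s), 1)) == s.reshape((1, len(s))))
```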
Hi, On Sat, Mar 6, 2010 at 1:20 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
Hmm. Let's see ... Can you tell me how I can test the time calls in a script take? I have no idea.
I've been doing:

    t1 = time.time()
    #code
    t2 = time.time()
    print "[code]:",t2-t1
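For pieces this small, time.time() runs near its resolution; the standard timeit module, which repeats the statement many times, is an alternative (a sketch, not what was used in the thread):

```python
import timeit

setup = "import numpy as np; s = np.arange(1000)"
stmt = "s.reshape((len(s), 1)) == s.reshape((1, len(s)))"

# best of 3 repeats of 100 executions each, in seconds
best = min(timeit.repeat(stmt, setup=setup, repeat=3, number=100))
```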
We can leave out the repeat() calls and leave only the reshape() calls there.
Wow! Drops to immeasurably quick :D
Same here. Delete the repeat() calls, but not the reshape() calls.
Once again, drops to immeasurably quick :D
Hmm, maybe use numpy.logical_and(comparison_sum, comparison_diff)?
It's marginally faster.
Maybe try same_edges.astype(numpy.int8).sum(axis = 0).
Actually, it's marginally slower :S
Ah, one thing to mention, have you not accidentally timed also the printout functions?
Nope--I've been timing as above.
On Sat, Mar 6, 2010 at 1:42 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
2010/3/5 Ian Mallett <geometrian@gmail.com>:
#takes 0.04 seconds inner = np.inner(ns, v1s - some_point)
Ok, I don't know why I was able to overlook this:
dotprod = (ns * (v1s - some_point)).sum(axis = 1)
Much faster :D So, the most costly lines:

    comparison_sum = (sum_1 == sum_2) #0.024 sec
    comparison_diff = (diff_1 == diff_2) #0.024 sec
    same_edges = np.logical_and(comparison_sum, comparison_diff) #0.029 sec
    doublet_count = same_edges.sum(axis = 0) #0.147 sec

Thanks again, Ian
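Why the replacement line wins: it computes only the N row-wise products instead of the full N x N inner-product matrix, and the two approaches agree on the diagonal. A check on the toy data from earlier in the thread:

```python
import numpy as np

ns = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
v1s = np.array([[1., 2., 3.], [7., 8., 9.]])
some_point = np.zeros(3)

# original approach: full N x N matrix, then the diagonal
diag = np.inner(ns, v1s - some_point).diagonal()

# replacement: row-wise multiply-and-sum, O(N) memory instead of O(N^2)
dotprod = (ns * (v1s - some_point)).sum(axis=1)
```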
2010/3/6 Ian Mallett <geometrian@gmail.com>:
I've been doing: t1 = time.time() #code t2 = time.time() print "[code]:",t2-t1
Ok. I just wonder how you can measure those times. On my five-year-old machine, they are below the 10 ms timer resolution.
Wow! Drops to immeasurably quick :D
Yeaah!
#0.128999948502 doublet_count = same_edges.sum(axis = 0)
Maybe try axis = 1 instead. I wonder why this is so slow. Or maybe it's because he does the conversion to ints on-the-fly, so maybe try same_edges.astype(numpy.int8).sum(axis = 0).
Actually, it's marginally slower :S
Hmm. I tried the axis = 1 thing, and it also gave no improvement (maybe you can try it too; I'm guessing I'm actually measuring the time Python spends executing my loop to get significant times ...)
2010/3/5 Ian Mallett <geometrian@gmail.com>:
Much faster :D
:-) I'm glad to be able to help!
So, the most costly lines: comparison_sum = (sum_1 == sum_2) #0.024 sec comparison_diff = (diff_1 == diff_2) #0.024 sec same_edges = np.logical_and(comparison_sum, comparison_diff) #0.029 sec doublet_count = same_edges.sum(axis = 0) #0.147
At the moment, I can do nothing about that. It seems that we have reached the limit. Anyhow, is it now faster than your Python list implementation, and if yes, by how much? How large was your gain by using numpy at all? I'm just curious. Friedrich
On Sat, Mar 6, 2010 at 12:03 PM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
At the moment, I can do nothing about that. Seems that we have reached the limit. Anyhow, is it now faster than your Python list implementation, and if yes, how much? How large was your gain by using numpy means at all? I'm just curious.
Unfortunately, the pure Python implementation is actually an order of magnitude faster. The fastest solution right now is to use numpy for the transformations, then convert it back into a list (.tolist()) and use Python for the rest. Here's the actual Python code:

    def glLibInternal_edges(object,lightpos):
        edge_set = set([])
        edges = {}
        for sublist in xrange(object.number_of_lists): #There's only one sublist here
            face_data = object.light_volume_face_data[sublist]
            for indices in face_data: #v1,v2,v3,n
                normal = object.transformed_normals[sublist][indices[3]]
                v1,v2,v3 = [object.transformed_vertices[sublist][indices[i]] for i in xrange(3)]
                if abs_angle_between_rad(normal,vec_subt(v1,lightpos))<pi_over_two:
                    for p1,p2 in [[indices[0],indices[1]],
                                  [indices[1],indices[2]],
                                  [indices[2],indices[0]]]:
                        edge = [p1,p2]
                        index = 0
                        edge2 = list(edge)
                        edge2.sort()
                        edge2 = tuple(edge2)
                        if edge2 in edges:
                            edges[edge2][1] += 1
                        else:
                            edges[edge2] = [edge,1]
        edges2 = []
        for edge_data in edges.values():
            if edge_data[1] == 1:
                p1 = object.transformed_vertices[sublist][edge_data[0][0]]
                p2 = object.transformed_vertices[sublist][edge_data[0][1]]
                edges2.append([p1,p2])
        return edges2
Thanks, Ian
I'm a bit unhappy with your code, because it's so hard to read tbh. You don't like objects? 2010/3/6 Ian Mallett <geometrian@gmail.com>:
Unfortunately, the pure Python implementation is actually an order of magnitude faster. The fastest solution right now is to use numpy for the transformations, then convert it back into a list (.tolist()) and use Python for the rest.
:-( But well, if it's faster, then do it that way, right? I can only guess that, for large datasets, comparing each and every vector "in parallel" makes it slow down, so an iterative approach might be fine. In fact, no one would implement such things in C++ without using lists with push_back(), I guess. A bit of commentary about your code:
Here's the actual Python code.
def glLibInternal_edges(object,lightpos): edge_set = set([])
Where do you use edge_set? Btw, set() would do it.
edges = {} for sublist in xrange(object.number_of_lists): #There's only one sublist here face_data = object.light_volume_face_data[sublist] for indices in face_data: #v1,v2,v3,n
Here objects would fit in nicely:

    for indices in face_data:
        normal = object.transformed_normals[sublist][indices.nindex]
        (v1, v2, v3) = [object.transformed_vertices[sublist][vidx]
                        for vidx in indices.vindices]
normal = object.transformed_normals[sublist][indices[3]] v1,v2,v3 = [ object.transformed_vertices[sublist][indices[i]] for i in xrange(3) ] if abs_angle_between_rad(normal,vec_subt(v1,lightpos))<pi_over_two:
v1 and lightpos can be stored as numpy ndarrays, no? If you have a list of vectors in an ndarray arr, you can convert it to a list of ndarrays by using list(arr). This will only iterate over the first dimension, and not recurse into the vectors in arr. Provided this, you can simply write:

    if ((normal * (v1 - lightpos)).sum() / numpy.sqrt(((v1 - lightpos) ** 2).sum())) > 0:
        (...)
for p1,p2 in [[indices[0],indices[1]], [indices[1],indices[2]], [indices[2],indices[0]]]: edge = [p1,p2]
Why not writing: for edge in numpy.asarray([[indices[0], indices[1]], (...)]): (...)
index = 0
Where do you use index? It's a lonely remnant?
edge2 = list(edge)
Why do you convert the unmodified edge list into a list again? Btw, I found your numbering quite a bit annoying: no one can tell from an appended 2 what purpose that xxx2 has. Furthermore, I think a slight speedup could be reached by:

    unique = (edge.sum(), abs((edge * [1, -1]).sum()))
edge2.sort() edge2 = tuple(edge2) if edge2 in edges: edges[edge2][1] += 1 else: edges[edge2] = [edge,1]
edges2 = [] for edge_data in edges.values(): if edge_data[1] == 1: p1 = object.transformed_vertices[sublist][edge_data[0][0]] p2 = object.transformed_vertices[sublist][edge_data[0][1]] edges2.append([p1,p2]) return edges2
My 2 cents: class Edge: def __init__(self, indices, count): self.indices = indices def __hash__(self): return hash(self.unique) edges = {} (...) edges.setdefault() edges[egde2] += 1 for (indices, count) in edges.items(): if count == 1: edges_clean.append(object.transformed_vertices[sublist][indices]) provided that transformed_vertices[sublist] is an ndarray. You can iterate over ndarrays using standard Python iteration, it will provide an iterator object.
"I'm a bit unhappy with your code, because it's so hard to read tbh. You don't like objects?" I would phrase this differently now: I think you can improve your code by using objects when they are appropriate. :-) 2010/3/6 Ian Mallett <geometrian@gmail.com>:
Unfortunately, the pure Python implementation is actually an order of magnitude faster. The fastest solution right now is to use numpy for the transformations, then convert it back into a list (.tolist()) and use Python for the rest.
:-( But well, if it's faster, then do it that way, right? I can only guess that for large datasets, comparing each and every vector "in parallel" makes it slow down. So an iterative approach might be fine. In fact, in C++ no one would implement such things with anything but lists and push_back(), I guess. A bit of commentary about your code:
Here's the actual Python code.
def glLibInternal_edges(object, lightpos):
    edge_set = set([])
Where do you use edge_set? Btw, set() would do it.
    edges = {}
    for sublist in xrange(object.number_of_lists):  # There's only one sublist here
        face_data = object.light_volume_face_data[sublist]
        for indices in face_data:  # v1,v2,v3,n
Here objects would fit in nicely:

for indices in face_data:
    normal = object.transformed_normals[sublist][indices.nindex]
    (v1, v2, v3) = [object.transformed_vertices[sublist][vidx]
                    for vidx in indices.vindices]
            normal = object.transformed_normals[sublist][indices[3]]
            v1, v2, v3 = [object.transformed_vertices[sublist][indices[i]]
                          for i in xrange(3)]
            if abs_angle_between_rad(normal, vec_subt(v1, lightpos)) < pi_over_two:
v1 and lightpos can be stored as numpy ndarrays, no? If you have a list of vectors in an ndarray arr, you can convert it to a list of ndarrays by using list(arr). This will only iterate over the first dimension, and not recurse into the vectors in arr. Provided this, you can simply write:

if (normal * (v1 - lightpos)).sum() > 0:
    (...)

(Note that dividing by numpy.sqrt(((v1 - lightpos) ** 2).sum()) does not change the sign, so the normalization can be dropped when comparing against zero.)
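To make the equivalence concrete: the angle test angle(normal, v1 - lightpos) &lt; pi/2 holds exactly when the dot product of the two vectors is positive, so the whole per-face loop collapses to one vectorized expression. A minimal sketch, with made-up data (the names normals, v1s, lightpos are illustrative, not from the original code):

```python
import numpy as np

# Illustrative per-face normals and first vertices.
normals = np.array([[0.0, 0.0, 1.0],
                    [0.0, 0.0, -1.0]])
v1s = np.array([[0.0, 0.0, 0.0],
                [1.0, 0.0, 0.0]])
lightpos = np.array([0.0, 0.0, 5.0])

# Row-wise dot products of each normal with (v1 - lightpos);
# the sign alone decides the angle test, so no normalization needed.
dots = (normals * (v1s - lightpos)).sum(axis=1)
facing = dots > 0
```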
                for (p1, p2) in [[indices[0], indices[1]],
                                 [indices[1], indices[2]],
                                 [indices[2], indices[0]]]:
                    edge = [p1, p2]
Why not write:

for edge in numpy.asarray([[indices[0], indices[1]], (...)]):
    (...)
index = 0
Where do you use index? It's a lonely remnant?
edge2 = list(edge)
Why do you convert the unmodified edge list into a list again? Btw, I found your numbering quite a bit annoying. No one can tell from an appended 2 what purpose that xxx2 has. Furthermore, I think a slight speedup could be reached by:

unique = (edge.sum(), abs((edge * [1, -1]).sum()))
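The suggested key works because (a + b, |a - b|) determines the unordered pair {a, b}: a = (s + d)/2 and b = (s - d)/2. A small self-contained check (the helper name edge_key is made up for illustration):

```python
import numpy as np

def edge_key(edge):
    # Order-independent key for an index pair, as suggested above:
    # (a + b, |a - b|) is identical for (a, b) and (b, a).
    edge = np.asarray(edge)
    return (int(edge.sum()), int(abs((edge * [1, -1]).sum())))

# Both orientations of the same edge map to the same key.
assert edge_key([3, 7]) == edge_key([7, 3]) == (10, 4)
```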
                    edge2.sort()
                    edge2 = tuple(edge2)
                    if edge2 in edges:
                        edges[edge2][1] += 1
                    else:
                        edges[edge2] = [edge, 1]
    edges2 = []
    for edge_data in edges.values():
        if edge_data[1] == 1:
            p1 = object.transformed_vertices[sublist][edge_data[0][0]]
            p2 = object.transformed_vertices[sublist][edge_data[0][1]]
            edges2.append([p1, p2])
    return edges2
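The counting logic quoted above (count each undirected edge, keep the ones seen exactly once) can be condensed into a small standalone function. A sketch under the assumption that faces are index triples and vertices is any indexable sequence; all names here are illustrative:

```python
def silhouette_edges(faces, vertices):
    """Collect edges that belong to exactly one face of the given set."""
    counts = {}
    for face in faces:  # each face: (i0, i1, i2)
        for a, b in ((face[0], face[1]),
                     (face[1], face[2]),
                     (face[2], face[0])):
            key = (a, b) if a < b else (b, a)  # order-independent key
            counts[key] = counts.get(key, 0) + 1
    # Edges seen exactly once lie on the boundary of the face set.
    return [(vertices[a], vertices[b])
            for (a, b), n in counts.items() if n == 1]
```

For two triangles sharing an edge, the shared edge is counted twice and dropped, leaving the four outline edges.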
My 2 cents:

class CountingEdge:
    def __init__(self, indices):
        self.indices = indices
        self.count = 0

edges = {}
(...)
edges.setdefault(unique, CountingEdge(edge))
edges[unique].count += 1

for counting_edge in edges.values():
    if counting_edge.count == 1:
        edges_clean.append(
            object.transformed_vertices[sublist][counting_edge.indices])

provided that transformed_vertices[sublist] is an ndarray. You can iterate over ndarrays using standard Python iteration; it will provide an iterator object.

I'm always good for fresh ideas, although they nearly never get accepted, but here is another cent: I want to save you calculating the mutual annihilation for several light sources again and again.

BEFORE SHADING

1. For the mesh, calculate all edges. Put their unique tuples in a list. Store together with the unique tuple what triangle they originate from and what the original indices were. When you use the sorted indices as the unique value, the original indices will be virtually identical to the unique tuple.

class EdgeOfTriangle:
    def __init__(self, unique, vindices, triangleindex):
        self.unique = unique
        self.vindices = vindices
        self.triangleindex = triangleindex
    def __hash__(self):  # For the "in" statement
        return hash(self.unique)
    def __eq__(self, other):
        return other == self.triangleindex

edges = []
for triangle_descriptor in triangle_descriptors:
    (...)
    edge1 = EdgeOfTriangle(unique1, vindices1, triangle_descriptor.nindex)
    edge2 = EdgeOfTriangle(unique2, vindices2, triangle_descriptor.nindex)
    (and so on)
    for new_edge in (edge1, edge2, edge3):
        if new_edge not in edges:
            edges.append(new_edge)
edges = numpy.asarray(edges)

2. Make an ndarray matrix from numpy.zeros((len(edges), len(triangles))). Iterate through the list of EdgeOfTriangle objects created in step 1 and add +1 at the element [edge_row, triangle_index] for each edge of each triangle.
Now, this matrix tells you how often *all* edges will be selected given the back-facing triangles. The matrix shall be called impact_matrix. Virtually do this by:

edges_reshaped = edges.reshape((len(edges), 1))
triangles = numpy.arange(0, len(n_array)).reshape((1, len(n_array)))
# Employs __eq__ above in EdgeOfTriangle:
impact_matrix = (edges_reshaped == triangles)

DURING SHADING

3. Calculate the dot product with my old method. It gives you a list of floats.

4. Calculate numpy.dot(impact_matrix, dotproducts > 0). It gives you precisely the number of times each edge is selected:

selected = (numpy.dot(impact_matrix, dotproducts > 0) == 1)

5. Calculate edge_array[selected], where edge_array is the array holding *all* edges occurring (built during step 1). It will give you the vertex indices of the edges comprising the silhouette.

def getvindices(edge_of_triangle):
    return edge_of_triangle.vindices

ugetvindices = numpy.vectorize(getvindices)

silhouette_edges_of_triangles = edges[selected]
silhouette_vindices = numpy.asarray(ugetvindices(silhouette_edges_of_triangles))
silhouette_vertices = v_array[silhouette_vindices]

Checked everything only *once to twice* this time, fwiw,
Friedrich
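The precomputation plus per-light selection above can be sketched end to end with plain index arrays instead of the EdgeOfTriangle objects. This is only a minimal illustration of the impact-matrix idea, assuming a closed mesh would give every edge two owning triangles (the tiny two-triangle mesh here is open, so its outline edges also get count 1); all names are made up:

```python
import numpy as np

# Two triangles sharing the edge (1, 2).
triangles = np.array([[0, 1, 2], [1, 3, 2]])

# Step 1: unique undirected edges, remembering which triangles own them.
edge_index = {}   # sorted vertex pair -> row in the edge list
edge_list = []
owners = []       # (edge_row, triangle_row) pairs
for t, (i, j, k) in enumerate(triangles):
    for a, b in ((i, j), (j, k), (k, i)):
        key = (min(a, b), max(a, b))
        if key not in edge_index:
            edge_index[key] = len(edge_list)
            edge_list.append(key)
        owners.append((edge_index[key], t))
edges = np.array(edge_list)

# Step 2: the "impact" (incidence) matrix: edges x triangles.
impact = np.zeros((len(edges), len(triangles)), dtype=int)
for e, t in owners:
    impact[e, t] += 1

# Steps 3-5: given which triangles face the light, an edge is selected
# when exactly one of its owning triangles is lit.
lit = np.array([True, False])   # stand-in for the dot-product test
selected = impact.dot(lit.astype(int)) == 1
silhouette = edges[selected]
```

The matrix is built once; only the cheap dot product in the last step has to be redone per light source, which is exactly the saving proposed above.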
2010/3/5 Ian Mallett <geometrian@gmail.com>:
# takes 0.04 seconds
inner = np.inner(ns, v1s - some_point)
Ok, I don't know why I was able to overlook this:

dotprod = (ns * (v1s - some_point)).sum(axis=1)

The things with the inner product have been deleted. Now I will really *attach* it ...

Hope it's faster,
Friedrich
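The correction matters because np.inner between two (n, 3) arrays builds the full (n, n) matrix of all pairwise dot products, of which only the diagonal was wanted. A small check with made-up data:

```python
import numpy as np

ns = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0]])
v1s = np.array([[2.0, 3.0, 4.0],
                [5.0, 6.0, 7.0]])
some_point = np.array([1.0, 1.0, 1.0])

# Row-wise dot products: one float per row, O(n) work.
dotprod = (ns * (v1s - some_point)).sum(axis=1)

# np.inner computes all n*n pairwise products; its diagonal
# matches the row-wise result but costs O(n^2).
full = np.inner(ns, v1s - some_point)
assert np.allclose(dotprod, np.diag(full))
```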
participants (5)
- Charles R Harris
- Chris Colbert
- Friedrich Romstedt
- Ian Mallett
- josef.pktd@gmail.com