Cannot test right now, but np.unique(b, return_inverse=True)[1].reshape(2, -1) should do what you are after, I think.
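A minimal sketch of that idea, assuming every atom that survives the deletion still appears in at least one bond (an atom left with no bonds at all would otherwise shift the new numbering):

import numpy as np

# bond list after the bonds touching deleted atoms have been removed
b = np.array([[0, 0, 4, 4, 5, 5, 5, 6, 10, 11],
              [5, 6, 10, 5, 4, 11, 0, 0, 4, 5]])

# np.unique sorts the surviving atom indices; return_inverse gives, for each
# entry of b, its position in that sorted list, i.e. a dense 0..K-1 relabelling
b_new = np.unique(b, return_inverse=True)[1].reshape(2, -1)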
Hi,
I have run into a potential 'for loop' bottleneck. Let me outline:
The following array describes the bonds (connections) in a benzene molecule:

import numpy
b = numpy.array([[0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 7,
                  8, 9, 10, 11],
                 [5, 6, 1, 0, 2, 7, 3, 8, 1, 4, 9, 2, 10, 5, 3, 4, 11, 0, 0, 1,
                  2, 3, 4, 5]])
i.e. bond 0 connects atoms 0 and 5, bond 1 connects atoms 0 and 6, etc. In
practical examples the list can be much larger (N > 100,000 connections).
Suppose the atoms with indices a = [1, 2, 3, 7, 8] are deleted; then all bonds
involving those atoms must be deleted as well. I achieve this as follows:
a = [1, 2, 3, 7, 8]
i_0 = numpy.in1d(b[0], a)          # bonds whose first atom is deleted
i_1 = numpy.in1d(b[1], a)          # bonds whose second atom is deleted
b_i = numpy.where(i_0 | i_1)[0]    # indices of the bonds being removed
b = b[:, ~(i_0 | i_1)]             # keep only the surviving bonds
If you find this approach lacking, feel free to comment.
This results in the following updated bond list:

b = [[0, 0, 4, 4, 5, 5, 5, 6, 10, 11],
     [5, 6, 10, 5, 4, 11, 0, 0, 4, 5]]
This list is, however, not yet correct: since atoms [1, 2, 3, 7, 8] have been
deleted, each remaining atom index must be decremented by the number of
deleted atoms with a smaller index. I do this as follows:
for i in sorted(a, reverse=True):     # largest deleted index first, so earlier
    b = numpy.where(b > i, b - 1, b)  # shifts do not affect later comparisons  (*)
yielding the correct result:

b = [[0, 0, 1, 1, 2, 2, 2, 3, 5, 6],
     [2, 3, 5, 2, 1, 6, 0, 0, 1, 2]]
The Python for loop in (*) may easily run for 50,000 iterations. Is there a
smart way to utilize numpy functionality to avoid this?
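One possible vectorized alternative (a sketch added here, not part of the
original post): numpy.searchsorted can count, for every entry of b, how many
deleted indices lie below it, which is exactly the amount to subtract.

# starting again from the filtered, not yet renumbered bond list
b = numpy.array([[0, 0, 4, 4, 5, 5, 5, 6, 10, 11],
                 [5, 6, 10, 5, 4, 11, 0, 0, 4, 5]])
a_sorted = numpy.sort(a)
# for each entry of b, the number of deleted atom indices smaller than it
b = b - numpy.searchsorted(a_sorted, b)

This should reproduce the result of the loop in (*) without any Python-level
iteration over the deleted atoms.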
Thanks and best regards,
Mads
--
+---------------------------------------------------------+
| Mads Ipsen |
+----------------------+----------------------------------+
| Gåsebæksvej 7, 4. tv | phone: +45-29716388 |
| DK-2500 Valby | email: mads.ipsen@gmail.com |
| Denmark | map : www.tinyurl.com/ns52fpa |
+----------------------+----------------------------------+