[Tutor] Euclidean Distances between Atoms in a Molecule.

Peter Otten __peter__ at web.de
Mon Apr 3 05:36:10 EDT 2017


Stephen P. Molnar wrote:

> I am trying to port a program that I wrote in FORTRAN twenty years ago
> into Python 3 and am having a hard time trying to calculate the
> Euclidean distance between each atom in the molecule and every other
> atom in the molecule.
> 
> Here is a typical table of coordinates:
> 
> 
>        MASS         X         Y         Z
> 0   12.011 -3.265636  0.198894  0.090858
> 1   12.011 -1.307161  1.522212  1.003463
> 2   12.011  1.213336  0.948208 -0.033373
> 3   14.007  3.238650  1.041523  1.301322
> 4   12.011 -5.954489  0.650878  0.803379
> 5   12.011  5.654476  0.480066  0.013757
> 6   12.011  6.372043  2.731713 -1.662411
> 7   12.011  7.655753  0.168393  2.096802
> 8   12.011  5.563051 -1.990203 -1.511875
> 9    1.008 -2.939469 -1.327967 -1.247635
> 10   1.008 -1.460475  2.993912  2.415410
> 11   1.008  1.218042  0.451815 -2.057439
> 12   1.008 -6.255901  2.575035  1.496984
> 13   1.008 -6.560562 -0.695722  2.248982
> 14   1.008 -7.152500  0.390758 -0.864115
> 15   1.008  4.959548  3.061356 -3.139100
> 16   1.008  8.197613  2.429073 -2.588339
> 17   1.008  6.503322  4.471092 -0.543939
> 18   1.008  7.845274  1.892126  3.227577
> 19   1.008  9.512371 -0.273198  1.291080
> 20   1.008  7.147039 -1.365346  3.393778
> 21   1.008  4.191488 -1.928466 -3.057804
> 22   1.008  5.061650 -3.595015 -0.302810
> 23   1.008  7.402586 -2.392148 -2.374554
> 
> What I need for further calculation is a matrix of the Euclidean
> distances between the atoms.
> 
> So far in searching the Python literature I have only managed to confuse
> myself and would greatly appreciate any pointers towards a solution.
> 
> Thanks in advance.
> 

Stitched together with heavy use of a search engine:

$ cat data.txt
       MASS         X         Y         Z
0   12.011 -3.265636  0.198894  0.090858
1   12.011 -1.307161  1.522212  1.003463
2   12.011  1.213336  0.948208 -0.033373
3   14.007  3.238650  1.041523  1.301322
4   12.011 -5.954489  0.650878  0.803379
5   12.011  5.654476  0.480066  0.013757
6   12.011  6.372043  2.731713 -1.662411
7   12.011  7.655753  0.168393  2.096802
8   12.011  5.563051 -1.990203 -1.511875
9    1.008 -2.939469 -1.327967 -1.247635
10   1.008 -1.460475  2.993912  2.415410
11   1.008  1.218042  0.451815 -2.057439
12   1.008 -6.255901  2.575035  1.496984
13   1.008 -6.560562 -0.695722  2.248982
14   1.008 -7.152500  0.390758 -0.864115
15   1.008  4.959548  3.061356 -3.139100
16   1.008  8.197613  2.429073 -2.588339
17   1.008  6.503322  4.471092 -0.543939
18   1.008  7.845274  1.892126  3.227577
19   1.008  9.512371 -0.273198  1.291080
20   1.008  7.147039 -1.365346  3.393778
21   1.008  4.191488 -1.928466 -3.057804
22   1.008  5.061650 -3.595015 -0.302810
23   1.008  7.402586 -2.392148 -2.374554
$ python3
Python 3.4.3 (default, Nov 17 2016, 01:08:31) 
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy, pandas, scipy.spatial.distance as dist
>>> df = pandas.read_table("data.txt", sep=" ", skipinitialspace=True)
>>> a = numpy.array(df[["X", "Y", "Z"]])
>>> dist.squareform(dist.pdist(a, "euclidean"))
<snip big matrix>

Here's an example with just the first 4 atoms:

>>> dist.squareform(dist.pdist(a[:4], "euclidean"))
array([[ 0.        ,  2.53370139,  4.54291701,  6.6694065 ],
       [ 2.53370139,  0.        ,  2.78521357,  4.58084922],
       [ 4.54291701,  2.78521357,  0.        ,  2.42734737],
       [ 6.6694065 ,  4.58084922,  2.42734737,  0.        ]])

See the documentation for pdist():
https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.pdist.html

There may be a way to do this with pandas.pivot_table(), but I didn't manage 
to find one.
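
For comparison, here is a rough sketch of the same calculation in plain 
Python, using only the standard library. It is probably close to the nested 
loops of the original FORTRAN program. The function name distance_matrix() 
is made up for this illustration, and the coordinates are the first four 
atoms from the table above, typed in by hand. For real work the scipy 
version is likely the better choice.

```python
import math

# X, Y, Z of the first four atoms, copied from the table above
coords = [
    (-3.265636, 0.198894, 0.090858),
    (-1.307161, 1.522212, 1.003463),
    (1.213336, 0.948208, -0.033373),
    (3.238650, 1.041523, 1.301322),
]


def distance_matrix(points):
    """Return a list of lists holding the pairwise Euclidean distances.

    Hypothetical helper for illustration, not part of any library.
    """
    n = len(points)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # The matrix is symmetric, so compute only the upper triangle
        # and mirror each value into the lower triangle.
        for j in range(i + 1, n):
            d = math.sqrt(
                sum((p - q) ** 2 for p, q in zip(points[i], points[j]))
            )
            matrix[i][j] = matrix[j][i] = d
    return matrix


for row in distance_matrix(coords):
    print(" ".join("{:10.6f}".format(d) for d in row))
```

The diagonal stays at zero because the distance of an atom to itself is 
never computed. The off-diagonal entries should agree with the pdist() 
output shown above.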

As Alan says, this is not the appropriate forum for the topic you are 
belabouring. 

Work your way through a Python tutorial to pick up the basics (we can help 
you with this), then go straight to where the (numpy/scipy) experts are.


