[Tutor] Extract Block of Data from a 2D Array

Karim kliateni at gmail.com
Fri Mar 31 10:34:09 EDT 2017



On 31/03/2017 14:19, Stephen P. Molnar wrote:

> I have a block of data extracted from a quantum mechanical calculation:
>
> CARTESIAN COORDINATES (A.U.)
> ----------------------------
>   NO LB      ZA    FRAG     MASS         X           Y           Z
>    0 C     6.0000    0    12.011   -3.265636    0.198894    0.090858
>    1 C     6.0000    0    12.011   -1.307161    1.522212    1.003463
>    2 C     6.0000    0    12.011    1.213336    0.948208   -0.033373
>    3 N     7.0000    0    14.007    3.238650    1.041523    1.301322
>    4 C     6.0000    0    12.011   -5.954489    0.650878    0.803379
>    5 C     6.0000    0    12.011    5.654476    0.480066    0.013757
>
> where the number of lines depends upon the molecule being considered.
>
> I want to extract the block of data starting on line 4 and column 4.
> Unfortunately, the only programming language in which I used to be
> competent is Fortran.  I have attempted researching this problem,
> but have only succeeded in increasing my mental entropy.
>
> Help will be much appreciated.  Thanks in advance.

Hello,

If the columns in your data file are separated by tabs, you can use the
csv module like this with Python 2.7 (code not tested):

-----------------------------------------------------------------
$ python
Python 2.7.6 (default, Oct 26 2016, 20:30:19)
[GCC 4.8.4] on linux2
 >>>
 >>> import csv

 >>> my_data_file = open( "my_tab_file")

 >>> reader = csv.DictReader(my_data_file, delimiter='\t')

 >>> for line_number, row in enumerate(reader, start=1):
 ...     if line_number >= 4:
 ...         print row['MASS'],

Or as a script, in a file called extract_data.py:

#--------------------------------------------------------------------

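import csv

# Assumes a tab-separated file whose first line holds the column names.
with open("my_tab_file") as my_data_file:
    reader = csv.DictReader(my_data_file, delimiter='\t')

    # DictReader consumes the header line itself, so line_number 1 is
    # the first row after the header.
    for line_number, row in enumerate(reader, start=1):
        if line_number >= 4:
            # Python 2 print statement; the trailing comma keeps the
            # values on one line.
            print row['MASS'],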

#----------------------------------------------------------------------

Just execute:
$ python extract_data.py
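
To try the idea without a data file, csv.DictReader can also be fed a
plain list of lines. The sketch below is only an illustration: the list
`lines` holds the first two rows of your block re-typed with tabs (my
assumption about the file layout), and the name `coordinates` is made up
here. It collects X, Y and Z as floats:

```python
import csv

# Assumed layout: tab-separated, first line is the header.
lines = [
    "NO\tLB\tZA\tFRAG\tMASS\tX\tY\tZ",
    "0\tC\t6.0000\t0\t12.011\t-3.265636\t0.198894\t0.090858",
    "1\tC\t6.0000\t0\t12.011\t-1.307161\t1.522212\t1.003463",
]

# csv.DictReader accepts any iterable of lines, not only an open file.
reader = csv.DictReader(lines, delimiter="\t")

coordinates = []
for row in reader:
    # Values come back as strings, so convert them before doing arithmetic.
    coordinates.append((float(row["X"]), float(row["Y"]), float(row["Z"])))

print(coordinates[0])
```

Each row is a dictionary keyed by the header names, so any other column
(row["MASS"], row["ZA"], ...) can be picked out the same way.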

Just be aware that the first line of your data file is taken as the
header for all columns, so the column-name line (NO LB ZA ...) has to
come first.
The csv module lets you easily reference the element of each column by
its column name (given by the header, i.e. the first line of the file).
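
If the file still starts with the title and the dashed rule, or the
columns are separated by runs of spaces rather than tabs (as they look
in your message), a single-character csv delimiter will not split them
cleanly. A minimal sketch under that assumption uses str.split()
instead. The helper name extract_block and its parameters first_line and
first_column (both counted from 1) are my own invention for
illustration:

```python
def extract_block(lines, first_line=4, first_column=4):
    """Return the fields from first_line/first_column onwards (1-based).

    Hypothetical helper: assumes whitespace-separated columns and that
    blank lines can be ignored.
    """
    block = []
    for line_number, line in enumerate(lines, start=1):
        if line_number < first_line:
            continue                  # title, dashed rule, column names
        fields = line.split()         # splits on any run of whitespace
        if fields:                    # ignore blank lines
            block.append(fields[first_column - 1:])
    return block


sample = """CARTESIAN COORDINATES (A.U.)
----------------------------
  NO LB      ZA    FRAG     MASS         X           Y           Z
   0 C     6.0000    0    12.011   -3.265636    0.198894 0.090858
   1 C     6.0000    0    12.011   -1.307161    1.522212 1.003463
"""

for row in extract_block(sample.splitlines()):
    print(" ".join(row))
```

An open file object can be passed in place of sample.splitlines(), since
iterating over a file also yields one line at a time. The values are
still strings, so wrap them in float() before computing with them.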

Cheers
Karim


