Recursive insertion of a line
Francesco Pietra
chiendarret at yahoo.com
Tue Nov 20 04:16:53 EST 2007
Please, see below.
--- Gabriel Genellina <gagsl-py2 at yahoo.com.ar> wrote:
> On Mon, 19 Nov 2007 21:15:16 -0300, Henry <henry.robinson at gmail.com>
> wrote:
>
> > On 19/11/2007, Francesco Pietra <chiendarret at yahoo.com> wrote:
> >>
> >> How to insert "TER" records recursively, i.e. some thousand fold, in a
> >> file like in the following example? "H2 WAT" is the only constant
> >> characteristic of the line after which to insert "TER"; that
> >> distinguishes also for lines
> >
> > If every molecule is water, and therefore 3 atoms, you can use this fact
> > to insert TER in the right place. You don't need recursion:
> >
> > f = open( "atoms.txt", "rt" )
> > lineCount = 0
> > for line in f.xreadlines( ):
> >     lineCount = lineCount + 1
> >     print line
> >     if lineCount == 3:
> >         lineCount = 0
> >         print "TER"
> > f.close( )
>
> A small variation can handle the original, more generic condition "insert
> TER after the line containing H2 WAT":
>
> f = open("atoms.txt", "r")
> for line in f:
>     print line
>     if "H2 WAT" in line:
>         print "TER"
> f.close()
>
> (also, note that unless you're using Python 2.2 or earlier, the xreadlines
> call does no good)
I tried the latter script (which also works if there are other molecules in the
file, as in my case) and encountered two problems:
(1) "TER" records were inserted, as seen in the shell window, but the file on
disk was not modified. Having named your script "ter_insert.py", I obtained the
modified file with the classic
$ python ter_insert.py 2>&1 | tee file.out
Now "file.out" had "TER" inserted where I wanted. It may well be that I used
your script incorrectly.
(2) An extra blank line is inserted after every line (this is not an artifact
of how I wrote the output file), except between "TER" and the next line, as
shown below:
TER
ATOM 27400 O WAT 4178 20.289 4.598 26.491 1.00 0.00 W20 O

ATOM 27401 H1 WAT 4178 19.714 3.835 26.423 1.00 0.00 W20 H

ATOM 27402 H2 WAT 4178 21.173 4.237 26.554 1.00 0.00 W20 H

TER
ATOM 27403 O WAT 4585 23.340 3.428 25.621 1.00 0.00 W20 O

ATOM 27404 H1 WAT 4585 22.491 2.985 25.602 1.00 0.00 W20 H

ATOM 27405 H2 WAT 4585 23.826 2.999 26.325 1.00 0.00 W20 H

TER
ATOM 27406 O WAT 4966 22.359 0.555 27.001 1.00 0.00 W20 O

ATOM 27407 H1 WAT 4966 21.820 1.202 27.456 1.00 0.00 W20 H

ATOM 27408 H2 WAT 4966 22.554 -0.112 27.659 1.00 0.00 W20 H

TER
END
where "END" is how Protein Data Bank (pdb) files end. As these files are
extremely sensitive to formatting, can the script be modified to avoid these
extra lines? I have not yet tested whether the extra lines really cause
problems (that takes time, because I have to go to the big cluster), but in any
case they take up a lot of space in the shell window.
A nearly perfect script. Thank you
francesco
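
Both problems appear to come from the use of print in the quoted script. It
only writes to the shell, so the file on disk is never touched, and each line
read from the file already ends with a newline, to which print adds a second
one. A minimal sketch that writes to a separate output file instead, assuming
Python 3; the function name insert_ter and the output file name are
hypothetical and not part of the original script:

```python
def insert_ter(src_path, dst_path, marker="H2 WAT"):
    """Copy src_path to dst_path, adding a TER record after each marker line.

    The default marker is the string used in the quoted script; adjust it
    if the spacing between the atom and residue names differs in your file.
    """
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            # Each line read from the file already carries its own newline,
            # so normalize it to exactly one instead of adding another.
            dst.write(line.rstrip("\n") + "\n")
            if marker in line:
                dst.write("TER\n")
```

It could be called as insert_ter("atoms.txt", "atoms_ter.txt"); the input file
is left unchanged and the result can be inspected before it replaces the
original.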
>
> --
> Gabriel Genellina