Hi,<br>I would suggest using the Biopython package. Its PDB parser lets you extract specific information such as atom name, residue, chain, etc.<br>Bala<br><br><div class="gmail_quote">On Wed, May 9, 2012 at 3:19 AM, Jerry Hill <span dir="ltr">&lt;<a href="mailto:malaclypse2@gmail.com" target="_blank">malaclypse2@gmail.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">On Tue, May 8, 2012 at 4:00 PM, Spyros Charonis &lt;<a href="mailto:s.charonis@gmail.com">s.charonis@gmail.com</a>&gt; wrote:<br>


</div><div class="im">&gt; Hello python community,<br>

&gt;<br>

&gt; I&#39;m having a small issue with list indexing. I am extracting certain<br>

&gt; information from a PDB (protein information) file and need certain fields of<br>

&gt; the file to be copied into a list. The entries look like this:<br>

&gt;<br>

&gt; ATOM   1512  N   VAL A 222       8.544  -7.133  25.697  1.00 48.89           N<br>

&gt; ATOM   1513  CA  VAL A 222       8.251  -6.190  24.619  1.00 48.64           C<br>

&gt; ATOM   1514  C   VAL A 222       9.528  -5.762  23.898  1.00 48.32           C<br>

&gt;<br>

&gt; I am using the following syntax to parse these lines into a list:<br>

</div>...<br>

<div class="im">&gt; charged_res_coord.append(atom_coord[i].split()[1:9])<br>

<br>

</div>You&#39;re using split, which assumes there is always whitespace between<br>

your fields.  That&#39;s not guaranteed, though.  PDB is a fixed-width record<br>

format, according to the documentation I found here:<br>

<a href="http://www.wwpdb.org/docs.html" target="_blank">http://www.wwpdb.org/docs.html</a><br>

<br>
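For example, here is a small sketch of what split() does with two ATOM<br>
records (the variable names are only illustrative).  In the second record<br>
the chain ID and a four-digit residue number occupy adjacent columns, so<br>
they come back as a single token:<br>

```python
# Two ATOM records in the fixed PDB column layout. In the second one the
# chain ID "A" is immediately followed by the residue number "1005".
line_a = "ATOM   1512  N   VAL A 222       8.544  -7.133  25.697  1.00 48.89           N"
line_b = "ATOM   1617  N   GLU A1005      11.906  -2.722   7.994  1.00 44.02           N"

fields_a = line_a.split()
fields_b = line_b.split()

# Chain and residue number are separate tokens here.
print(len(fields_a), fields_a[4:6])  # 12 ['A', '222']

# Here they are fused into one token, so every later field shifts left.
print(len(fields_b), fields_b[4:6])  # 11 ['A1005', '11.906']
```

With a slice such as [1:9] on the split result, the second record would<br>
silently put the x coordinate where the residue number is expected.<br>

<br>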

If you just have a couple of items to pull out, you can just slice the<br>

string at the appropriate places.  Based on those docs, you could pull<br>

the x, y, and z coordinates out like this:<br>

<br>

<br>

x_coord = atom_line[30:38]<br>

y_coord = atom_line[38:46]<br>

z_coord = atom_line[46:54]<br>

<br>
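Those slices come back as strings padded with spaces.  A minimal sketch,<br>
assuming you want numbers rather than text, is to pass each slice to<br>
float(), which ignores the surrounding whitespace:<br>

```python
# One ATOM record in the fixed PDB column layout.
atom_line = "ATOM   1512  N   VAL A 222       8.544  -7.133  25.697  1.00 48.89           N"

# Each coordinate occupies eight columns; float() tolerates the padding.
x_coord = float(atom_line[30:38])
y_coord = float(atom_line[38:46])
z_coord = float(atom_line[46:54])

print((x_coord, y_coord, z_coord))  # (8.544, -7.133, 25.697)
```

<br>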

If you need to pull more of the data out, or you may want to reuse<br>

this code in the future, it might be worth actually parsing the record<br>

into all its parts.  For a fixed length record, I usually do something<br>

like this:<br>

<br>

pdbdata = &quot;&quot;&quot;<br>

<div class="im">ATOM   1512  N   VAL A 222       8.544  -7.133  25.697  1.00 48.89           N<br>

ATOM   1513  CA  VAL A 222       8.251  -6.190  24.619  1.00 48.64           C<br>

ATOM   1514  C   VAL A 222       9.528  -5.762  23.898  1.00 48.32           C<br>

</div><div class="im">ATOM   1617  N   GLU A1005      11.906  -2.722   7.994  1.00 44.02           N<br>

</div>&quot;&quot;&quot;.splitlines()<br>

<br>

atom_field_spec = [<br>

    slice(0,6),<br>

    slice(6,11),<br>

    slice(12,16),<br>

    slice(16,17),<br>

    slice(17,20),<br>

    slice(21,22),<br>

    slice(22,26),<br>

    slice(26,27),<br>

    slice(30,38),<br>

    slice(38,46),<br>

    slice(46,54),<br>

    slice(54,60),<br>

    slice(60,66),<br>

    slice(76,78),<br>

    slice(78,80),<br>

    ]<br>

<br>

for line in pdbdata:<br>

    if line.startswith(&#39;ATOM&#39;):<br>

        data = [line[field_spec] for field_spec in atom_field_spec]<br>

        print(data)<br>

<br>

<br>

You can build all kinds of fancy data structures on top of that if you<br>

want to.  You could use that extracted data to build a namedtuple for<br>

convenient access to the data by names instead of indexes into a list,<br>

or to create instances of a custom class with whatever functionality<br>

you need.<br>
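
<br>

As a rough sketch of the namedtuple idea (the Atom type, the field names<br>
and the parse_atom helper are all made up for this example, and only some<br>
of the columns are included):<br>

```python
from collections import namedtuple

# Hypothetical field names for this sketch, each paired with its column range.
ATOM_FIELDS = [
    ("record", slice(0, 6)),
    ("serial", slice(6, 11)),
    ("name", slice(12, 16)),
    ("res_name", slice(17, 20)),
    ("chain", slice(21, 22)),
    ("res_seq", slice(22, 26)),
    ("x", slice(30, 38)),
    ("y", slice(38, 46)),
    ("z", slice(46, 54)),
]

Atom = namedtuple("Atom", [field_name for field_name, _ in ATOM_FIELDS])


def parse_atom(line):
    """Slice one ATOM line into an Atom; values are kept as stripped strings."""
    return Atom(*(line[columns].strip() for _, columns in ATOM_FIELDS))


atom = parse_atom(
    "ATOM   1617  N   GLU A1005      11.906  -2.722   7.994  1.00 44.02           N"
)
print(atom.chain, atom.res_seq, float(atom.x))  # A 1005 11.906
```

Keeping the values as strings leaves it up to the caller to decide which<br>
fields to convert with int() or float().<br>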

<span class="HOEnZb"><font color="#888888"><br>

--<br>

Jerry<br>

</font></span><div class="HOEnZb"><div class="h5">

</div></div></blockquote></div><br><br clear="all"><br>-- <br>C. Balasubramanian<br><br>