It&#39;s also extremely surprising to me that reading a 1.8MB file is causing a memory error.  That&#39;s not a particularly large file, so if it is causing a memory error, there must be something wrong with your Python configuration or build.<div>
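<div>For what it&#39;s worth, readlines() builds a list of every line before the loop starts, whereas iterating over the file object hands you one line at a time. Here&#39;s a minimal sketch of taking just the first 500 lines that way, assuming Python 3. The name first_lines is made up for illustration, and the demo writes its own throwaway file rather than touching your data:</div>

```python
import itertools
import os
import tempfile


def first_lines(path, limit=500):
    """Return at most `limit` lines from the start of the file at `path`.

    Iterating over the file object is lazy, so the whole file is
    never held in memory as one big list of lines.
    """
    with open(path) as f:
        return [line.rstrip("\n") for line in itertools.islice(f, limit)]


# Demo on a throwaway file so the sketch runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".tsv", delete=False) as tmp:
    for n in range(1000):
        tmp.write("gene%d\t%d\n" % (n, n * 2))
    demo_path = tmp.name

lines = first_lines(demo_path)
os.remove(demo_path)
print(len(lines))
```

<div>This probably won&#39;t cure a broken Python install, but it does tell you whether the error follows readlines() or shows up on any read at all.</div>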

<br></div><div>Best,</div><div>Lucas<br><br><div class="gmail_quote">On Tue, May 17, 2011 at 12:26 PM, Lucas Wiman <span dir="ltr">&lt;<a href="mailto:lucas.wiman@gmail.com">lucas.wiman@gmail.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><br><br><div class="gmail_quote">On Tue, May 17, 2011 at 10:56 AM,  <span dir="ltr">&lt;<a href="mailto:baypiggies-request@python.org" target="_blank">baypiggies-request@python.org</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

I wish to read a large data file (file size is around 1.8 MB) and manipulate<br>

the data in this file. Just reading and writing the first 500 lines of this<br>

file is causing a problem. I wrote:<br>

<br>

fin = open(&#39;gene-GS00471-DNA_B01_1101_37-ASM.tsv&#39;)<br>

count = 0<br>

for i in fin.readlines():<br>

    print i<br>

    count += 1<br>

    if count &gt;= 500:<br>

        break<br>

<br>

and got this error msg:<br>

<br>

Traceback (most recent call last):<br>

  File<br>

&quot;H:\genome_4_omics_study\GS000003696-DID\GS00471-DNA_B01_1101_37-ASM\GS00471-DNA_B01\ASM\gene-GS00471-DNA_B01_1101_37-ASM.tsv\test.py&quot;,<br>

line 3, in &lt;module&gt;<br>

    for i in fin.readlines():<br>

MemoryError<br></blockquote><div><br></div><div>If your data is actually a TSV (tab-separated values) file, you should use the csv module to iterate over its rows.  Just set the delimiter to &#39;\t&#39;; the docs are at <a href="http://docs.python.org/library/csv.html" target="_blank">http://docs.python.org/library/csv.html</a></div>


<div><br></div><div>You should also generally use the &quot;with&quot; statement when dealing with files, since it closes the file object for you (probably not an issue when you&#39;re just reading from a single file, but it&#39;s best practice nonetheless).  Here&#39;s how I would deal with your situation:</div>


<div><br></div><div><div><font face="&#39;courier new&#39;, monospace">import csv</font></div><div><font face="&#39;courier new&#39;, monospace"><br></font></div><div><font face="&#39;courier new&#39;, monospace">with open(&#39;gene-GS00471-DNA_B01_1101_37-ASM.tsv&#39;, &#39;r&#39;) as f:</font></div>


<div><font face="&#39;courier new&#39;, monospace">    r = csv.reader(f, delimiter=&#39;\t&#39;)</font></div><div><font face="&#39;courier new&#39;, monospace">    for row in r:</font></div>

<div><font face="&#39;courier new&#39;, monospace">        # row is a list of strings that correspond to the columns in your file</font></div><div><font face="&#39;courier new&#39;, monospace">        do_stuff_with_the_row(row)</font></div>


<div><font face="&#39;courier new&#39;, monospace"># your file object f is now closed</font></div></div><div><font face="&#39;courier new&#39;, monospace"><br></font></div>
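<div>Since you only want the first 500 lines, the same idea can be capped with itertools.islice. This is just a rough sketch assuming Python 3 (hence newline=&#39;&#39; in open, as the Python 3 csv docs suggest). The helper name first_rows and the column names in the demo file are invented for illustration and are not taken from your data:</div>

```python
import csv
import itertools
import os
import tempfile


def first_rows(path, limit=500):
    """Return up to `limit` rows of a tab-separated file as lists of strings."""
    with open(path, newline="") as f:
        reader = csv.reader(f, delimiter="\t")
        # islice stops pulling rows from the reader once `limit` is reached.
        return list(itertools.islice(reader, limit))


# Demo on a small throwaway file; the columns are made up.
with tempfile.NamedTemporaryFile(
    "w", suffix=".tsv", delete=False, newline=""
) as tmp:
    writer = csv.writer(tmp, delimiter="\t")
    writer.writerow(["gene", "chromosome", "begin"])
    for n in range(1000):
        writer.writerow(["gene%d" % n, "chr1", n * 10])
    demo_path = tmp.name

rows = first_rows(demo_path)
os.remove(demo_path)
print(rows[0])
print(len(rows))
```

<div>Every value comes back as a string, so numeric columns still need an explicit int() or float() before you do arithmetic on them.</div>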

<div><font face="arial, helvetica, sans-serif">Best wishes,</font></div><div><font face="arial, helvetica, sans-serif">Lucas Wiman</font></div></div>

</blockquote></div><br></div>