Thanks to all for their responses. A correction: as Simeon suspected, my data file is 1.8 GB, not 1.8 MB.<br><br><div class="gmail_quote">On Tue, May 17, 2011 at 1:41 PM, Simeon Franklin <span dir="ltr"><<a href="mailto:simeonf@gmail.com">simeonf@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div class="im">On Tue, May 17, 2011 at 10:17 AM, Vikram K <<a href="mailto:kpguy1975@gmail.com">kpguy1975@gmail.com</a>> wrote:<br>
> I wish to read a large data file (file size is around 1.8 MB) and manipulate<br>
> the data in this file. Just reading and writing the first 500 lines of this<br>
> file is causing a problem. I wrote:<br>
><br>
> fin = open('gene-GS00471-DNA_B01_1101_37-ASM.tsv')<br>
> count = 0<br>
> for i in fin.readlines():<br>
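> &nbsp;&nbsp;&nbsp;&nbsp;print i<br>
> &nbsp;&nbsp;&nbsp;&nbsp;count += 1<br>
> &nbsp;&nbsp;&nbsp;&nbsp;if count >= 500:<br>
> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;break<br>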
<br>
</div>You don't need the readlines call - the file object itself supports<br>
iteration over lines; readlines() is there if you specifically want to<br>
create a list containing all the lines in the file. Try it with<br>
<br>
for i in fin:<br>
<br>
instead of<br>
<div class="im"><br>
for i in fin.readlines():<br>
<br>
</div>and see whether that helps. Were you mistaken above, and is the file size 1.8 GB rather than<br>
1.8 MB? You shouldn't be having memory errors with a 1.8 MB file in a normal<br>
environment. If you are working with multi-gigabyte files, however,<br>
you should read David Beazley's awesome "Generator Tricks for Systems Programmers" tutorial<br>
(<a href="http://www.dabeaz.com/generators-uk/" target="_blank">http://www.dabeaz.com/generators-uk/</a>). I re-read it on a regular<br>
basis and always pick up something new...<br>
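<br>
As a minimal sketch of the advice above (written for Python 3, unlike the Python 2 snippet quoted earlier), the helper below iterates over the file object lazily and stops after a fixed number of lines, so the whole file is never held in memory. The name first_lines and its limit parameter are hypothetical, introduced only for illustration.<br>

```python
from itertools import islice


def first_lines(path, limit=500):
    """Yield at most `limit` lines from `path`, one at a time.

    `first_lines` and `limit` are illustrative names, not part of any library.
    """
    with open(path) as fin:
        # The file object is its own line iterator, so no readlines() call
        # is needed; islice stops pulling lines once `limit` is reached.
        for line in islice(fin, limit):
            # Drop the trailing newline so callers can print or split freely.
            yield line.rstrip("\n")
```

Something like for line in first_lines('gene-GS00471-DNA_B01_1101_37-ASM.tsv'): print(line) should then touch only the first 500 lines, however large the file is.<br>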
<br>
-regards<br>
<font color="#888888">Simeon Franklin<br>
</font></blockquote></div><br>