python reading file memory cost

Dan Stromberg drsalists at gmail.com
Tue Aug 2 01:01:46 EDT 2011


You could try forcing a garbage collection...
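
For what it's worth, a forced collection looks roughly like the sketch below. It only reclaims objects that are no longer reachable, so it will not shrink a list you are still holding on to. The name `data` just mirrors the one in your snippet, and the sample values are made up.

```python
import gc

# Hypothetical stand-in for the parsed blocks in the quoted snippet.
data = [["%2.6E" % float(x) for x in range(10)] for _ in range(1000)]

# Drop the only reference once the data is no longer needed ...
del data

# ... then ask the collector to run a full pass right away.
# gc.collect() returns the number of unreachable objects it found.
unreachable = gc.collect()
print("collector found %d unreachable objects" % unreachable)
```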

On Mon, Aug 1, 2011 at 8:22 PM, Tony Zhang <warriorlance at gmail.com> wrote:

> Thanks!
>
> Actually, I used .readline() to parse the file line by line, because I need
> to find the start position at which to begin extracting data into a list and
> the end point at which to stop, then repeat until the end of the file.
> My file is formatted like this:
>
> blabla...useless....
> useless...
>
> /sign/
> data block(e.g. 10 cols x 1000 rows)
> ...
> blank line
> /sign/
> data block(e.g. 10 cols x 1000 rows)
> ...
> blank line
> ...
> ...
> EOF
> Let's call this file 'myfile'.
> Here is my Python snippet:
>
> f = open('myfile', 'r')
> blocknum = 0  # number of data blocks read so far
> data = []
> while True:
>        # find the marker line that begins the next block; stop at EOF
>        line = f.readline()
>        while line and not line.startswith('/a1/'):
>                line = f.readline()
>        if not line: break
>        # create a nested list to store this data block
>        data.append([])
>        blocknum += 1
>        line = f.readline()
>
>        # a blank line (or EOF) marks the end of one block
>        while line.strip():
>                data[blocknum-1].append(["%2.6E" % float(x) for x in line.split()])
>                line = f.readline()
>        print("Read Block %d" % blocknum)
> f.close()
>
> The result is that reading a 500 MB file consumes almost 2 GB of RAM. I
> cannot figure out why. Can somebody help?
> Thanks very much!
>
> --Tony
>
>
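
Another thing that may help, sketched here under the assumption that each block can be processed and then thrown away: every formatted string such as "1.000000E+00" is a separate Python object that costs several times the 8 bytes a raw double needs, so keeping all of them in nested lists adds up quickly. The function name `iter_blocks` and the sample input are hypothetical, and the '/a1/' marker is taken from the snippet above.

```python
from array import array
import io

def iter_blocks(f, marker='/a1/'):
    """Yield one data block at a time as a list of array('d') rows."""
    block = None                      # None means "not inside a block"
    for line in f:
        if block is None:
            if line.startswith(marker):
                block = []            # marker line starts a new block
        elif line.strip():
            # store raw doubles (8 bytes each) instead of formatted strings
            block.append(array('d', [float(x) for x in line.split()]))
        else:
            yield block               # blank line ends the block
            block = None
    if block is not None:
        yield block                   # file ended without a trailing blank line

# Small in-memory sample standing in for 'myfile' (made-up values).
sample = io.StringIO(u"header\n\n/a1/\n1 2 3\n4 5 6\n\n/a1/\n7 8 9\n")
sizes = [(len(b), len(b[0])) for b in iter_blocks(sample)]
print(sizes)
```

With a real file you would loop over `iter_blocks(open('myfile'))` and finish with each block before asking for the next one, so only one block is in memory at a time.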