Python garbage collector/memory manager behaving strangely

Robert Miles robertmiles at teranews.com
Thu Nov 1 01:40:55 CET 2012


On 9/16/2012 9:12 PM, Dave Angel wrote:
> On 09/16/2012 09:07 PM, Jadhav, Alok wrote:
>> Hi Everyone,
>>
>>
>>
>> I have a simple program which reads a large file containing few million
>> rows, parses each row (`numpy array`) and converts into an array of
>> doubles (`python array`) and later writes into an `hdf5 file`. I repeat
>> this loop for multiple days. After reading each file, i delete all the
>> objects and call garbage collector.  When I run the program, First day
>> is parsed without any error but on the second day i get `MemoryError`. I
>> monitored the memory usage of my program, during first day of parsing,
>> memory usage is around **1.5 GB**. When the first day parsing is
>> finished, memory usage goes down to **50 MB**. Now when 2nd day starts
>> and i try to read the lines from the file I get `MemoryError`. Following
>> is the output of the program.

Is it a 32-bit program?  If so, expect the maximum amount of memory it
can use to hold the program, its current dataspace, and images of all
the files it has open to be somewhere between 2 GB and 4 GB, depending
on the operating system and how the program was built, even if it is
running on a 64-bit computer with over 4 GB of memory.  32-bit
addresses can refer to at most 4 GB (2**32 bytes) of memory, and part
of that 4 GB is normally reserved for whatever the operating system
needs for running 32-bit programs.  On Windows, unless the executable
was built as large-address-aware, only 2 GB can be used for the
program; the other 2 GB is reserved for the operating system.
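
One quick way to answer the 32-bit question from inside Python itself,
as a minimal sketch: the helper name pointer_bits is just something I
made up for illustration, while struct.calcsize and sys.maxsize are
standard library.

```python
import struct
import sys

def pointer_bits():
    """Return the pointer size of the running interpreter, in bits."""
    # "P" is the struct format code for a native pointer (void *).
    return struct.calcsize("P") * 8

bits = pointer_bits()

# Cross-check: sys.maxsize exceeds 2**32 only on a 64-bit build.
is_64bit = sys.maxsize > 2**32

print("This Python is a %d-bit build" % bits)
```

Note that this reports how the interpreter was built, not what the
machine or the operating system can do; a 32-bit Python on a 64-bit
computer still reports 32.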

How practical would it be to have that program run twice a day?
The first time, it should ignore all the data for the second half
of the day; the second time, it should ignore all the data for the
first half of the day.
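
A rough sketch of that idea follows.  The row format ("HH:MM:SS,value"),
the noon cutoff, and the names rows_for_pass and before_noon are all
assumptions of mine, since I don't know what your files actually look
like; substitute whatever test tells the two halves of your day apart.

```python
def rows_for_pass(lines, pass_number, in_first_half):
    """Yield only the rows that belong to the requested pass (0 or 1).

    Rows are filtered one at a time, so the rows belonging to the
    other half of the day are never collected in memory.
    """
    for line in lines:
        if in_first_half(line) == (pass_number == 0):
            yield line

def before_noon(line):
    # Hypothetical row format: "HH:MM:SS,value".
    return line[:2] < "12"

sample = ["09:30:00,1.5", "11:59:59,2.0", "12:00:00,2.5", "15:45:10,3.0"]

first_half = list(rows_for_pass(sample, 0, before_noon))
second_half = list(rows_for_pass(sample, 1, before_noon))
```

In real use, lines would be the open file object rather than a list,
and each pass would probably be a separate run of the program, so that
the second run starts with a fresh address space.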


