[Numpy-discussion] iteration slowing, no increase in memory

Chad Netzer chad.netzer at gmail.com
Thu Sep 10 20:32:28 EDT 2009


On Thu, Sep 10, 2009 at 10:03 AM, John [H2O] <washakie at gmail.com> wrote:

> It runs very well for the first few iterations, but then slows tremendously
> - there is nothing significantly different about the files or directory in
> which it slows. I've monitored the memory use, and it is not increasing.

Memory use by itself is not a good indicator, as modern operating
systems (Linux, Windows, Mac, etc.) generally use all available free
memory as a disk cache.  So the system's memory use may remain quite
steady while old data is flushed and new data is paged in.  The first
few iterations could be "fast" if those files are already cached in
memory, although in that case the behavior should change on repeated
runs.

If you reboot and then immediately run the script, is it slow on all
directories?  Or, if you can't reboot, can you at least remount the
filesystem (which should flush all the cached data and metadata)?  On
recent Linux kernels you can also drop the caches directly:

http://linux-mm.org/Drop_Caches
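
For example, here is a minimal sketch of dropping the caches from
Python (this assumes a recent Linux kernel and root privileges; the
drop_caches interface is the one described at the link above):

    import subprocess

    # Flush dirty pages to disk first, so dropping the caches is safe.
    subprocess.call(["sync"])

    # Writing "3" frees the page cache plus the dentry and inode caches.
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")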

Are other operations slow or fast on the different directories, such
as tar'ing them up or running "du -s"?  Can you verify the integrity
of the drive with SMART tools?  If it's Linux, can you get data on the
actual disk device I/O (using "iostat" or "vmstat")?

Or you could test by iterating over the same directory repeatedly; it
should be fast after the first iteration.  Then move to a "problem"
directory and see whether only the first iteration is slow, or all of
them are (a quick timing sketch follows).
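
Something along these lines would do it (an untested sketch; the
directory path is a placeholder you'd replace with one of your own):

    import os
    import time

    def time_stat_pass(path):
        # Stat every entry in 'path' and return the elapsed seconds.
        start = time.time()
        for name in os.listdir(path):
            os.stat(os.path.join(path, name))
        return time.time() - start

    # The first pass is slow if the metadata is cold; later passes
    # should be fast once it is cached.
    for i in range(3):
        print("pass %d: %.3f seconds" % (i, time_stat_pass("/path/to/dir")))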

-C


