With regard to the arguments about holding onto large arrays, I would like to emphasize that my original suggestion mentioned weakref'ed numpy arrays. Essentially, the idea is to claw back only the raw memory blocks during the limbo period between discarding the numpy array Python object and when Python garbage-collects it.
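A rough Python-level sketch of the hook point I mean (tracked_empty and block_cache are made-up names, and a real implementation would have to live in numpy's C allocator, where the raw data pointer can actually be kept after the Python object dies):

    import weakref
    import numpy as np

    # Hypothetical free-list keyed by block size; here it only counts
    # reclaimable blocks, since pure Python cannot retain the raw buffer.
    block_cache = {}

    def _on_collect(nbytes):
        # Fires when Python garbage-collects the array object -- the
        # "limbo -> collected" moment where the block could be clawed back.
        block_cache[nbytes] = block_cache.get(nbytes, 0) + 1

    def tracked_empty(shape, dtype=float):
        arr = np.empty(shape, dtype=dtype)
        weakref.finalize(arr, _on_collect, arr.nbytes)
        return arr

    a = tracked_empty((1000, 1000))
    del a  # object discarded; the finalizer fires once it is collected
    print(block_cache)  # {8000000: 1}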
Ben Root

On Mon, Oct 3, 2016 at 2:43 PM, Julian Taylor <jtaylor.debian@googlemail.com> wrote:
On 03.10.2016 20:23, Chris Barker wrote:
On Mon, Oct 3, 2016 at 3:16 AM, Julian Taylor <jtaylor.debian@googlemail.com> wrote:
The problem with this approach is that we don't really want numpy holding on to hundreds of megabytes of memory by default, so it would need to be a user option.
Indeed -- but one could make the LRU cache very small (limiting the number of items, not the memory), so that it gets used within expressions but does not hold on to much outside of expressions. Something along these lines is sketched below.
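A rough Python-level sketch of such an item-count-bounded cache (cached_empty and release are hypothetical names, and a real version would sit inside numpy's allocator rather than at this level):

    from collections import OrderedDict
    import numpy as np

    MAX_ITEMS = 4              # cap the number of cached blocks, not their size
    _cache = OrderedDict()     # maps nbytes -> flat uint8 buffer

    def cached_empty(shape, dtype=np.float64):
        nbytes = int(np.prod(shape)) * np.dtype(dtype).itemsize
        buf = _cache.pop(nbytes, None)          # reuse a block of the right size
        if buf is None:
            buf = np.empty(nbytes, dtype=np.uint8)
        return buf.view(dtype).reshape(shape)

    def release(arr):
        # Caller promises not to touch arr afterwards (assumes a contiguous
        # array); its block goes back into the cache, and the
        # least-recently-used block is evicted once the item cap is exceeded.
        buf = arr.reshape(-1).view(np.uint8)
        _cache[buf.nbytes] = buf
        _cache.move_to_end(buf.nbytes)
        while len(_cache) > MAX_ITEMS:
            _cache.popitem(last=False)

A few-item cache like this can recycle the big temporaries inside an expression while only ever pinning MAX_ITEMS blocks once the expression is done.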
numpy doesn't see the whole expression, so we can't really do much. (Technically we could in Python 3.5 by using PEP 523, but that would be a larger undertaking.)
However, is the allocation the only (or even the biggest) source of the performance hit?
For large arrays the allocation itself is insignificant. What does cost some time is faulting the memory into the process, which implies writing zeros into the pages (a page at a time, as they are first used). By keeping memory blocks around inside numpy we would save this portion. This is really the job of the libc, but libc allocators are usually tuned for general-purpose workloads and thus tend to give memory back to the system much earlier than numerical workloads would like.
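You can see this with something as crude as the following (numbers depend on the OS and libc; the 4096-byte stride just matches a typical page size):

    import time
    import numpy as np

    N = 200 * 1024 * 1024                  # ~200 MB of fresh pages

    t0 = time.perf_counter()
    a = np.empty(N, dtype=np.uint8)        # allocation only (cheap)
    t1 = time.perf_counter()
    a[::4096] = 1                          # first touch: faults + zeroes each page
    t2 = time.perf_counter()
    a[::4096] = 2                          # second touch: pages already resident
    t3 = time.perf_counter()

    print("allocate     %.4f s" % (t1 - t0))
    print("first touch  %.4f s" % (t2 - t1))
    print("touch again  %.4f s" % (t3 - t2))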
Note that numpy already has a small memory-block cache, but it's only used for very small arrays, where the allocation cost itself is significant; it is limited to a couple of megabytes at most.