
On 9/12/11 4:38 PM, Christopher Jordan-Squire wrote:
I did some timings to see what the advantage would be, in the simplest case possible, of taking multiple lines from the file to process at a time.
Nice work, only a minor comment:
f6 and f7 use stripped down versions of Chris Barker's accumulator idea. The difference is that f6 uses resize when expanding the array while f7 uses np.empty followed by np.append. This avoids the penalty from copying data that np.resize imposes.
I don't think it does:
"""
In [3]: np.append?
----------
arr : array_like
    Values are appended to a copy of this array.

Returns
-------
out : ndarray
    A copy of `arr` with `values` appended to `axis`. Note that `append`
    does not occur in-place: a new array is allocated and filled.
"""
There is no getting around the copying. However, I think resize() uses the OS memory re-allocate call, which may, in some instances, have over-allocated the memory in the first place, and thus not require a copy. So I'm pretty sure ndarray.resize is as good as it gets.
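To make the difference concrete, here is a minimal sketch (not code from the thread; the variable names are just for illustration). It shows that np.append hands back a new array and leaves its input alone, while ndarray.resize changes the array in place. Whether resize had to move the data underneath is up to the allocator, so the sketch only prints the data address before and after rather than asserting anything about it. refcheck=False is there only so the example also runs in interactive shells that hold extra references.

```python
import numpy as np

# np.append: the input is untouched and the result is a separate buffer.
a = np.arange(4)
b = np.append(a, [4, 5])
print(a.shape, b.shape)
print("append shares memory with input:", np.shares_memory(a, b))

# ndarray.resize: same array object, resized in place.
# New trailing elements are zero-filled by the method form of resize.
c = np.arange(4)
addr_before = c.ctypes.data
c.resize(8, refcheck=False)
addr_after = c.ctypes.data
print(c)
# The address may or may not change; that depends on the allocator.
print("data moved during resize:", addr_before != addr_after)
```

Note that the free function np.resize is a different beast: it returns a new array and fills extra space with repeated copies of the input instead of zeros.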
f6 : 3.26ms
f7 : 2.77ms (Apparently it's a lot cheaper to do np.empty followed by append than to do a resize)
Darn that profiling, proving my expectations wrong again! Though I'm really confused as to how that could be!
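For anyone who wants to poke at this themselves, below is a rough sketch of the two growth strategies. These are NOT the f6 and f7 from the timings above, which I don't have in front of me; fill_with_resize and fill_with_append are hypothetical stand-ins, and the doubling growth policy and the starting size of 16 are my assumptions. Both build the same array element by element, so any timing difference should come from how the buffer grows.

```python
import timeit
import numpy as np

def fill_with_resize(n, start=16):
    """Hypothetical accumulator: grow the buffer in place with ndarray.resize."""
    buf = np.empty(start)
    length = 0
    for x in range(n):
        if length == buf.shape[0]:
            # Double the capacity in place; existing values are preserved.
            buf.resize(2 * buf.shape[0], refcheck=False)
        buf[length] = x
        length += 1
    return buf[:length]

def fill_with_append(n, start=16):
    """Hypothetical accumulator: grow by appending a fresh np.empty block."""
    buf = np.empty(start)
    length = 0
    for x in range(n):
        if length == buf.shape[0]:
            # np.append allocates a new array and copies both pieces into it.
            buf = np.append(buf, np.empty(buf.shape[0]))
        buf[length] = x
        length += 1
    return buf[:length]

# Small, quick comparison; absolute numbers will vary by machine and NumPy version.
n = 10_000
t_resize = timeit.timeit(lambda: fill_with_resize(n), number=10)
t_append = timeit.timeit(lambda: fill_with_append(n), number=10)
print(f"resize: {t_resize:.4f} s   append: {t_append:.4f} s")
```

With doubling, either version only grows the buffer a handful of times, so I would expect the Python-level loop to dominate and the two numbers to be close; a difference like the one reported may depend on the growth policy used.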
-Chris