[Numpy-discussion] reading *big* inhomogenous text matrices *fast*?
Zachary Pincus
zachary.pincus at yale.edu
Wed Aug 13 22:11:07 EDT 2008
> This is similar to what I tried originally! Unfortunately, repeatedly
> appending to a list seems to be very slow... I guess Python keeps
> reallocating and copying the list as it grows. (It would be nice to be
> able to tune the increments by which the list size increases.)
Robert's right, as ever -- repeated appending to a list is an
*extremely* common operation, which you see often in idiomatic python.
The implementation of list.append should be very fast, and smart about
pre-allocating as needed.
Try profiling the code just to make sure that it is the list append
that's slow, and not something else happening on that line.
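As a rough sketch of what that profiling could look like: the parser below is a hypothetical stand-in (parse_lines, profile_parse and the sample data are made up for illustration, not the code from this thread), and cProfile reports the list append separately from the string splitting done on the same line of work.

```python
import cProfile
import io
import pstats


def parse_lines(lines):
    """Hypothetical parser: split each line and append the ints to a list."""
    rows = []
    for line in lines:
        fields = line.split()
        rows.append([int(f) for f in fields])
    return rows


def profile_parse(lines):
    """Run parse_lines under cProfile and return (rows, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    rows = parse_lines(lines)
    profiler.disable()
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(10)
    return rows, buf.getvalue()


# Made-up sample input: 10000 lines of three integers each.
sample = ["%d %d %d" % (i, i + 1, i + 2) for i in range(10000)]
rows, report = profile_parse(sample)
print(report)
```

If the row for the list's append method accounts for only a small share of the total time, the bottleneck is probably elsewhere (the split or the int conversion, say).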
> I hope this recipe may prove useful to others. It would be nice if NumPy
> had a built-in facility for arrays that intelligently expand their
> allocation as they grow.
It appears to be the general consensus on this mailing list that the
best solution when an expandable array is required is to append to a
python list, and then once you've built it up completely, convert it
to an array. So I'm at least surprised that this is turning out to be
so slow for you... But if the profiler says that's where the trouble
is, then so it is...
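For reference, a minimal sketch of that append-then-convert pattern. The row contents and the name build_array are invented for illustration; the only point is that the array is created once, after the list is complete.

```python
import numpy as np


def build_array(n_rows):
    """Grow a plain Python list row by row, then convert it once."""
    rows = []
    for i in range(n_rows):
        # Made-up row contents; real code would parse these from a file.
        rows.append([i, 2 * i, 3 * i])
    # Single conversion at the end, instead of growing an array in the loop.
    return np.array(rows, dtype=int)


arr = build_array(1000)
print(arr.shape)
```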
>> Also you could see if:
>> cells[type].append(numpy.array([index, property]+nodes, dtype=int))
>>
>> is faster than what's above... it's worth testing.
>
> Repeatedly concatenating arrays with numpy.append or numpy.concatenate
> is also quite slow, unfortunately. :-(
Actually, my suggestion was to compare building up a list-of-lists and
then converting that to a 2d array versus building up a list-of-arrays
and then converting that to a 2d array... one might wind up being
faster or more memory-efficient than the other...
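Something along these lines could be used for the comparison. This is only a sketch: the records are made-up fixed-length tuples, the function names are hypothetical, and the timings will depend on the data and the NumPy version, so it makes no claim about which variant wins.

```python
import timeit

import numpy as np


def from_list_of_lists(records):
    """Append plain lists, convert to a 2d array at the end."""
    rows = []
    for rec in records:
        rows.append(list(rec))
    return np.array(rows, dtype=int)


def from_list_of_arrays(records):
    """Append small 1d arrays, convert to a 2d array at the end."""
    rows = []
    for rec in records:
        rows.append(np.array(rec, dtype=int))
    return np.array(rows)


# Made-up data: 2000 records of four integers each.
records = [(i, i + 1, i + 2, i + 3) for i in range(2000)]

a = from_list_of_lists(records)
b = from_list_of_arrays(records)

t_lists = timeit.timeit(lambda: from_list_of_lists(records), number=3)
t_arrays = timeit.timeit(lambda: from_list_of_arrays(records), number=3)
print("list-of-lists:  %.4f s" % t_lists)
print("list-of-arrays: %.4f s" % t_arrays)
```

Both functions should produce the same 2d array, so only the build time (and memory use, which this sketch does not measure) differs.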
Zach