Re: [Numpy-discussion] striding through arbitrarily large files
At 12:11 PM 2/5/2014, Richard Hattersley wrote:
On 4 February 2014 15:01, RayS <rays@blue-cove.com> wrote: I was struggling with methods of reading large disk files into numpy efficiently (not FITS or .npy, just raw files of IEEE floats from numpy.tostring()). When loading arbitrarily large files it would be nice not to read more than the plot can display before zooming in. There apparently are no built-in methods that allow skipping/striding...
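
(For reference, one way to get that skipping/striding behaviour with plain numpy is to memory-map the raw file and slice with a step, so only the touched pages are pulled from disk. A minimal sketch, assuming a flat file of float32 values; the file name and decimation target are illustrative:)

    import numpy as np

    fname = 'data.bin'  # hypothetical raw file of IEEE floats from tostring()
    data = np.memmap(fname, dtype=np.float32, mode='r')

    # Decimate down to roughly 4096 points for plotting; slicing a
    # memmap returns a view, and np.asarray only reads the pages
    # containing the strided samples.
    step = max(1, data.size // 4096)
    preview = np.asarray(data[::step])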
Since you mentioned the plural "files", are your datasets entirely contained within a single file? If not, you might be interested in Biggus (https://pypi.python.org/pypi/Biggus). It's a small pure-Python module that lets you "glue together" arrays (such as those from smmap) into a single arbitrarily large virtual array. You can then step over the virtual array and it maps the requests back to the underlying sources.
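
(A sketch of what that gluing might look like, following Biggus's sample usage; the file names, dtype, axis=0 concatenation, and stepped slicing are assumptions here, not anything confirmed in the thread:)

    import numpy as np
    import biggus

    fnames = ['run1.bin', 'run2.bin', 'run3.bin']  # hypothetical per-trial files
    tiles = [biggus.NumpyArrayAdapter(np.memmap(f, dtype=np.float32, mode='r'))
             for f in fnames]

    # Concatenate the per-file memmaps into one virtual 1-D array.
    virtual = biggus.LinearMosaic(tiles, axis=0)

    # Striding the virtual array maps back to the underlying memmaps;
    # ndarray() realises just the requested samples as a plain ndarray.
    preview = virtual[::100000].ndarray()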
Richard
Ooh, that might help; they are individual GB files from medical trial studies. I see there are some examples about:
https://github.com/SciTools/biggus/wiki/Sample-usage
http://nbviewer.ipython.org/gist/pelson/6139282
Thanks!