[Numpy-discussion] record data previous to Numpy use

Chris Barker chris.barker at noaa.gov
Thu Jul 6 19:59:12 EDT 2017


On Thu, Jul 6, 2017 at 10:55 AM, <paul.carrico at free.fr> wrote:
>
> It's is just a reflexion, but for huge files one solution might be to
> split/write/build first the array in a dedicated file (2x o(n) iterations -
> one to identify the blocks size - additional one to get and write), and
> then to load it in memory and work with numpy -
>

I may have your use case confused, but if you have a huge file with
multiple "blocks" in it, there shouldn't be any problem with loading it in
one go -- start at the top of the file and load one block at a time
(accumulating in a list) -- then you only have the memory overhead issues
for one block at a time, should be no problem.

at this stage the dimension is known and some packages will be fast and
> more adapted (pandas or astropy as suggested).
>
pandas at least is designed to read variations of CSV files, not sure you
could use the optimized part to read an array out of part of an open file
from a particular point or not.

-CHB

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20170706/89740cef/attachment.html>


More information about the NumPy-Discussion mailing list