It isn't a standardized format. It is the output of a Fortran hydrodynamic circulation model called SELFE. The output files are Fortran binaries. I could probably cycle through the files and convert them to NetCDF one by one with a Python script, but it would be quicker and more space-efficient if I could use the original outputs directly. Thanks, - dharhas
"Sebastian Haase" <haase@msg.ucsf.edu> 8/7/2008 11:00 AM >>> On Thu, Aug 7, 2008 at 3:19 PM, Dharhas Pothina <Dharhas.Pothina@twdb.state.tx.us> wrote: Hi,
I've been following the thread on 'partially reading a file' with some interest and have a related question.
So I have a series of large binary data files (1_data.dat, 2_data.dat, etc.) that represent a 3D time series of data. Right now I am cycling through all the files, reading the entire dataset into memory and extracting the subset I need. This works but is extremely memory hungry and slow, and I'm running out of memory for datasets more than a year long. I could calculate which few files contain the data I need and read only those, but that is a bit cumbersome and also doesn't help if I need a 1D or 2D slice of the whole time period.
In the other thread Travis gave an example of using memmap to map a file to memory. Can I do this with multiple files, i.e. use memmap to generate an array[x,y,z,t] that I can then slice to read only what I need? Another complication is that each binary file has a header section and then a data section. By reading the first file I can calculate the offset for the data part of the file.
Hi dharhas, yes, you can do all these things. I'm doing this for 3D and 4D image files. What file format are you interested in? I use MRC files ... Cheers, Sebastian Haase
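For reference, the idea discussed in this thread (one memmap per file, skipping the header, then slicing across all of them) might look roughly like the sketch below. It is not the SELFE format: the header size, the float32 dtype, the (time, x, y, z) C-order layout, and the helper names map_files and extract are all assumptions made for illustration, and the demo files are synthetic. Fortran unformatted sequential output usually carries record markers around each record, which this sketch ignores.

```python
import os
import tempfile

import numpy as np


def map_files(paths, header_bytes, step_shape, dtype=np.float32):
    """Return one read-only memmap per file, skipping the header.

    Assumption: every file has the same header size, and its data section
    is a plain block of `dtype` values shaped (steps_in_file,) + step_shape.
    """
    step_bytes = np.dtype(dtype).itemsize * int(np.prod(step_shape))
    maps = []
    for path in paths:
        # Work out how many time steps this file holds from its size.
        n_steps = (os.path.getsize(path) - header_bytes) // step_bytes
        maps.append(np.memmap(path, dtype=dtype, mode="r",
                              offset=header_bytes,
                              shape=(n_steps,) + tuple(step_shape)))
    return maps


def extract(maps, index):
    """Apply the same spatial index (a tuple) to every file, join along time.

    Only the parts of each file touched by `index` are read from disk.
    """
    parts = [np.asarray(m[(slice(None),) + tuple(index)]) for m in maps]
    return np.concatenate(parts, axis=0)


# --- synthetic demo: two files, 3 time steps each, 16-byte dummy header ---
tmpdir = tempfile.mkdtemp()
full = np.arange(6 * 2 * 3 * 4, dtype=np.float32).reshape(6, 2, 3, 4)
paths = []
for i, chunk in enumerate(np.split(full, 2, axis=0), start=1):
    path = os.path.join(tmpdir, "%d_data.dat" % i)
    with open(path, "wb") as f:
        f.write(b"\0" * 16)          # stand-in for the real header
        f.write(chunk.tobytes())     # raw float32 data, C order
    paths.append(path)

maps = map_files(paths, header_bytes=16, step_shape=(2, 3, 4))
series = extract(maps, (0, 1, 2))    # time series at one grid point
```

If the data were stored in Fortran (column-major) order as array[x,y,z,t], np.memmap also accepts order="F" with the shape given as (x, y, z, t); whether that matches the real files would need to be checked against the model's output routine.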