Re: [Numpy-discussion] Is there anyway to read raw binary file via pytable?
2010/7/28 Ken Watford email@example.com:
2010/7/28 脑关生命科学仪器 firstname.lastname@example.org:
It seems that PyTables only supports HDF5. I have some 500 GB numerical arrays to process. PyTables claims to have advanced features that speed up processing and greatly reduce the physical memory required. However, I do not want to touch my raw data, simply because I do not have the doubled disk space needed to convert all 10 TB of data into HDF5. Is there any way to let PyTables read raw binary files, or alternatively to package the raw files into HDF5 format without changing the files themselves?
HDF5 does support datasets with an external contiguous binary file as the storage area. For documentation on this, see: http://www.hdfgroup.org/HDF5/doc/UG/10_Datasets.html#Allocation
Thanks a lot! I found three functions in the H5P API that handle external raw data files:
H5Pset_external http://davis.lbl.gov/Manuals/HDF5-1.6.1/RM_H5P.html#Property-SetExternal
H5Pget_external_count http://davis.lbl.gov/Manuals/HDF5-1.6.1/RM_H5P.html#Property-GetExternalCoun...
H5Pget_external http://davis.lbl.gov/Manuals/HDF5-1.6.1/RM_H5P.html#Property-GetExternal
However, right now h5py does not provide these functions, though existing datasets created with external storage should work fine.
I need to use C, Fortran, or MATLAB to generate datasets with external data. So far it works fine.
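For readers who only need to read slices of a raw file in place, and do not need HDF5 at all, memory mapping is another way to avoid both the conversion and the memory cost. The following is a minimal sketch using only the Python standard library, not the approach taken in this thread. It assumes the file is a flat run of native-endian float64 values with no header; the helper names open_raw_float64 and demo are made up for illustration, and the small temporary file stands in for a real data file. NumPy's numpy.memmap applies the same idea with array semantics.

```python
import mmap
import os
import struct
import tempfile


def open_raw_float64(path):
    """Memory-map a headerless file of native-endian float64 values, read-only.

    Returns (file, mapping, view). The view can be indexed and sliced like a
    sequence of floats; the OS pages data in on demand, so the whole file is
    never loaded at once. (Assumed layout: flat float64, no header.)
    """
    f = open(path, "rb")
    mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    view = memoryview(mapping).cast("d")  # reinterpret bytes as doubles, no copy
    return f, mapping, view


def demo():
    # Write a small stand-in for a large raw file: 1000 doubles, 0.0, 0.5, 1.0, ...
    values = [0.5 * i for i in range(1000)]
    fd, path = tempfile.mkstemp(suffix=".raw")
    with os.fdopen(fd, "wb") as out:
        out.write(struct.pack("=1000d", *values))

    f, mapping, view = open_raw_float64(path)
    try:
        n = len(view)                 # number of float64 elements in the file
        total = sum(view[100:200])    # only this slice is actually read
    finally:
        # Release the view before closing the mapping it exports.
        view.release()
        mapping.close()
        f.close()
        os.remove(path)
    return n, total


if __name__ == "__main__":
    print(demo())
```

The original file is opened read-only and never modified, which matches the constraint in the question; the trade-off compared with an HDF5 dataset on external storage is that the shape and dtype have to be tracked by hand.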