[Numpy-discussion] Is there any way to read raw binary files via PyTables?

Robert Kern robert.kern at gmail.com
Wed Jul 28 16:40:50 EDT 2010


2010/7/28 Ken Watford <kwatford+scipy at gmail.com>:
> 2010/7/28 脑关生命科学仪器 <braingateway at gmail.com>:
>> It seems that PyTables only supports HDF5. I have some 500GB numerical arrays
>> to process. PyTables claims to have some advanced features that improve
>> processing speed and greatly reduce physical memory requirements. However, I
>> do not want to touch the raw data I have, simply because I do not have double
>> the disk space needed to convert all 10TB of data into HDF5. Is there any way
>> to let PyTables read raw binary files, or alternatively to package raw files
>> into HDF5 format without changing the files themselves?
>>
>> Thanks
>>
>> Brain Gateway
>
> HDF5 does support datasets with an external contiguous binary file as
> the storage area. For documentation on this, see:
> http://www.hdfgroup.org/HDF5/doc/UG/10_Datasets.html#Allocation

This will not give him any of the speedups of PyTables he seeks. Those
optimizations come from the compression of the data on disk, not the
use of the HDF5 library.
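
To make the point about compression concrete, here is a rough sketch using
only the Python standard library. It is not PyTables code: the sample data,
the 64 KiB chunk size, and the choice of zlib are all assumptions made for
illustration. It only shows that compressible numerical data stored in
compressed chunks occupies far fewer bytes on disk, so there is less to read.

```python
import array
import zlib

# Hypothetical data: one million float64 samples following a repeating ramp,
# standing in for a compressible raw binary array.
samples = array.array("d", (float(i % 256) for i in range(1_000_000)))
raw = samples.tobytes()

# Compress in fixed-size chunks, roughly as a chunked store might.
# CHUNK_BYTES is an arbitrary value chosen for this sketch.
CHUNK_BYTES = 64 * 1024
chunks = [
    zlib.compress(raw[start:start + CHUNK_BYTES], 1)
    for start in range(0, len(raw), CHUNK_BYTES)
]
compressed_size = sum(len(chunk) for chunk in chunks)

# Decompress every chunk and check that the round trip is lossless.
restored = b"".join(zlib.decompress(chunk) for chunk in chunks)

print("raw bytes:       ", len(raw))
print("compressed bytes:", compressed_size)
print("lossless:        ", restored == raw)
```

How much real data shrinks depends entirely on its content. Noisy
measurements may compress poorly, in which case little would be gained.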

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


