Hello all,

 

I am finding that packing numpy arrays directly into binary with the tostring and fromstring methods does not provide a speed improvement over writing the same arrays to ASCII files.  The resulting files are obviously far smaller, but I was hoping for faster writes as well.  I did get that speed improvement by using the struct module directly, or by using generic Python arrays.  Let me describe my setup in more detail, since it may bear directly on any solution you might have.

 

My output file is heterogeneous.  Each line is either an array of integers or an array of floats, and each record is made up of three entries.

Together, the records serve as a sparse representation of a large matrix.

 

1) row, n (both integers)

2) array of integers of length n, representing columns

3) array of floats of length n, representing values

Here, “n” is not constant across the records, so many of the database structures I have looked at do not apply.  Any suggestions would be greatly appreciated.
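
To make the layout concrete, here is a minimal sketch of how one such variable-length record could be written and read back, using the struct module for the (row, n) header and numpy for the two arrays. The 32-bit integer and 64-bit float widths, the little-endian byte order, and the names HEADER, write_record and read_records are all assumptions made for illustration. It uses tobytes and frombuffer, which are the current numpy names for what tostring and binary-mode fromstring do.

```python
import struct

import numpy as np

# Assumed header layout: row and n as two little-endian 32-bit integers.
HEADER = struct.Struct("<ii")


def write_record(f, row, cols, vals):
    """Append one record (header, column indices, values) to binary file f."""
    cols = np.ascontiguousarray(cols, dtype="<i4")
    vals = np.ascontiguousarray(vals, dtype="<f8")
    if cols.ndim != 1 or cols.shape != vals.shape:
        raise ValueError("cols and vals must be 1-D arrays of equal length")
    f.write(HEADER.pack(row, cols.size))
    f.write(cols.tobytes())
    f.write(vals.tobytes())


def read_records(f):
    """Yield (row, cols, vals) tuples until the end of binary file f."""
    while True:
        header = f.read(HEADER.size)
        if not header:
            break  # clean end of file
        row, n = HEADER.unpack(header)
        # n comes from the header, so it can differ from record to record.
        cols = np.frombuffer(f.read(4 * n), dtype="<i4")
        vals = np.frombuffer(f.read(8 * n), dtype="<f8")
        yield row, cols, vals
```

Because n is stored in each header, the reader never needs a fixed record size. The file object f is expected to be opened in binary mode ("wb" for writing, "rb" for reading).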

Mark Janikas