Hello all,
I am finding that packing numpy arrays directly into binary using the
tostring and fromstring methods does not provide a speed improvement over
writing the same arrays to ASCII files. Obviously, the resulting files are
far smaller, but I was hoping for faster writes as well. I did get that
speed improvement by using the struct module directly, or by using generic
Python arrays. Let me describe my data layout further, as it may bear
directly on any solution you might have.
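For reference, here is a minimal sketch (my own construction, not my actual code) of the kind of comparison I mean: writing the same doubles as formatted text versus as raw binary through the stdlib array module, which is one of the approaches that did give me a speedup.

```python
# Sketch: text output vs. raw binary output of the same doubles.
# Uses in-memory buffers so it is self-contained; real code would
# write to files instead.
import array
import io

values = array.array('d', [float(i) for i in range(1000)])

# Text version: one formatted number per line.
text_buf = io.StringIO()
for v in values:
    text_buf.write('%r\n' % v)

# Binary version: the raw machine representation in a single call.
bin_buf = io.BytesIO()
bin_buf.write(values.tobytes())

# The binary form is compact: exactly 8 bytes per double.
print(len(bin_buf.getvalue()))
```

(In Python 3 the array method is tobytes; it is the same operation as the older tostring.)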
My output file is heterogeneous: each line is an array of either integers
or floats. Each record is made up of three entries, and together the
records serve as a sparse representation of a large matrix:
1) row, n (both integers)
2) array of integers of length n, representing columns
3) array of floats of length n, representing values
Here, “n” is not constant across the records, so
many of the database structures I have looked at do not apply. Any suggestions
would be greatly appreciated.
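To make the record layout concrete, here is a rough sketch (function names are my own invention) of how such variable-length records could be packed and unpacked with the stdlib struct and array modules: a fixed-size (row, n) header, followed by the n column indices and n values.

```python
import array
import io
import struct

HEADER = struct.Struct('<ii')  # row, n as little-endian C ints

def write_record(f, row, cols, vals):
    """Write one (row, n) header, then n column indices and n values."""
    assert len(cols) == len(vals)
    f.write(HEADER.pack(row, len(cols)))
    f.write(array.array('i', cols).tobytes())
    f.write(array.array('d', vals).tobytes())

def read_record(f):
    """Read one record back; returns (row, cols, vals) or None at EOF."""
    hdr = f.read(HEADER.size)
    if not hdr:
        return None
    row, n = HEADER.unpack(hdr)
    cols = array.array('i')
    cols.frombytes(f.read(n * cols.itemsize))
    vals = array.array('d')
    vals.frombytes(f.read(n * vals.itemsize))
    return row, list(cols), list(vals)

# Round-trip two records of different lengths (n is not constant).
buf = io.BytesIO()
write_record(buf, 0, [1, 5, 9], [0.1, 0.5, 0.9])
write_record(buf, 1, [2, 7], [0.2, 0.7])
buf.seek(0)
print(read_record(buf))  # (0, [1, 5, 9], [0.1, 0.5, 0.9])
```

Because n is stored in each header, records of different lengths can live in the same file without any fixed-width database structure.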
Mark Janikas