shoehorn c-structured data into Numpy

Helmut Fritz spam.helmut.hard at gmail.com
Sun Jun 14 17:42:23 CEST 2009


Hello there everyone. I used to be on this list a long time ago, but I got so
much spam that I gave up.

But that strategy has now come a little unstuck.  I have binary output from a
Fortran program: a big-endian, C-structured binary file.  The output varies a
great deal, and different option combinations produce different orderings in
the file, so I'd like to keep the header-reading in Python.

Anyhoo, I've so far been able to read the output with the struct module.
But my question is: how do I create numpy arrays from the bits of the file
I want?

So far I've been able to scan through to the relevant sections, and I've
tried all manner of idiotic combinations...

The floats are 4 bytes (single precision), and it's an unstructured grid
from a finite difference scheme, so I know the number of cells (ncells) for
the property I'm looking to extract.

So I've tried:
TC1 = np.frombuffer(struct.unpack(">%df" % ncells, data.read(4*ncells))[0],
dtype=float)
Only to get a very logical:
>>> Traceback (most recent call last):
>>>   File "a2o.py", line 466, in <module>
>>>     runme(me)
>>>   File "a2o.py", line 438, in runme
>>>     me.spmapdat(data)
>>>   File "a2o.py", line 239, in spmapdat
>>>     TC1 = np.frombuffer(struct.unpack(">%df" % ncells,
data.read(4*ncells))[0], dtype=float)
>>> AttributeError: 'float' object has no attribute '__buffer__'
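Which, on reflection, makes sense: struct.unpack returns a tuple of plain Python floats, so the [0] hands frombuffer a single float rather than anything buffer-like. If I wanted to go via struct at all, I suspect np.array on the whole tuple would do it (a sketch with made-up data, not my real code):

```python
import struct
import numpy as np

ncells = 4
raw = struct.pack(">%df" % ncells, 1.0, 2.0, 3.0, 4.0)

# unpack gives a tuple of Python floats; np.array accepts that directly,
# whereas np.frombuffer needs a bytes-like object.
TC1 = np.array(struct.unpack(">%df" % ncells, raw), dtype=np.float32)
print(TC1)
```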

ok... so I'll feed frombuffer my data file...

And then tried:
TC1 = np.frombuffer(data.read(4*ncells), dtype=float, count=ncells)
>>> Traceback (most recent call last):
>>>   File "a2o.py", line 466, in <module>
>>>     runme(me)
>>>   File "a2o.py", line 438, in runme
>>>     me.spmapdat(data)
>>>   File "a2o.py", line 240, in spmapdat
>>>     TC1 = np.frombuffer(data.read(4*ncells), dtype=float, count=ncells)
>>> ValueError: buffer is smaller than requested size

And THEN I tried:
TC1 = np.frombuffer(data.read(4*ncells), dtype=float, count=4*ncells)
>>> Traceback (most recent call last):
>>>   File "a2o.py", line 466, in <module>
>>>     runme(me)
>>>   File "a2o.py", line 438, in runme
>>>     me.spmapdat(data)
>>>   File "a2o.py", line 240, in spmapdat
>>>     TC1 = np.frombuffer(data.read(4*ncells), dtype=float,
count=4*ncells)
>>> ValueError: buffer is smaller than requested size

But it's the right size - honest.
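Unless, of course, numpy's default float isn't 4 bytes at all: dtype=float means float64, so ncells of those need 8*ncells bytes, and my 4*ncells-byte read really would be too small. If that's the story, a big-endian 4-byte dtype should do it. A sketch, assuming the section really is plain '>f4' data (again faked in memory here; the real buffer is data.read(4*ncells)):

```python
import struct
import numpy as np

ncells = 6
# Stand-in for data.read(4*ncells): big-endian single-precision floats.
raw = struct.pack(">%df" % ncells, *[0.5 * i for i in range(ncells)])

# '>f4' means big-endian 4-byte float; note count is in elements, not bytes.
TC1 = np.frombuffer(raw, dtype=">f4", count=ncells)
print(TC1.dtype, TC1.shape)
```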

(In general) I should be able to put these arrays into memory with no
problems.  Certainly given the rate at which I'm turning around this code...
Memory may be in the terabytes once I'm done.

Anyone got a Sesame Street answer for this?

Many thanks!  Helmut.