shoehorn c-structured data into Numpy
MRAB
python at mrabarnett.plus.com
Sun Jun 14 13:17:17 EDT 2009
Helmut Fritz wrote:
>
> Hello there everyone, I used to be on this a long time ago but then I
> got so much spam I gave up.
>
> But this strategy has come a little unstuck. I have binary output from
> a Fortran program that is in a big-endian C-structured binary file. The
> output can be very variable and many options create different orderings
> in the binary file. So I'd like to keep the header-reading in python.
>
> Anyhoo, I've so far been able to read the output with the struct
> module. But my question is how do I create numpy arrays from the bits
> of the file I want?
>
> So far I've been able to scan through to the relevant sections and I've
> tried all manner of idiotic combinations...
>
> The floats are 4 bytes for single precision, and it's an unstructured
> grid from a finite difference scheme, so I know the number of cells
> (ncells) for the property I am looking to extract.
>
> So I've tried:
> TC1 = np.frombuffer(struct.unpack(">%df" % ncells,
> data.read(4*ncells))[0], dtype=float)
> Only to get a very logical:
> >>> Traceback (most recent call last):
> >>> File "a2o.py", line 466, in <module>
> >>> runme(me)
> >>> File "a2o.py", line 438, in runme
> >>> me.spmapdat(data)
> >>> File "a2o.py", line 239, in spmapdat
> >>> TC1 = np.frombuffer(struct.unpack(">%df" % ncells,
> data.read(4*ncells))[0], dtype=float)
> >>> AttributeError: 'float' object has no attribute '__buffer__'
>
This:
struct.unpack(">%df" % ncells, data.read(4*ncells))
unpacks to a tuple of floats, and the [0] then picks out the first
(actually the zeroth) float, so frombuffer is handed a single Python
float instead of a buffer. You probably didn't want to do that! :-)
Try:
TC1 = np.array(struct.unpack(">%df" % ncells, data.read(4*ncells)),
dtype=float)
> ok... so I'll feed frombuffer my data file...
>
> And then tried:
> TC1 = np.frombuffer(data.read(4*ncells), dtype=float, count=ncells)
> >>> Traceback (most recent call last):
> >>> File "a2o.py", line 466, in <module>
> >>> runme(me)
> >>> File "a2o.py", line 438, in runme
> >>> me.spmapdat(data)
> >>> File "a2o.py", line 240, in spmapdat
> >>> TC1 = np.frombuffer(data.read(4*ncells), dtype=float, count=ncells)
> >>> ValueError: buffer is smaller than requested size
>
> And THEN I tried:
> TC1 = np.frombuffer(data.read(4*ncells), dtype=float, count=4*ncells)
> >>> Traceback (most recent call last):
> >>> File "a2o.py", line 466, in <module>
> >>> runme(me)
> >>> File "a2o.py", line 438, in runme
> >>> me.spmapdat(data)
> >>> File "a2o.py", line 240, in spmapdat
> >>> TC1 = np.frombuffer(data.read(4*ncells), dtype=float,
> count=4*ncells)
> >>> ValueError: buffer is smaller than requested size
>
> But it's the right size - honest.
>
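It is the right size for 4-byte floats, but dtype=float means NumPy's
float64, which is 8 bytes per value, so frombuffer asks for 8*ncells
bytes from a 4*ncells-byte buffer. Spelling out a big-endian 4-byte
float dtype should read the buffer directly (sketch with made-up
sample data in place of data.read):

```python
import struct
import numpy as np

ncells = 4
raw = struct.pack(">%df" % ncells, 1.0, 2.0, 3.0, 4.0)  # 4*ncells bytes

# dtype=float is float64 (8 bytes/value), hence
# "ValueError: buffer is smaller than requested size".
# ">f4" asks for big-endian 4-byte floats instead:
TC1 = np.frombuffer(raw, dtype=">f4", count=ncells)
```

This avoids the intermediate tuple entirely; count is in elements, not
bytes, so count=ncells is correct and count=4*ncells would over-ask
again.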
> (In general) I should be able to put these arrays into memory with no
> problems. Certainly given the rate at which I'm turning around this
> code... Memory may be in the terabytes once I'm done.
>
> Anyone got a Sesame Street answer for this?
>
> Many thanks! Helmut.
>