reading binary data from a 32 bit machine on 64 bit machine

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Thu Feb 19 15:31:25 EST 2009


On Thu, 19 Feb 2009 16:51:39 -0200, harijay <harijay at gmail.com> wrote:

> Hi I am very confused with the use of the struct module to read binary
> data from a file.
> ( I have only worked with ascii files so far)
>
> I have a file spec for a Data-logger
> (http://www.dataq.com/support/techinfo/ff.htm)

That format is rather convoluted -- due to historical reasons, I imagine...

> I am collecting some voltage , time traces on one channel and they are
> written to the binary file on a 32 bit windows machine
>
> The file spec says that the number of header bytes in the data file
> header is stored as a 16 bit  eye "I" at bits 6-7

If it says "at *byte* positions 6-7" you need a seek(6) to start reading  
 from there, not seek(5).
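
A small sketch of the offset point, using a made-up 8-byte buffer in place of the real file. The byte values are invented for illustration, and offsets are assumed to be 0-based, as seek() counts them:

```python
import io
import struct

# Made-up bytes standing in for the start of the file (not real logger data).
# Byte 5 is 0x24; bytes 6-7 hold 0x0094 in little-endian order.
fake = io.BytesIO(bytes([0x01, 0x00, 0x02, 0x00, 0x03, 0x24, 0x94, 0x00]))

fake.seek(6)                       # bytes 6-7, where the spec puts the field
(at_six,) = struct.unpack('<h', fake.read(2))

fake.seek(5)                       # one byte too early
(at_five,) = struct.unpack('<h', fake.read(2))

print(at_six, at_five)
```

Starting one byte early mixes the previous one-byte field into the 16-bit value, so the result can look plausible while being wrong.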

> Now I want to get at that number. When I try format !h I get a
> meaningful number
> If f is my file handle opened with "rb" mode
>
>>>> f.seek(5)
>>>> (Integer,) = struct.unpack('!h',f.read(2))
>>>> (Integer,)
> (9348,)
>
> I am assuming that means that there are 9348 header bytes . Can
> someone look at the format spec and tell me if I am on the right
> track.

Not exactly. Why '!' (network byte order)? The spec doesn't say anything
about byte order, but since it's a Windows program we can assume little
endian: '<', or just '=' (native, if the reading machine is little endian too).
But instead of multiple seeks + micro-reads I'd read the whole header and  
decode it at once (the fixed part is only 110 bytes long):

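import struct

# "..." stands for the remaining format characters of the 110-byte fixed part
fixed_header_fmt = struct.Struct("<HHBBhLL...")
f = open(..., 'rb')
fixed_header = f.read(110)
# elements[0] is a dummy, just to keep the 1-based element numbering
elements = [None]
elements[1:] = fixed_header_fmt.unpack(fixed_header)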

Now elements[4] is the fourth row in the table, "Number of bytes in each
channel info entry".
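
As a rough, runnable sketch of the same approach: the format below covers only the first five fields ("<HHBBh", 8 bytes), and the sample bytes are invented, not taken from a real file. With '<' there is no padding, so the fifth field lands on bytes 6-7:

```python
import struct

# Only the first five fields of the fixed header; the real format is longer.
prefix_fmt = struct.Struct("<HHBBh")

# Invented sample bytes, not from a real data-logger file.
sample = bytes([0x01, 0x00, 0x02, 0x00, 0x03, 0x24, 0x94, 0x00])

elements = [None]                  # dummy entry so numbering is 1-based
elements[1:] = prefix_fmt.unpack(sample[:prefix_fmt.size])

# For comparison: the same two bytes decoded as network (big-endian) order.
(big_endian,) = struct.unpack_from("!h", sample, 6)

print(prefix_fmt.size)   # bytes occupied by the five fields
print(elements[4])       # fourth row of the table
print(elements[5])       # fifth row, stored at bytes 6-7
print(big_endian)        # what '!' would make of those bytes
```

The two decodings of bytes 6-7 differ, which is why the byte order character matters even when the offset is right.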

The format is built from the Type column: UI -> H, I -> h, B -> B, UL ->  
L, L -> l, D -> d, F -> f.
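
One way to apply that mapping, as a sketch: the names TYPE_TO_FMT and build_format are made up for this example, and the list of types is just the first seven rows implied by the "<HHBBhLL" prefix above.

```python
import struct

# Type column of the spec -> struct format character (mapping from this post).
TYPE_TO_FMT = {"UI": "H", "I": "h", "B": "B", "UL": "L",
               "L": "l", "D": "d", "F": "f"}

def build_format(types, byte_order="<"):
    """Join the per-field codes into one struct format string."""
    return byte_order + "".join(TYPE_TO_FMT[t] for t in types)

# First seven rows only; the remaining rows would be appended the same way.
fmt = build_format(["UI", "UI", "B", "B", "I", "UL", "UL"])
size = struct.calcsize(fmt)

print(fmt, size)
```

With an explicit '<' (or '=') prefix, struct uses its standard sizes, so 'L' is 4 bytes whether Python runs on a 32 bit or a 64 bit machine. Without a prefix the native size of a C long is used, which may be 8 bytes on some 64 bit platforms.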

-- 
Gabriel Genellina



