Read header and data from a binary file

Simon Forman sajmikins at gmail.com
Wed Sep 23 00:14:54 CEST 2009


On Tue, Sep 22, 2009 at 4:30 PM, Jose Rafael Pacheco
<jose_rafael_pacheco at yahoo.es> wrote:
> Hello,
>
> I want to read from a binary file called myaudio.dat
> Then I've tried the next code:
>
> import struct
> name = "myaudio.dat"
> f = open(name,'rb')
> f.seek(0)

Don't bother to seek(0) on a file you just opened.

> chain = "< 4s 4s I 4s I 20s I I i 4s I 67s s 4s I"
> s = f.read(4*1+4*1+4*1+4*1+4*1+20*1+4*1+4*1+4*1+4*1+4*1+67*1+1+4*1+4*1)

Rather than calculating the size of the data by hand, use the
struct.calcsize() function on the format string:

s = f.read(struct.calcsize(chain))
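
As a quick sanity check (just a sketch, using your format string unchanged and no file at all), calcsize() should agree with the sum you wrote out by hand:

```python
import struct

# Format string copied from the original post.
chain = "< 4s 4s I 4s I 20s I I i 4s I 67s s 4s I"

# The hand-computed size from the original f.read() call.
by_hand = (4*1+4*1+4*1+4*1+4*1+20*1+4*1+4*1+4*1+4*1+4*1+67*1+1+4*1+4*1)

# Let struct work out the size from the format itself.
size = struct.calcsize(chain)
print("%d %d" % (size, by_hand))
```

Both numbers should come out the same, and they should also match what f.tell() reports after the read.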

> a = struct.unpack(chain, s)
> header = {'identifier'     : a[0],
>           'cid'              : a[1],
>           'clength'       : a[2],
>                   'hident'         : a[3],
>                   'hcid32'         : a[4],
>                   'hdate'          : a[5],
>                   'sampling'     : a[6],
>                   'length_B'      : a[7],
>                   'max_cA'       : a[8],
>                   'max_cA1'     : a[9],
>                   'identNOTE'  : a[10],
>                   'c2len'          : a[11],}
>
> It produces:
>
> {'length_B': 150001, 'sampling': 50000, 'max_cA1': 'NOTE', 'hident': 'HEDR',
> 'c2len': "Normal Sustained Vowel 'A', Voice and Speech Lab., MEEI, Boston,
> MA", 'hdate': 'Jul 13 11:57:41 1994', 'identNOTE': 68, 'max_cA': -44076,
> 'cid': 'DS16', 'hcid32': 32, 'identifier': 'FORM', 'clength': 300126}
>
> So far when I run f.tell()
> >>> f.tell()
> 136L
>
> The audio data length is 300126, now I need a clue to build an array with
> the audio data (The Chunk SDA_), would it possible with struct?, any help ?

Read the chunk ID and length and then use the length to read the rest
of the chunk data.
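
Something along these lines might do it. It is only a sketch: read_chunk() and samples_from() are names I made up, the data is a hand-built stand-in rather than your file, and I'm assuming (from "DS16" and the "<" in your format) that the samples are 16-bit signed little-endian integers, which you should check against the format documentation. Note too that your format string ends in "4s I" after the NOTE text, so it looks as if a[13] and a[14] may already hold the SDA_ chunk ID and length, in which case f.tell() == 136 would be the start of the samples.

```python
import io
import struct

def read_chunk(f):
    """Read one chunk: 4-byte ID, 4-byte little-endian length, then data."""
    head = f.read(8)
    if len(head) < 8:
        return None, b""          # end of file (or truncated header)
    chunk_id, length = struct.unpack("<4sI", head)
    data = f.read(length)         # use the length to read the rest
    return chunk_id, data

def samples_from(data):
    """Interpret chunk data as 16-bit signed little-endian samples (assumed)."""
    count = len(data) // 2        # ignore a trailing odd byte, if any
    return list(struct.unpack("<%dh" % count, data[:count * 2]))

# Stand-in for a file positioned at the start of the SDA_ chunk.
fake = io.BytesIO(b"SDA_" + struct.pack("<I", 6) + struct.pack("<3h", 1, -2, 300))

chunk_id, data = read_chunk(fake)
samples = samples_from(data)
print("%r %r" % (chunk_id, samples))
```

With the real file you would skip the 12-byte FORM header, call read_chunk(f) in a loop, and keep the chunk whose ID starts with "SD". For a long recording the array module may be a more compact container than a list.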



> Thanks
>
> The file format is:
>
>
> Offset  Length  Type       Contents
> 0       4       character  Identifier: "FORM"
> 4       4       character  Chunk identifier: "DS16"
> 8       4       integer    Chunk length
> 12      -       -          Chunk data
>
> Header 2
>
> Offset  Length  Type              Contents
> 0       4       character         Identifier: "HEDR" or "HDR8"
> 4       4       integer           Chunk length (32)
> 8       20      character         Date, e.g. "May 26 23:57:43 1995"
> 28      4       integer           Sampling rate
> 32      4       integer           Data length (bytes)
> 36      2       unsigned integer  Maximum absolute value for channel A:
>                                   0xFFFF if not defined
> 38      2       unsigned integer  Maximum absolute value for channel B:
>                                   0xFFFF if not defined
>
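
Going by this table, the two maximum-value fields are 2-byte unsigned integers, so "H H" would fit them better than the single 4-byte "i" in your format string. That one "i" swallows both fields, which is probably why 'max_cA' came out as -44076 and why the dictionary keys after it are shifted by one ('max_cA1' holding 'NOTE', and so on). A sketch, with variable names of my own choosing and a hand-built record instead of your file:

```python
import struct

# Layout from the "Header 2" table: ID, length, date, sampling rate,
# data length, then two 2-byte unsigned maxima.
HEDR_FORMAT = "<4sI20sIIHH"

# Hand-built example record using the values from the output quoted above.
# The last four bytes are -44076 packed as a 4-byte int, i.e. what the
# original "i" field read from the file.
record = struct.pack("<4sI20sII", b"HEDR", 32, b"Jul 13 11:57:41 1994",
                     50000, 150001) + struct.pack("<i", -44076)

ident, length, date, rate, nbytes, max_a, max_b = struct.unpack(HEDR_FORMAT, record)
print("%d %d" % (max_a, max_b))   # prints: 21460 65535
```

65535 is 0xFFFF, which the table gives as the "not defined" value, so the second field looks unused in this recording.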
> NOTE Chunk
>
> Offset  Length  Type       Contents
> 0       4       character  Identifier: "NOTE"
> 4       4       integer    Chunk length
> 8       -       character  Comment string
>
> SDA_, SD_B or SDAB Chunk
>
> Offset  Length  Type       Contents
> 0       4       character  Identifier: "SDA_", "SD_B", or "SDAB"
> 4       4       integer    Chunk length
> 8       -       -          Data