Read header and data from a binary file [LONG]

Jose Rafael Pacheco jose_rafael_pacheco at yahoo.es
Tue Sep 22 17:18:16 EDT 2009


Hello,

I want to read the header and the audio data from a binary file called
myaudio.dat. So far I have tried the following code:

import struct

name = "myaudio.dat"
f = open(name, 'rb')
f.seek(0)
# Little-endian layout of the first 136 bytes of the file
chain = "< 4s 4s I 4s I 20s I I i 4s I 67s s 4s I"
# Let struct compute the number of bytes instead of summing them by hand
s = f.read(struct.calcsize(chain))
a = struct.unpack(chain, s)
header = {'identifier' : a[0],
          'cid'        : a[1],
          'clength'    : a[2],
          'hident'     : a[3],
          'hcid32'     : a[4],
          'hdate'      : a[5],
          'sampling'   : a[6],
          'length_B'   : a[7],
          'max_cA'     : a[8],
          'max_cA1'    : a[9],
          'identNOTE'  : a[10],
          'c2len'      : a[11],}

It produces:

{'length_B': 150001, 'sampling': 50000, 'max_cA1': 'NOTE', 'hident': 'HEDR',
'c2len': "Normal Sustained Vowel 'A', Voice and Speech Lab., MEEI, Boston,
MA", 'hdate': 'Jul 13 11:57:41 1994', 'identNOTE': 68, 'max_cA': -44076,
'cid': 'DS16', 'hcid32': 32, 'identifier': 'FORM', 'clength': 300126}

After that read, f.tell() gives:
>>> f.tell()
136L

The chunk length from the first header (clength) is 300126. Now I need a clue
on how to build an array with the audio data (the SDA_ chunk). Would that be
possible with struct? Any help would be appreciated.

Thanks

The file format is:


Offset  Length  Type       Contents
0       4       character  Identifier: "FORM"
4       4       character  Chunk identifier: "DS16"
8       4       integer    Chunk length
12      -       -          Chunk data

Header 2

Offset  Length  Type              Contents
0       4       character         Identifier: "HEDR" or "HDR8"
4       4       integer           Chunk length (32)
8       20      character         Date, e.g. "May 26 23:57:43 1995"
28      4       integer           Sampling rate
32      4       integer           Data length (bytes)
36      2       unsigned integer  Maximum absolute value for channel A: 0xFFFF if not defined
38      2       unsigned integer  Maximum absolute value for channel B: 0xFFFF if not defined
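
A minimal sketch of unpacking this 40-byte header chunk with struct, assuming
little-endian byte order (as the '<' in the code above does) and reading the
two maxima at offsets 36 and 38 as separate unsigned 16-bit values ('H')
rather than as one 32-bit integer. The name parse_hedr and the dictionary keys
are only illustrative, and treating the second maximum as channel B is an
assumption:

```python
import struct

# id, chunk length, date, sampling rate, data length, max (offset 36), max (offset 38)
HEDR_FORMAT = "<4sI20sIIHH"


def parse_hedr(buf):
    """Unpack a HEDR/HDR8 chunk from the first 40 bytes of buf."""
    size = struct.calcsize(HEDR_FORMAT)  # 8-byte chunk header + 32 bytes of data
    ident, length, date, rate, data_len, max_a, max_b = struct.unpack(
        HEDR_FORMAT, buf[:size])
    if ident not in (b"HEDR", b"HDR8"):
        raise ValueError("not a header chunk: %r" % ident)
    return {
        "chunk_length": length,
        "date": date.rstrip(b"\0 ").decode("ascii"),
        "sampling_rate": rate,
        "data_length": data_len,
        # 0xFFFF is the "not defined" marker from the table
        "max_a": None if max_a == 0xFFFF else max_a,
        # Assumption: the field at offset 38 belongs to channel B
        "max_b": None if max_b == 0xFFFF else max_b,
    }
```

Read this way, the value -44076 in the output above splits into 21460 for the
first field and 0xFFFF (not defined) for the second.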

NOTE Chunk

Offset  Length  Type       Contents
0       4       character  Identifier: "NOTE"
4       4       integer    Chunk length
8       -       character  Comment string
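
Because the comment has no fixed size, one way to read this chunk is to unpack
the 8-byte chunk header first and then read as many bytes as the chunk length
says. This is only a sketch: read_note is a made-up name, little-endian byte
order is assumed, and so is the idea that the comment may end in NUL bytes.

```python
import io
import struct


def read_note(f):
    """Read a NOTE chunk from file object f, positioned at the chunk start."""
    ident, length = struct.unpack("<4sI", f.read(8))
    if ident != b"NOTE":
        raise ValueError("expected NOTE, got %r" % ident)
    comment = f.read(length)
    # Assumption: the comment may be NUL-terminated or NUL-padded
    return comment.rstrip(b"\0").decode("ascii", "replace")


# Usage with an in-memory stand-in for the real file
demo = io.BytesIO(struct.pack("<4sI", b"NOTE", 4) + b"hi\0\0")
print(read_note(demo))  # prints: hi
```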

SDA_, SD_B or SDAB Chunk

Offset  Length  Type       Contents
0       4       character  Identifier: "SDA_", "SD_B", or "SDAB"
4       4       integer    Chunk length
8       -       -          Data
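
A sketch of how the sample data could be read with struct alone: walk the
chunks after the 12-byte FORM header, skip everything that is not a sample
chunk, and unpack the data as 16-bit integers. Several things here are
assumptions rather than facts from the format description: that samples are
signed 16-bit little-endian values (suggested by "DS16"), that odd-length
chunks may or may not be followed by a pad byte (hence the hypothetical
pad_odd switch), and the function name read_samples itself. The numbers in
the post fit two-byte samples if the FORM chunk length counts everything after
offset 12: 300126 minus 40 bytes of HEDR, 76 bytes of NOTE and 8 bytes of SDA_
header leaves 300002 bytes, i.e. 150001 samples.

```python
import io
import struct

SAMPLE_CHUNKS = (b"SDA_", b"SD_B", b"SDAB")


def read_samples(f, pad_odd=False):
    """Return the 16-bit samples of the first sample chunk as a tuple."""
    f.seek(0)
    magic, form_type, form_len = struct.unpack("<4s4sI", f.read(12))
    if magic != b"FORM":
        raise ValueError("not a FORM file: %r" % magic)
    while True:
        head = f.read(8)
        if len(head) < 8:
            raise ValueError("no sample data chunk found")
        ident, length = struct.unpack("<4sI", head)
        if ident in SAMPLE_CHUNKS:
            data = f.read(length)
            count = len(data) // 2  # two bytes per sample
            # Assumption: signed 16-bit little-endian samples
            return struct.unpack("<%dh" % count, data[:count * 2])
        # Skip chunks that hold no samples (HEDR, NOTE, ...)
        skip = length + (length % 2 if pad_odd else 0)
        f.seek(skip, 1)
```

With the file from the post this would be called as
read_samples(open("myaudio.dat", "rb")). For an SDAB chunk the returned tuple
would still need to be split into channels, which this sketch does not
attempt. For large files the array module may be a more compact container
than a tuple.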