[Tutor] How to read all integers from a binary file?

Martin A. Brown martin at linux-ip.net
Fri Oct 9 21:26:09 CEST 2015


Hello there David,

> I have a binary file of 32-bit unsigned integers. I want to read 
> all those integers into a list.
>
> I have:
>
>    ulData = []
>    with open(ulBinFileName, "rb") as inf:
>        ulData.append(struct.unpack('i', inf.read(4)))
>
> This obviously reads only one integer from the file.  How would I 
> modify this code to read all integers from the file please?

I see you already have responses from Laura Creighton and Peter 
Otten.  I particularly like the idea of using array.array() to read 
the data.

You can also put a repeat count in front of a format character to 
say how many consecutive values of that type you want to unpack.

For example, assume your data file is exactly 20 bytes long and has 
been read into memory as a variable called binary_data.  Now you 
want to make a tuple containing 5 of these unsigned integers.  Note 
that 'i' means signed and 'I' means unsigned, so for your data you 
can do this:

   l = struct.unpack('5I', binary_data)

   # -- that would give you this tuple (fabricated data)
   #
   # l = (2612280414, 2011801612, 3928267049, 463482871, 1463075426)
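To make the repeat count concrete, here is a minimal round-trip 
sketch.  It assumes the values are stored little-endian (the '<' 
prefix, which also forces the standard 4-byte size for 'I'); your 
file may use a different byte order.  The sample values are made up 
for illustration.

```python
import struct

# Five illustrative 32-bit unsigned values (made up for this sketch).
values = (1, 2, 3, 4000000000, 4294967295)

# '<' = little-endian with standard sizes; '5I' = five unsigned 32-bit ints.
binary_data = struct.pack('<5I', *values)
assert len(binary_data) == 20

# Derive the repeat count from the data length instead of hard-coding it.
count = len(binary_data) // struct.calcsize('<I')
unpacked = struct.unpack('<%dI' % count, binary_data)
print(unpacked)
```

Computing the count from the length is what lets the same call 
handle a file of any size, as long as the length is a multiple of 4.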

Below, you can see what I did to generalize this technique in the 
function read_data().

If, however, you do not have complex data, then array.array() is 
probably the easiest path for you.
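For comparison, a minimal sketch of the array.array() route might 
look like the following.  It assumes Python 3, native byte order, 
and a platform where the 'I' type code is 4 bytes wide (the code 
checks that); read_uint32_file is a hypothetical helper name, not 
something from the earlier replies.

```python
import array
import os
import tempfile


def read_uint32_file(path):
    """Return every 32-bit unsigned integer in the file as a list."""
    values = array.array('I')
    # 'I' is a C unsigned int: 4 bytes on common platforms, but verify.
    if values.itemsize != 4:
        raise RuntimeError("array type 'I' is not 4 bytes on this platform")
    with open(path, 'rb') as inf:
        # frombytes() raises ValueError if the length is not a multiple of 4.
        values.frombytes(inf.read())
    return values.tolist()


# Round trip through a temporary file to show the call.
sample = array.array('I', [0, 1, 2**31, 2**32 - 1])
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(sample.tobytes())
try:
    result = read_uint32_file(tmp.name)
finally:
    os.remove(tmp.name)
print(result)
```

If the file was written on a machine with the opposite byte order, 
calling values.byteswap() before tolist() should correct the values.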

Good luck and have fun!

-Martin

from __future__ import print_function

import sys
import struct


def write_data(f, size, datatype):
     data = gen_data(size)
     fmt = str(len(data)) + datatype
     s = struct.pack(fmt, *data)
     f.write(s)
     f.close()


def gen_data(size):
     import numpy
     data = list(numpy.random.randint(-(2**31)+1, 2**31, size=size))
     return data


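def read_data(f, datatype):
     s = f.read()
     f.close()
     # q is the number of whole elements; r is any leftover bytes
     element_size = struct.calcsize(datatype)
     q, r = divmod(len(s), element_size)
     assert r == 0, 'file size is not a multiple of the element size'
     # repeat count followed by the caller's format character, e.g. '5i'
     fmt = str(q) + datatype
     data = struct.unpack(fmt, s)
     return data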


if __name__ == '__main__':
     if len(sys.argv) > 2:
         size = int(sys.argv[2])
         write_data(open(sys.argv[1], 'wb'), size, 'i')
     data = read_data(open(sys.argv[1], 'rb'), 'i')
     print(len(data))



P.S.  I didn't have a mess of unsigned integers hanging around in a
   file, so you can see how I generated and stored test data in
   write_data() and gen_data().  Note that the script as written
   generates and stores signed values ('i'), not unsigned ones.

-- 
Martin A. Brown
http://linux-ip.net/

