Question about reading a big binary file and write it into several text (ascii) files

Bengt Richter bokr at oz.net
Mon Jan 24 22:56:38 EST 2005


On 24 Jan 2005 12:44:32 -0800, "Albert Tu" <sjtu.0992 at gmail.com> wrote:

>Hi,
>
>I am learning and pretty new to Python and I hope you guys can give me
>a quick start.
>
>I have an about 1G-byte binary file from a flat panel x-ray detector; I
>know at the beginning there is a 128-byte header and the rest of the
>file is integers in 2-byte format.
It looks like 16-bit pixels in 1024*768 images, I assume.
>
>What I want to do is to save the binary data into several smaller files
>in integer format and each smaller file has the size of 2*1024*768
>bytes.
You could do that, but why duplicate so much data that you may never look at?
Why not write a class that provides a view of your big file in terms of an image index
and returns an efficient in-memory array? E.g., (untested)

    from array import array
    def getimage(n, f, offset=128):
        # skip the header plus n whole images of 2*1024*768 bytes each
        f.seek(offset+n*2*1024*768)
        # 'H' is for unsigned 2-byte integers (check endianness for swap need!)
        return array('H', f.read(2*1024*768))

Then usage would be
    imfile = open('big_file.bin', 'rb')
    imarray = getimage(23, imfile)
And you could get the pixel at x,y by
    pix = imarray[x+y*1024]  # or maybe x*768+y etc.
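
The endianness check mentioned in the comment above could look roughly like
this. It is only a sketch, written for Python 3: the post does not say which
byte order the detector writes, so little-endian is an assumption here, and
read_image and its file_is_little_endian parameter are made-up names for
illustration.

```python
import sys
from array import array

WIDTH, HEIGHT = 1024, 768          # image dimensions from the post
IMAGE_BYTES = 2 * WIDTH * HEIGHT   # 16-bit pixels

def read_image(n, f, offset=128, file_is_little_endian=True):
    """Return image n as unsigned 16-bit ints in native byte order.

    file_is_little_endian is an assumption about the detector's output.
    """
    f.seek(offset + n * IMAGE_BYTES)
    pixels = array('H')
    pixels.frombytes(f.read(IMAGE_BYTES))
    # array interprets bytes in the machine's native order,
    # so swap only when the file's order differs from it.
    if file_is_little_endian != (sys.byteorder == 'little'):
        pixels.byteswap()
    return pixels
```

If the pixel values look wildly wrong (e.g. mostly multiples of 256), try
flipping file_is_little_endian.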

Or you could make getimage a method of a class that you initialize with
the file and which could maintain an LRU cache of images,
with a particular disk directory as backup, etc., and would provide
images wrapped with nice methods to support whatever you are doing with the images.
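
A minimal sketch of such a class, written for Python 3 and only lightly
tested. The class name ImageFile, the pixel method, the cache_size parameter
and the row-major pixel layout are all assumptions for illustration; the disk
directory backup is left out, and byte order is taken to be the machine's
native one.

```python
from array import array
from collections import OrderedDict

class ImageFile:
    """View of a multi-image file as indexable images, with a small LRU cache."""

    def __init__(self, f, width=1024, height=768, offset=128, cache_size=8):
        self.f = f
        self.width = width
        self.height = height
        self.offset = offset                  # header size in bytes
        self.image_bytes = 2 * width * height # 16-bit pixels
        self.cache_size = cache_size
        self._cache = OrderedDict()           # image index -> array, oldest first

    def getimage(self, n):
        if n in self._cache:
            self._cache.move_to_end(n)        # mark as most recently used
            return self._cache[n]
        self.f.seek(self.offset + n * self.image_bytes)
        data = self.f.read(self.image_bytes)
        if len(data) != self.image_bytes:
            raise IndexError('no complete image at index %d' % n)
        im = array('H')
        im.frombytes(data)                    # native byte order assumed
        self._cache[n] = im
        if len(self._cache) > self.cache_size:
            self._cache.popitem(last=False)   # evict least recently used
        return im

    def pixel(self, n, x, y):
        # assumes rows are stored one after another (row-major)
        return self.getimage(n)[x + y * self.width]
```

Usage would be something like ImageFile(open('big_file.bin', 'rb')).pixel(23, x, y).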


>
>I know I can do something like
>>>>f=open("xray.seq", 'rb')
>>>>header=f.read(128)
>>>>file1=f.read(2*1024*768)
>>>>file2=f.read(2*1024*768)
>>>>......
>>>>f.close()
>
>But I don't know how to save files in integer format (converting from
>binary to ascii files) and how to do this in an elegant and snappy way.
Best is probably to leave the original format alone. E.g., (untested and needs try/except)
this should split the big file into individual image files named file0.ximg .. fileN.ximg

    f = open('xray.seq', 'rb')
    header = f.read(128)  # skip the 128-byte header
    nfile = 0
    while True:
        im = f.read(2*1024*768)  # one image's worth of bytes
        if not im:
            break  # clean end of file
        if len(im) != 2*1024*768:
            print('broken tail of %s bytes' % len(im))
            break
        fw = open('file%s.ximg' % nfile, 'wb')
        fw.write(im)
        fw.close()
        nfile += 1
    f.close()

then you could use getimage above with offset passed as 0 and image number 0, e.g.,

    im23 = getimage(0, open('file23.ximg','rb'), 0) # img 0, offset 0

But then you might wonder about all those separate files, unless you want to
put them on a series of CDs where they wouldn't all fit on one. Whatever ;-)

You will probably lose in both speed and space if you try to make some kind
of ascii disk files. You aren't thinking XML are you??!! For this, definitely ick ;-)
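
To make the space cost concrete, here is a rough sketch of what writing an
image as text involves (Python 3). The layout of one image row per line with
space-separated decimal values, the function name image_to_text, and the tiny
made-up sample image are all assumptions for illustration, not anything the
original post specifies.

```python
from array import array

def image_to_text(pixels, width):
    """Format a flat sequence of pixel values as text, one image row per line."""
    lines = []
    for start in range(0, len(pixels), width):
        row = pixels[start:start + width]
        lines.append(' '.join(str(v) for v in row))
    return '\n'.join(lines) + '\n'

# Tiny 4x2 stand-in image; the values are invented for the demonstration.
pixels = array('H', [0, 7, 512, 65535, 1000, 20000, 300, 42])
text = image_to_text(pixels, width=4)

binary_size = len(pixels) * pixels.itemsize   # bytes in the raw format
text_size = len(text.encode('ascii'))         # bytes in the text format
print(binary_size, text_size)
```

A value near the 16-bit maximum needs five digits plus a separator, so up to
six bytes per pixel instead of two, and every value has to be parsed again
when the file is read back.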

>
What you want to do will depend on the big picture, which is not apparent yet ;-)
>
>Please reply when you guys can get a chance.
>Thanks,

Sorry to give nothing but untested suggestions, but I have to go, and I
will be mostly offline for a while.

Regards,
Bengt Richter


