Searching a binary file

luthi at vaw.baug.ethz.NOSPAM.ch
Mon Jul 24 12:39:08 EDT 2000


What is the fastest way to search a binary file for a certain byte pattern
(**** in my case)?

I came up with the solution below, but I suspect there is a better way to
do it. Thanks for any suggestions.

Martin Luethi

=====

import re

# constants
MaxFileSizeInMemory = 10*1000*1000

file = open('my_binary_file', 'rb')   # open for reading in binary mode

file.seek(0, 2)                  # set the pointer to the end of the file
filesize = file.tell()           # get the size of the file
file.seek(0)                     # reset the file pointer
rex = re.compile(br'[\*]{4}')    # the bytes to search for
starpointer = []                 # positions of the '****' in the file
for i in range(filesize // MaxFileSizeInMemory + 1):
    data = file.read(MaxFileSizeInMemory)  # read a chunk of bytes
    m = rex.search(data)
    while m:                     # find all '****' in this chunk
        pos = m.start()          # offset of the match within the chunk
        starpointer.append(pos + i*MaxFileSizeInMemory)
        m = rex.search(data, m.end())  # continue right after this match
    # note: a '****' that straddles two chunks is not found
file.close()                     # close the file
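
Reading in fixed-size chunks misses a pattern that lies across a chunk
boundary. The following is a minimal sketch of one way around that, assuming
a modern Python and that non-overlapping matches are wanted: the last
len(pattern) - 1 bytes of each chunk are carried over into the next search.
The function name find_pattern and its parameters are hypothetical and not
part of the script above; it uses bytes.find instead of a regular
expression because the pattern is a fixed string.

```python
def find_pattern(path, pattern=b'****', chunk_size=10 * 1000 * 1000):
    """Return file offsets of all non-overlapping occurrences of pattern."""
    if not pattern:
        raise ValueError('pattern must not be empty')
    positions = []
    keep = len(pattern) - 1      # bytes carried over between chunks
    tail = b''                   # unsearched end of the previous chunk
    offset = 0                   # file offset of the first byte of tail
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            data = tail + chunk
            start = 0
            while True:
                pos = data.find(pattern, start)
                if pos < 0:
                    break
                positions.append(offset + pos)
                start = pos + len(pattern)   # skip past this match
            # carry over the last few bytes, but never bytes that
            # already belong to a reported match
            cut = max(start, len(data) - keep)
            tail = data[cut:]
            offset += cut
    return positions
```

Calling find_pattern('my_binary_file') would return the byte offset of each
'****' in the file. Memory use stays bounded by roughly chunk_size, at the
cost of copying a few bytes per chunk.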

-- 
============================================================
Martin Luethi			Tel. +41 1 632 40 92
VAW ETH Zuerich			
CH-8092 Zuerich			mail luthi at vaw.baum.ethz.ch
Switzerland
============================================================


