finding/replacing a long binary pattern in a .bin file

Bengt Richter bokr at oz.net
Thu Jan 13 19:55:08 EST 2005


On Thu, 13 Jan 2005 11:40:52 -0800, Jeff Shannon <jeff at ccvcorp.com> wrote:

>Bengt Richter wrote:
>
>> BTW, I'm sure you could write a generator that would take a file name
>> and oldbinstring and newbinstring as arguments, and read and yield nice
>> os-file-system-friendly disk-sector-multiple chunks, so you could write
>> 
>>     fout = open('mynewbinfile', 'wb')
>>     for buf in updated_file_stream('myoldbinfile', oldbinstring, newbinstring):
>>         fout.write(buf)
>>     fout.close()
>
>What happens when the bytes to be replaced are broken across a block 
>boundary?  ISTM that neither half would be recognized....

That was part of the exercise ;-)

(Hint: use str.find to locate unbroken oldbinstrings in the current input buffer and
 buffer out the safe changes. When find fails, delete the safely used front of the input
 buffer, keeping the last len(oldbinstring)-1 bytes since they might begin a match that
 continues in the next chunk, and append another chunk from the input file. Repeat until
 the last chunk has been appended and find finds no more. Then buffer out whatever tail
 of the input buffer remains, since it can no longer contain an oldbinstring to change.)
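
For anyone who wants to check their answer, here is a minimal sketch of one way the
hint could be turned into a generator. It is not the only way to do it, and a few things
are assumptions of mine: the chunk_size parameter and its default, the use of bytes
(current Python) rather than str, and that the generator takes just the file name and
the two byte strings. It also yields whatever is safe after each read, so the output
pieces are not sector-multiple sized.

```python
def updated_file_stream(path, oldbinstring, newbinstring, chunk_size=4096):
    """Yield the contents of path with oldbinstring replaced by newbinstring.

    Sketch only: chunk_size and the bytes-based interface are assumptions.
    """
    if not oldbinstring:
        raise ValueError("oldbinstring must not be empty")
    # Longest buffer tail that could be the start of a match split across reads.
    keep = len(oldbinstring) - 1
    buf = b""
    with open(path, "rb") as fin:
        while True:
            chunk = fin.read(chunk_size)
            at_eof = not chunk
            buf += chunk
            out = []
            pos = 0
            # Replace every unbroken match currently visible in the buffer.
            while True:
                hit = buf.find(oldbinstring, pos)
                if hit < 0:
                    break
                out.append(buf[pos:hit])
                out.append(newbinstring)
                pos = hit + len(oldbinstring)
            # Before EOF, hold back the tail that might begin a split match.
            safe_end = len(buf) if at_eof else max(pos, len(buf) - keep)
            out.append(buf[pos:safe_end])
            buf = buf[safe_end:]
            data = b"".join(out)
            if data:
                yield data
            if at_eof:
                break
```

Because the held-back tail is carried over into the next read, a match that straddles
a chunk boundary is seen whole on the following pass, and the result should be the same
as calling replace on the entire file contents.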

>
>I believe that this requires either reading the entire file into 
>memory, to scan all at once, or else conditionally matching an 
>arbitrary fragment of the end of a block against the beginning of the 
>oldbinstring...  Given that the file in question is only a few tens of 
>kbytes, I'd think that doing it in one gulp is simpler.  (For a large 
>file, chunking it might be necessary, though...)

It's certainly simpler to do it in one gulp, but it's not really hard to
do it in chunks. You just have to make sure your input buffer and chunk size
are larger than oldbinstring ;-)
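
For comparison, the one-gulp version for a file of a few tens of kbytes might look
roughly like this. The function name and the returned count are my own additions for
illustration, and it assumes the whole file fits comfortably in memory.

```python
def replace_in_one_gulp(src_path, dst_path, oldbinstring, newbinstring):
    """Read the whole file, replace, and write the result (hypothetical helper)."""
    with open(src_path, "rb") as fin:
        data = fin.read()
    with open(dst_path, "wb") as fout:
        fout.write(data.replace(oldbinstring, newbinstring))
    # Number of non-overlapping occurrences that were replaced.
    return data.count(oldbinstring)
```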

Regards,
Bengt Richter


