Finding a text in raw data(size nearly 10GB) and Printing its memory address using python
Chris Angelico
rosuav at gmail.com
Mon Apr 23 13:31:14 EDT 2018
On Tue, Apr 24, 2018 at 3:24 AM, Hac4u <samakshkaushik at gmail.com> wrote:
> I have a raw data of size nearly 10GB. I would like to find a text string and print the memory address at which it is stored.
>
> This is my code
>
> import os
> import re
> filename="filename.dmp"
> read_data=2**24
> searchtext="bd:mongo:"
> he=searchtext.encode('hex')
Why encode it as hex?
> with open(filename, 'rb') as f:
>     while True:
>         data= f.read(read_data)
>         if not data:
>             break
>         elif searchtext in data:
>             print "Found"
>             try:
>                 offset=hex(data.index(searchtext))
>                 print offset
>             except ValueError:
>                 print 'Not Found'
>         else:
>             continue
You have a loop that reads a slab of data from a file, then searches
the current data only. Then you search that again for the actual
index, and print it - but you're printing the offset within the
current chunk only. You'll need to maintain a chunk position in order
to get the actual offset.
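
A minimal sketch of the chunk-position idea, assuming Python 3 and a bytes search string (the quoted code is Python 2). The function name find_offsets_per_chunk and its parameters are hypothetical, chosen for illustration. It reports offsets within the file, not addresses in the dumped process, and it still misses a match that straddles two chunks.

```python
import io


def find_offsets_per_chunk(f, needle, chunk_size=2**24):
    """Yield absolute file offsets of needle, searching one chunk at a time.

    Hypothetical helper for illustration. Matches that span a chunk
    boundary are NOT found by this version.
    """
    chunk_start = 0  # absolute offset of the first byte of the current chunk
    while True:
        data = f.read(chunk_size)
        if not data:
            break
        pos = data.find(needle)
        while pos != -1:
            # Offset within the chunk plus the chunk's own position in the file.
            yield chunk_start + pos
            pos = data.find(needle, pos + 1)
        chunk_start += len(data)


# io.BytesIO stands in for open("filename.dmp", "rb") here.
sample = io.BytesIO(b"x" * 5 + b"bd:mongo:" + b"y" * 5)
for offset in find_offsets_per_chunk(sample, b"bd:mongo:"):
    print(hex(offset))  # prints 0x5
```

Using find() once per match avoids searching the chunk twice, as the "in" test followed by index() does.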
Also, you're not going to find the string if it spans a chunk
boundary. You may need to cope with that.
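
One way to cope with the boundary case, sketched under the same assumptions (Python 3, bytes search string, hypothetical name find_offsets): carry the last len(needle) - 1 bytes of each buffer over into the next search. That tail is too short to hold a complete match, so nothing is reported twice.

```python
import io


def find_offsets(f, needle, chunk_size=2**24):
    """Yield absolute file offsets of needle, including boundary-spanning matches.

    Hypothetical helper for illustration, not a tuned implementation.
    """
    if not needle:
        raise ValueError("needle must not be empty")
    overlap = len(needle) - 1  # bytes carried over between reads
    tail = b""
    buf_start = 0  # absolute offset of the first byte of buf
    while True:
        data = f.read(chunk_size)
        if not data:
            break
        buf = tail + data
        pos = buf.find(needle)
        while pos != -1:
            yield buf_start + pos
            pos = buf.find(needle, pos + 1)
        # Keep only the bytes that could begin a match finishing in the next chunk.
        tail = buf[-overlap:] if overlap else b""
        buf_start += len(buf) - len(tail)


# A tiny chunk size forces the match to straddle a boundary.
sample = io.BytesIO(b"x" * 5 + b"bd:mongo:" + b"y" * 5)
for offset in find_offsets(sample, b"bd:mongo:", chunk_size=8):
    print(hex(offset))  # prints 0x5
```

Memory use stays at roughly one chunk plus the carried tail, so the 10GB file never has to be loaded whole.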
ChrisA