64 bit offsets?
jayryan.thompson at gmail.com
Wed Oct 6 23:41:29 CEST 2010
I'm trying to extract some data from a large memory-mapped file (the largest
is ~30GB) with re.finditer() and the match object's start(). Python's regular
expression module is great, but the offset returned by start() is 32 bits
(signed, so I can really only address 2GB). I was wondering if anyone here had
some suggestions on how to get the long offsets I need. By the way, I can't
break up the file because the pattern I'm looking for can occur anywhere and
on any boundary.
Also, is seek() limited to 32-bit addresses?
This is what I have in Python 2.7 AMD64:
import mmap
import re

with open(file_path, 'r+b') as f:
    file_map = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    pattern = re.compile("pattern")
    for match in pattern.finditer(file_map):
        offset = match.start()  # the offset that needs to exceed 2GB
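
One possible workaround, sketched here only as an illustration: scan the map in
overlapping windows and add each window's base offset to the match position, so
the regex engine only ever sees small offsets. The helper name
iter_match_offsets and the parameters window and max_match are hypothetical and
not part of the re or mmap modules. The sketch assumes that no single match is
longer than max_match bytes, that the pattern does not depend on context before
a window start (for example ^ or lookbehind), and that slicing the map at large
offsets works on the platform in use.

```python
import re


def iter_match_offsets(buf, regex, window=64 * 1024 * 1024, max_match=4096):
    """Yield absolute start offsets of non-overlapping matches of regex in buf.

    buf is any sliceable bytes-like object (bytes, or an mmap.mmap).
    Assumes no match is longer than max_match bytes (hypothetical parameter).
    """
    size = len(buf)
    base = 0      # absolute offset of the current window
    resume = 0    # absolute offset just past the last reported match
    while base < size:
        # Read max_match extra bytes so a match that starts near the end
        # of this window can still complete.
        end = min(base + window + max_match, size)
        chunk = buf[base:end]
        # Skip bytes already consumed by a match from the previous window.
        pos = max(resume - base, 0)
        for m in regex.finditer(chunk, pos):
            # Matches that start in the overlap belong to the next window.
            if m.start() >= window and base + window < size:
                break
            yield base + m.start()
            resume = base + m.end()
        base += window
```

With the file_map from the snippet above, this would be called as
iter_match_offsets(file_map, re.compile(b"pattern")). Each offset is an
ordinary Python integer, so it is not bound by the 32-bit limit of a single
search.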
"It's quite difficult to remind people that all this stuff was here for a
million years before people. So the idea that we are required to manage it
is ridiculous. What we are having to manage is us." ...Bill Ballantine,