efficient text file search.
steve at holdenweb.com
Mon Sep 11 17:05:31 CEST 2006
> Bill Scherer wrote:
>>>Is there a more efficient method to find a string in a text file than:
>>>for line in f:
>>>    if 'string' in line:
>>>        print 'FOUND'
>>>Does "for line in f:" read a block of lines into memory, or does it
>>>simply call f.readline() many times?
>>If your file is sorted by some key in the data, you can build a very
>>fast binary search with mmap in Python.
> Can you add some more info, or point me to a link? I haven't found
> anything about binary search with mmap() in the Python documentation.
> The files are very big...
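
For reference, the kind of search Bill describes might look roughly like
the sketch below. It assumes the file is sorted bytewise and that the key
sits at the start of each line; find_line is a hypothetical helper name,
not part of the mmap module, and the code is written for Python 3.

```python
import mmap

def find_line(mm, key):
    """Return the first line of mm that starts with key, or None.

    Assumption: mm holds newline-separated lines sorted bytewise,
    and key is a bytes prefix of the line being looked for.
    """
    size = len(mm)
    lo, hi = 0, size              # lo and hi always sit on line starts
    while lo < hi:
        mid = (lo + hi) // 2
        # Back up to the start of the line that contains offset mid.
        start = mm.rfind(b"\n", 0, mid) + 1
        end = mm.find(b"\n", start)
        if end == -1:             # last line has no trailing newline
            end = size
        if mm[start:end] < key:
            lo = end + 1          # answer lies after this line
        else:
            hi = start            # this line or an earlier one
    if lo >= size:
        return None
    end = mm.find(b"\n", lo)
    if end == -1:
        end = size
    line = mm[lo:end]
    return line if line.startswith(key) else None
```

A file would typically be mapped with
mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) before being passed in.
Each probe only touches the pages around the midpoint, which is why this
can be fast on very large sorted files.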
[please don't "top-post": add your latest comments at the end so the
story reads from the beginning].
I think this is probably not going to help you. A binary search is only
useful if you want to locate a value in an ordered list. Your original
posting made it seem that the text you are looking for could appear at
any position in any line of the file, so a binary search doesn't do you
any good at all (in fact it complicates things and slows them down
unnecessarily) because you'd still need to look at every line.
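
Since every byte has to be examined anyway, a plain linear scan is the
honest answer. One possible sketch, assuming Python 3 and a hypothetical
helper name file_contains, lets mmap do the scan in C instead of looping
over lines in Python. Whether it is actually faster on your files is
something you would have to measure.

```python
import mmap
import os

def file_contains(path, needle):
    """Return True if the bytes needle occurs anywhere in the file."""
    if os.path.getsize(path) == 0:
        return False              # an empty file cannot be mapped
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Still a linear scan of the whole file, just without
            # building a Python string object for every line.
            return mm.find(needle, 0) != -1
```

Note that, unlike the line-by-line loop, this would also match a needle
that contains a newline and spans two lines.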
Plus, if the lines are of variable length then you'd need to start by
creating an index of them, meaning you'd have to go right through the
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden