search speed

Scott David Daniels Scott.Daniels at Acm.Org
Fri Jan 30 17:58:16 CET 2009


Tim Rowe wrote:
> .... But even without going to a full database solution it might
> be possible to make use of the flat file structure. For example, does
> the "LF01" have to appear at a specific position in the input line? If
> so, there's no need to search for it in the complete line. *If* there
> is any such structure then a compiled regexp search is likely to be
> faster than just 'if "LF01" in line', and (provided it's properly
> designed) provides a bit of extra insurance against false positives.

Clearly this is someone who regularly uses grep or perl.  If you
know the structure, like the position in a line, something like
the following should be fast:

     with open(somename) as source:
         for n, line in enumerate(source):
             # Only every fifth line (offset 3) can match, at columns 5-8.
             if n % 5 == 3 and line[5:9] == 'LF01':
                 print('Found on line %s: %s' % (1 + n, line.rstrip()))

Be careful with your assertion that a regex is faster; it is certainly
not always true.  Measure speed; don't take mantras as gospel.
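
One way to do that measurement is with the standard-library timeit
module.  This is only a sketch: the sample line, the anchored pattern,
and the names by_substring, by_slice, by_regex and measure are all made
up for illustration, and the numbers will vary with your data, your
machine and your Python version.

```python
import re
import timeit

# Hypothetical sample line: 'LF01' sits at columns 5-8, as in the slice example.
LINE = "12345LF01 some trailing payload text\n"
PATTERN = re.compile(r"^.{5}LF01")  # anchored regex for the same fixed position


def by_substring(line):
    # Searches the whole line, wherever 'LF01' might be.
    return "LF01" in line


def by_slice(line):
    # Looks only at the known position.
    return line[5:9] == "LF01"


def by_regex(line):
    # Precompiled pattern matched from the start of the line.
    return PATTERN.match(line) is not None


def measure(number=200000):
    """Return the seconds each approach takes for `number` calls."""
    return {
        func.__name__: timeit.timeit(lambda: func(LINE), number=number)
        for func in (by_substring, by_slice, by_regex)
    }


if __name__ == "__main__":
    # Print the approaches from fastest to slowest for this sample line.
    for name, seconds in sorted(measure().items(), key=lambda item: item[1]):
        print("%-13s %.4f s" % (name, seconds))
```

Run it against lines taken from the real file before drawing any
conclusion; a single short sample line says little about a large
input.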

--Scott David Daniels
Scott.Daniels at Acm.Org


