efficient 'tail' implementation
Nick Craig-Wood
nick at craig-wood.com
Thu Dec 8 10:30:04 EST 2005
Gerald Klix <Gerald.Klix at klix.ch> wrote:
> As long as memory mapped files are available, the fastest
> method is to map the whole file into memory and use the
> mapping's rfind method to search for an end of line.
Actually mmap doesn't appear to have an rfind method :-(
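For reference, later Python releases (2.6 onward) did add an rfind method to mmap objects, so the approach Gerald describes can be sketched roughly as below. This is a minimal Python 3 sketch, not code from this thread: the function name tail_rfind is hypothetical, lines are assumed to end in b"\n", and empty files are skipped because they cannot be mapped.

```python
import mmap
import os


def tail_rfind(filename, nlines):
    """Return the last nlines lines of filename as a list of bytes objects."""
    if nlines <= 0 or os.path.getsize(filename) == 0:
        # Zero-length files cannot be mapped
        return []
    with open(filename, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            end = len(m)
            # Ignore a single trailing newline so it does not count as a line
            if m[end - 1:end] == b"\n":
                end -= 1
            pos = end
            for _ in range(nlines):
                # Search backwards for the newline preceding the current line
                pos = m.rfind(b"\n", 0, pos)
                if pos == -1:
                    break
            # pos is -1 when the file has fewer than nlines lines: start at 0
            start = pos + 1
            return m[start:end].split(b"\n")
```

Each rfind call only scans back to the previous newline, so the work is proportional to the size of the lines returned rather than to the size of the file.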
Here is a tested solution using mmap, based on your code. It becomes
inefficient if the number of lines to be tailed is very large.
import os
import sys
import mmap

def main(nlines, filename):
    reportFile = open( filename )
    length = os.fstat( reportFile.fileno() ).st_size
    if length == 0:
        # Don't map zero length files, windows will barf
        return
    try:
        # Unix form: mmap(fileno, length, flags, prot)
        mapping = mmap.mmap( reportFile.fileno(), length,
                             mmap.MAP_PRIVATE, mmap.PROT_READ )
    except AttributeError:
        # Windows has no MAP_PRIVATE / PROT_READ: use the tagname/access form
        mapping = mmap.mmap(
            reportFile.fileno(),
            0, None,
            mmap.ACCESS_READ )
    search = 1024
    lines = []
    while 1:
        if search > length:
            search = length
        tail = mapping[length-search:]
        lines = tail.split(os.linesep)
        # The first piece may be a partial line and the last piece is empty
        # when the file ends with a line separator, so ask for two extra
        if len(lines) > nlines + 1 or search == length:
            break
        search *= 2
    # Keep nlines lines plus the empty piece after the final line separator
    lines = lines[-nlines-1:]
    print "\n".join(lines)

if __name__ == "__main__":
    if len(sys.argv) != 3:
        print "Syntax: %s n file" % sys.argv[0]
    else:
        main(int(sys.argv[1]), sys.argv[2])
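
The print statement and the string-based split above are Python 2 idioms. A rough Python 3 sketch of the same doubling-window idea follows; it is not the code that was tested for this post. The helper name tail_lines is hypothetical, the separator is assumed to be b"\n" rather than os.linesep, and the mapping is opened with access=mmap.ACCESS_READ, which works on both Unix and Windows.

```python
import mmap
import os


def tail_lines(filename, nlines, sep=b"\n"):
    """Return up to nlines trailing lines of filename as bytes objects."""
    with open(filename, "rb") as f:
        length = os.fstat(f.fileno()).st_size
        if length == 0 or nlines <= 0:
            # Zero-length files cannot be mapped
            return []
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapping:
            search = 1024
            while True:
                search = min(search, length)
                lines = mapping[length - search:].split(sep)
                # Drop the empty piece that follows a final separator
                if lines[-1] == b"":
                    lines.pop()
                # The first piece may be a partial line, so ask for one extra
                if len(lines) > nlines or search == length:
                    break
                search *= 2
    return lines[-nlines:]
```

Returning the lines instead of printing them leaves decoding and output to the caller, for example: print(b"\n".join(tail_lines("report.log", 10)).decode()), where report.log is a placeholder file name.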
--
Nick Craig-Wood <nick at craig-wood.com> -- http://www.craig-wood.com/nick