Python Grep (was: writing (Gnu)MAKE in Python)

Eric Hagemann ehagemann at
Sat Jun 17 19:42:27 EDT 2000

If your file can fit in memory, I believe you will find that reading the
whole thing in (using readlines()) and then searching over the lines is a bit
faster.  I was doing something similar for files in the 5MB range and the
run time dropped from 5 sec to about 2 sec (I was doing more than the grep but

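For illustration, here is a minimal sketch of the read-it-all-first idea.
It assumes the file fits comfortably in memory and is written for a current
Python 3 interpreter rather than the Python of this thread; the name
grep_in_memory is made up for the example.

```python
def grep_in_memory(needle, path):
    """Return every line of the file at `path` that contains `needle`."""
    with open(path, "r") as handle:
        # One readlines() call pulls the whole file into a list of lines,
        # instead of one readline() call per loop iteration.
        lines = handle.readlines()
    # Plain substring test on each line, like string.find(...) != -1.
    return [line for line in lines if needle in line]
```

Something like grep_in_memory("26086594", "/usr/local/devices/motor.10")
would then return the matching lines as a list.  Whether it helps on a given
machine is worth timing rather than assuming.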
Also, you might have better luck with the regular expression module (re,
rather than string.find) and precompiling the pattern.
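A rough sketch of what precompiling might look like follows.  It is written
for Python 3, and regrep is a hypothetical name; the only library calls
used are re.compile and the compiled pattern's search method.

```python
import re

def regrep(pattern, path):
    """Return every line of the file at `path` that matches `pattern`."""
    # Compile once, outside the loop, and keep a reference to the bound
    # search method so each line costs a single call.
    search = re.compile(pattern).search
    with open(path, "r") as handle:
        return [line for line in handle.readlines() if search(line)]
```

For a literal string, pass re.escape(the_string) as the pattern so that
characters such as "." are not treated as regex syntax.  Whether this beats
a plain substring test depends on the pattern and the interpreter, so time
both before settling on one.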


"Doug Stanfield" <DOUGS at> wrote in message
news:8457258D741DD411BD3D0050DA62365907A23A at
> [John said:]
> > > ...I would like to have Bash, sed, and grep as Python
> > > programs rather than compiled C.
> >
> [and Corageous asked:]
> > Why?
> Consider this an attempt at translation.
> There have been posts in the past about having a Python shell.  Perhaps
> that's what John wants.  Not that I'd like that.  It may be that John wants
> the functionality of sed and grep more easily accessible in Python.  That's
> something I care about (thus the self-serving attempt at manipulating this
> thread. ;-)
> Grep functions in particular are something that I wonder about.  The
> following is an experiment:
> #!/usr/bin/python
> #
> #
> #
> #
> """     An attempt to compare the use of the grep
>         command with a 'Pythonic' method of finding
>         all lines in a file that contain a string.
> """
> import string, os
> def pygrep(the_string,the_file):
>     """ Search for and return all occurrences of lines
>         in the_file that contain the_string.
>         This is simplistic and unprotected but its
>         purpose is only to learn how to make it fast. """
>     find = string.find
>     holder = open(the_file,'r')
>     while 1:
>         line = holder.readline()
>         if not line:
>             break
>         if find(line,the_string) <> -1:
>             print line
> def mygrep(the_string,the_file):
>     """ This is usually what I do when I need this. """
>     command = 'grep %s %s' % (the_string,the_file)
>     response = os.popen(command,'r')
>     lines = response.read()
>     print lines
> if __name__ == '__main__':
>     import time
>     test_file = "/usr/local/devices/motor.10"
>     test_string = "26086594"
>     first = time.time()
>     pygrep(test_string,test_file)
>     second = time.time()
>     mygrep(test_string,test_file)
>     third = time.time()
>     print "Python: %s" % ((second - first),)
>     print "OS    : %s" % ((third - second),)
> I run this and get:
> $ ./
> 94.89.251,
> 94.89.251,
> 94.89.251,
> 94.89.251,
> 94.89.251,
> 94.89.251,
> Python: 6.7979799509
> OS    : 0.0741490125656
> Am I missing a Pydiom that would narrow the gap or is using the OS
> the best way?
> -Doug-
