Can I beat perl at grep-like processing speed?
Tim Smith
tim at bytesmith.us
Fri Dec 29 10:51:30 EST 2006
You may not be able to beat Perl's regex speed, but you can speed up your Python program by using map and filter instead of an explicit loop.
Here's a modified Python program that does your search faster:
#!/usr/bin/env python
import re
r = re.compile(r'destroy', re.IGNORECASE)
def stripit(x):
    return x.rstrip("\r\n")
print "\n".join( map(stripit, filter(r.search, file('bigfile'))) )
#time comparison on my machine
real 0m0.218s
user 0m0.210s
sys 0m0.010s

real 0m0.464s
user 0m0.450s
sys 0m0.010s
#original time comparison on my machine
real 0m0.224s
user 0m0.220s
sys 0m0.010s

real 0m0.508s
user 0m0.510s
sys 0m0.000s
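
For anyone reading this on Python 3, the map/filter program above might look roughly like the sketch below. The names strip_eol, grep_lines and grep_file are made up for illustration, and the latin-1 encoding is only an assumption chosen so that every byte in the file decodes. Note that map and filter are lazy in Python 3, so the timings in this message do not carry over to it.

```python
import re

# Same case-insensitive pattern as in the program above.
PATTERN = re.compile(r'destroy', re.IGNORECASE)


def strip_eol(line):
    """Drop trailing CR/LF characters, like stripit() above."""
    return line.rstrip("\r\n")


def grep_lines(lines, pattern=PATTERN):
    """Return the matching lines with their line endings removed."""
    # filter() and map() return lazy iterators in Python 3;
    # list() forces the whole pipeline to run.
    return list(map(strip_eol, filter(pattern.search, lines)))


def grep_file(path="bigfile"):
    """Run grep_lines() over a file (hypothetical helper).

    The encoding is an assumption: latin-1 maps every byte to a
    character, so decoding cannot fail. newline="" leaves the original
    line endings alone so strip_eol() handles them.
    """
    with open(path, encoding="latin-1", newline="") as f:
        return grep_lines(f)

# Usage, assuming the 'bigfile' from the thread exists:
#     print("\n".join(grep_file("bigfile")))
```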
Also, if you replace the regex with a test like lambda x: x.lower().find("destroy") != -1, you will get really close to Perl's speed (it's possible Perl even takes this shortcut itself when given such a simple regex).
#here are the times when doing the search this way
real 0m0.221s
user 0m0.210s
sys 0m0.010s

real 0m0.277s
user 0m0.280s
sys 0m0.000s
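
The substring test described above could be written out along these lines. This is a Python 3 sketch; grep_substring is a hypothetical name, and the "in" operator is swapped in for find() != -1, which it is equivalent to for this purpose. For plain ASCII text it should select the same lines as the re.IGNORECASE search, though the two can disagree on some non-ASCII characters.

```python
def grep_substring(lines, needle="destroy"):
    """Return lines containing needle, ignoring case, without line endings.

    Lower-casing each line and doing a plain substring test avoids the
    regex engine entirely.
    """
    needle = needle.lower()
    return [line.rstrip("\r\n") for line in lines if needle in line.lower()]

# Usage, assuming the 'bigfile' from the thread exists
# (the latin-1 encoding is an assumption, chosen so decoding cannot fail):
#     with open("bigfile", encoding="latin-1", newline="") as f:
#         print("\n".join(grep_substring(f)))
```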
-- Tim
-- On 12/29/06 "js " <ebgssth at gmail.com> wrote:
> Just my curiosity.
> Can python beats perl at speed of grep-like processing?
>
> $ wget http://www.gutenberg.org/files/7999/7999-h.zip
> $ unzip 7999-h.zip
> $ cd 7999-h
> $ cat *.htm > bigfile
> $ du -h bigfile
> du -h bigfile
> 8.2M bigfile
>
> ---------- grep.pl ----------
> #!/usr/local/bin/perl
> open(F, 'bigfile') or die;
>
> while(<F>) {
>     s/[\n\r]+$//;
>     print "$_\n" if m/destroy/oi;
> }
> ---------- END ----------
> ---------- grep.py ----------
> #!/usr/bin/env python
> import re
> r = re.compile(r'destroy', re.IGNORECASE)
>
> for s in file('bigfile'):
>     if r.search(s): print s.rstrip("\r\n")
> ---------- END ----------
>
> $ time perl grep.pl > pl.out; time python grep.py > py.out
> real 0m0.168s
> user 0m0.149s
> sys 0m0.015s
>
> real 0m0.450s
> user 0m0.374s
> sys 0m0.068s
> # I used python2.5 and perl 5.8.6
> --
> http://mail.python.org/mailman/listinfo/python-list