2011/11/25 Serhat Sevki Dincer <jfcgauss@gmail.com>:
I wrote a tiny grep with multi-line match support and compared its speed under PyPy 1.7 against grep and CPython 2.7.1 (on an Ubuntu 11.04 laptop). It uses no special algorithm or implementation, just the bare re module.
input: Plone 4.1.2 eggs directory, 286 MB in size; the input actually processed is probably about 75 MB, across 3958 files in total
commands:
time mgrp -lcrN '\.py$' for . takes 1.95s
time python2.7 /usr/local/bin/mgrp -lcrN '\.py$' for . takes 1.45s
time grep -lcr --color=none --include='*.py' for . takes 0.6s
Is the input too small to see the benefits of pypy?
It would be instructive to see the code, but if you're expecting it to be as fast as grep, think again. grep has extremely well-tuned, clever algorithms.
--
Regards,
Benjamin
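
Since the mgrp source was not posted in this thread, here is a minimal sketch of what a "bare re module" grep with multi-line matching might look like. It is not the original poster's code. The function name count_matches and its parameters are hypothetical, and it only approximates the idea of counting matches in files whose names match a regex; it does not reproduce mgrp's actual flags.

```python
import io
import os
import re


def count_matches(root, pattern, name_pattern):
    """Count regex matches per file under root (hypothetical helper).

    Each file is read whole, so a pattern may span several lines,
    e.g. by containing an explicit newline or whitespace class.
    Returns a dict mapping path -> match count for files with >= 1 match.
    """
    # re.MULTILINE makes ^ and $ anchor at every line boundary.
    content_re = re.compile(pattern, re.MULTILINE)
    name_re = re.compile(name_pattern)
    results = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name_re.search(name):
                continue
            path = os.path.join(dirpath, name)
            try:
                # Assumption: treat files as UTF-8 and replace bad bytes.
                with io.open(path, encoding="utf-8", errors="replace") as handle:
                    text = handle.read()
            except (IOError, OSError):
                continue  # skip unreadable files
            count = sum(1 for _ in content_re.finditer(text))
            if count:
                results[path] = count
    return results
```

Something like count_matches(".", "for", r"\.py$") would then roughly correspond to the search timed above. A loop of this shape spends much of its time in file I/O and inside the regex engine rather than in Python-level code, which may be one reason a JIT has little to speed up here; that is a guess, not a measurement.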