difflib optimization -- wow (tim peters?)

jepler at unpythonic.net
Mon Sep 16 16:25:05 EDT 2002


wow, what a speedup (roughly 35x in wall time)!  thanks, whoever it was (tim?)

    $ time python2.3 -O ~/pyunidiff.py foo.c foo2.c > /tmp/pat

    real    0m5.457s
    user    0m4.640s
    sys     0m0.110s
    $ time python2.2 -O ~/pyunidiff.py foo.c foo2.c > /tmp/pat

    real    3m14.042s
    user    3m7.480s
    sys     0m1.180s

foo.c and foo2.c are ~300k C source files from different releases of the
same piece of software.  the diff produced is nearly 250k.

pyunidiff is at http://unpythonic.net/~jepler/pyunidiff.py
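
For anyone who doesn't want to fetch the script, here is a minimal sketch of
producing a unified diff with difflib.  It is not the pyunidiff.py code; it
assumes a current Python 3 and uses the standard library's
difflib.unified_diff.  The helper name unified_diff_text and the tiny sample
inputs are made up for illustration.

```python
import difflib


def unified_diff_text(old_text, new_text, old_name="foo.c", new_name="foo2.c"):
    """Return a unified diff of two strings as one string (hypothetical helper)."""
    # keepends=True keeps the newlines, which unified_diff expects on each line
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    diff = difflib.unified_diff(
        old_lines, new_lines, fromfile=old_name, tofile=new_name
    )
    return "".join(diff)


# Tiny stand-ins for the two ~300k C files discussed above.
old = "int main(void)\n{\n    return 0;\n}\n"
new = "int main(void)\n{\n    return 1;\n}\n"

patch = unified_diff_text(old, new)
print(patch, end="")
```

On real files you would read the two paths named on the command line and
write the result to stdout, as in the timing runs above.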

tim, I only need another factor of 30 and it'll beat diff(1) -u in wall
time (or a factor of 5 to beat diff(1) -u -d).  It already gives diffs
about 5% shorter than diff -u on the files I've tried (on par with
diff -u -d).  I notice that the diff produced under 2.2 is marginally
smaller than the one from 2.3, by another 1.5% or so.  Is that the price
to pay for the optimization?

Jeff



