[issue6931] awful performance in difflib: ndiff and HtmlDiff
Heiðar Rafn Harðarson
report at bugs.python.org
Thu Sep 17 16:54:58 CEST 2009
New submission from Heiðar Rafn Harðarson <heidar.rafn at hrolfsskali.net>:
A relatively small set of lines, with differences in most of them, can
destroy the performance of difflib.HtmlDiff.make_table and difflib.ndiff.
I am using it like this:
...
htmldiffer = HtmlDiff()
return htmldiffer.make_table(src_lines, dst_lines,
                             fromdesc="file1",
                             todesc="file2",
                             context=True)
I have written the src_lines and dst_lines to files and tried this with
the Tools/scripts/diff.py wrapper, with the same results when using the
switches -m or -n.
The performance is fine when using difflib.unified_diff or the -u switch
of diff.py.
Attached are files that show this clearly.
left200.txt, right200.txt - 200 lines of text - duration 11 sec
left500.txt, right500.txt - 500 lines of text - duration 2 min 58 sec
left1000.txt, right1000.txt - 1000 lines of text - duration 29 min 4 sec
Tested on an Intel dual-core T2500 2 GHz with 2 GB of memory, Python 2.5.2
on Ubuntu. Same problem with Python 2.6 on Fedora 11.
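For anyone without the attachment at hand, the comparison can be
approximated with a small timing script. This is only a sketch under
stated assumptions: it is written for Python 3 (time.perf_counter does not
exist in 2.5/2.6), the input is synthetic rather than the attached files,
and the helper names make_lines and time_call are made up for
illustration. Absolute timings will differ by machine and Python version;
raise the line count to see how each function scales.

```python
import difflib
import random
import time


def make_lines(n, seed=0):
    """Return n synthetic lines and a copy in which most lines differ slightly."""
    rng = random.Random(seed)
    words = ["alpha", "beta", "gamma", "delta", "epsilon", "zeta"]
    left = [" ".join(rng.choice(words) for _ in range(8)) + "\n"
            for _ in range(n)]
    right = []
    for line in left:
        # Change one word in roughly 90% of the lines, so that few lines
        # match exactly but most remain similar to their counterpart.
        if rng.random() < 0.9:
            parts = line.split()
            parts[rng.randrange(len(parts))] = "changed"
            line = " ".join(parts) + "\n"
        right.append(line)
    return left, right


def time_call(func):
    """Call func() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func()
    return result, time.perf_counter() - start


# 60 lines keeps the run short; the report used 200, 500 and 1000 lines.
left, right = make_lines(60)

unified, t_unified = time_call(
    lambda: list(difflib.unified_diff(left, right, "file1", "file2")))
ndiff_out, t_ndiff = time_call(
    lambda: list(difflib.ndiff(left, right)))
table, t_table = time_call(
    lambda: difflib.HtmlDiff().make_table(
        left, right, fromdesc="file1", todesc="file2", context=True))

print("unified_diff: %.3f s" % t_unified)
print("ndiff:        %.3f s" % t_ndiff)
print("make_table:   %.3f s" % t_table)
```

The generators are wrapped in list() so that the diff is actually computed
inside the timed call rather than lazily afterwards.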
For reference, the kdiff3 utility performs beautifully on these files.
----------
components: Library (Lib)
files: python.difflib.bug.tgz
messages: 92768
nosy: heidar.rafn
severity: normal
status: open
title: awful performance in difflib: ndiff and HtmlDiff
type: performance
versions: Python 2.5, Python 2.6
Added file: http://bugs.python.org/file14911/python.difflib.bug.tgz
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue6931>
_______________________________________