prefix search on a large file
__peter__ at web.de
Thu Oct 12 19:23:34 CEST 2006
> By eliminating list cloning, my function got much faster than before.
> I really appreciate you, John.
> def prefixdel_recursively2(alist):
>     if len(alist) < 2:
>         return alist
>     first = alist.pop(0)
>     unneeded = [no for no, line in enumerate(alist)
>                 if line.startswith(first)]
>     adjust = 0
>     for i in unneeded:
>         del alist[i+adjust]
>         adjust -= 1
>     return [first] + prefixdel_recursively(alist)
> process                  stime
> prefixdel_stupidly     : 11.9247150421
> prefixdel_recursively  : 14.6975700855
> prefixdel_recursively2 : 0.408113956451
> prefixdel_by_john      : 7.60227012634
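Those are suspicious results. prefixdel_recursively2 modifies its argument
in place (pop() and del), so every timing iteration after the first works on
an already shrunken list. Time it again with number=1, or with a fresh copy
of the data for every iteration.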
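
One way to time a list-mutating function fairly is sketched below. It is
only an illustration: prefixdel_mutating is a simplified stand-in for the
quoted function, and time_with_fresh_copy is a hypothetical helper name,
neither taken from the thread. The copy is made outside the timed region so
that every run sees the full input.

```python
import time

def prefixdel_mutating(alist):
    # Simplified stand-in (assumption): like the quoted function, it
    # consumes `alist` while it works, so the caller's list is changed.
    result = []
    while alist:
        first = alist.pop(0)
        # Drop every remaining line that starts with `first`, in place.
        alist[:] = [line for line in alist if not line.startswith(first)]
        result.append(first)
    return result

def time_with_fresh_copy(func, data, repeat=5):
    # Hypothetical helper: copy before starting the clock, then time one call.
    timings = []
    for _ in range(repeat):
        fresh = list(data)
        start = time.perf_counter()
        func(fresh)
        timings.append(time.perf_counter() - start)
    return min(timings)

data = ["ab", "abc", "abd", "b", "bc", "c"]

# Reusing one list shows the problem: the second call has nothing left to do.
shared = list(data)
first_result = prefixdel_mutating(shared)
second_result = prefixdel_mutating(shared)

best = time_with_fresh_copy(prefixdel_mutating, data)
```

Here first_result is ['ab', 'b', 'c'] while second_result is [], which is
why repeated timings on a shared list look much faster than they should.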
I also have my doubts whether sorting by length is a good idea. To take it
to the extreme: what if your data file contains an empty line?
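
The empty-line case can be tried out with a small sketch. The thread's
length-sorting code is not shown in this message, so remove_prefixed below
is a hypothetical reconstruction of the idea (shortest lines first, keep a
line only if no kept line is a prefix of it), not the actual function under
discussion. Since every string starts with "", an empty line sorts first
and then eliminates everything else.

```python
def remove_prefixed(lines):
    # Hypothetical reconstruction (assumption): process shortest lines first
    # and keep a line only if none of the already kept lines is its prefix.
    kept = []
    for line in sorted(lines, key=len):
        if not any(line.startswith(prefix) for prefix in kept):
            kept.append(line)
    return kept

without_empty = remove_prefixed(["abc", "xyz", "ab"])
with_empty = remove_prefixed(["abc", "", "xyz", "ab"])
```

With these inputs without_empty is ['ab', 'xyz'], but with_empty collapses
to [''] because the empty line is a prefix of every other line.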