prefix search on a large file
js
ebgssth at gmail.com
Fri Oct 13 09:06:45 EDT 2006
I ran the test the way you suggested, and it did not take long to
spot the stupid mistakes I had made. Thank you.
Here are the results with timeit(number=10000),
using a fresh copy of the list for every iteration:
0.331462860107
0.19401717186
0.186257839203
0.0762069225311
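
For readers following along, here is a minimal sketch of why the fresh copy matters when the timed function mutates its argument. The function `consume` and the list `data` are hypothetical stand-ins, not the functions from this thread; `consume` simply empties the list it receives, much as repeated `pop(0)` calls shrink `alist`.

```python
import timeit

def consume(items):
    """Hypothetical stand-in: empties the list it is given and counts the items."""
    count = 0
    while items:
        items.pop()
        count += 1
    return count

data = list(range(1000))

# Shared list: only the first call does real work,
# every later call receives an already-empty list.
shared = list(data)
calls_shared = []
t_shared = timeit.timeit(lambda: calls_shared.append(consume(shared)), number=5)

# Fresh copy per call: every call processes all 1000 items.
calls_fresh = []
t_fresh = timeit.timeit(lambda: calls_fresh.append(consume(list(data))), number=5)

print(calls_shared)  # items processed per call with the shared list
print(calls_fresh)   # items processed per call with fresh copies
```

Note that with this approach the cost of `list(data)` is included in every measurement, so the copy time shows up equally in all the candidates being compared.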
I tried my recursive function on my big, messy list.
It stops immediately with 'RuntimeError: maximum recursion depth exceeded'.
I hope this trial and error is getting me somewhere...
Anyway, thank you.
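
One possible way around the recursion limit is to drop the recursion altogether. The sketch below is not any of the functions timed in this thread; `remove_prefixed` is a hypothetical name, and it assumes that sorting the lines is acceptable and that the output may come back in sorted order rather than in the original order. In a sorted list, every line that starts with a given prefix sits directly after that prefix, so one comparison against the last kept line is enough.

```python
def remove_prefixed(lines):
    """Return the lines that do not start with any other line in the input.

    Hypothetical helper, iterative, so it cannot hit the recursion limit.
    The result is in sorted order, and exact duplicates are dropped as well.
    """
    kept = []
    for line in sorted(lines):
        # After sorting, anything extending a kept line follows it directly,
        # so checking only the most recently kept line is sufficient.
        if kept and line.startswith(kept[-1]):
            continue
        kept.append(line)
    return kept

sample = ["foobar", "foo", "bar", "barn", "baz"]
result = remove_prefixed(sample)
print(result)

# A list far longer than the default recursion limit is handled without error.
big = ["%06d" % n for n in range(50000)]
big_result = remove_prefixed(big)
```

An empty line is a prefix of every other line, so with this sketch a single empty string in the input removes everything else; filtering out empty lines first may be what is wanted.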
On 10/13/06, Peter Otten <__peter__ at web.de> wrote:
> js wrote:
>
> > By eliminating list cloning, my function got much faster than before.
> > I really appreciate you, John.
> >
> > def prefixdel_recursively2(alist):
> >     if len(alist) < 2:
> >         return alist
> >
> >     first = alist.pop(0)
> >     unneeded = [no for no, line in enumerate(alist)
> >                 if line.startswith(first)]
> >     adjust = 0
> >     for i in unneeded:
> >         del alist[i+adjust]
> >         adjust -= 1
> >
> >     return [first] + prefixdel_recursively(alist)
> >
> >
> > process stime
> > prefixdel_stupidly : 11.9247150421
> > prefixdel_recursively : 14.6975700855
> > prefixdel_recursively2 : 0.408113956451
> > prefixdel_by_john : 7.60227012634
>
> Those are suspicious results. Time it again with number=1, or a fresh copy
> of the data for every iteration.
>
> I also have my doubts whether sorting by length is a good idea. To take it
> to the extreme: what if your data file contains an empty line?
>
> Peter