Number of distinct substrings of a string [continuation]

n00m n00m at narod.ru
Mon Nov 30 02:36:16 EST 2009


On Nov 29, 11:43 pm, Bearophile <bearophileH... at lycos.com> wrote:
> Anyway, you may try a pure-Python2.x implementation:http://suffixtree.googlecode.com/files/suffixtree-0.1.py

Ouch, Bearie... 214 lines of code are there.
Ok.. I tested it.
Firstly I replaced all "print " with "pass##print "
(aiming to avoid intensive printing from inside of the code).

Then I left in its final part only building of suffix trees.

==================================================================
...
...
...

from time import time
t = time()

import sys
sys.stdin = open('D:/88.txt', 'rt')
f = sys.stdin.read().split()
sys.stdin.close()

z = open('D:/99.txt', 'wt')

for tc in range(int(f[0])):
    s = f[tc + 1]
    test_str = s + '$'
    POSITIVE_INFINITY = len(test_str) - 1
    suffix_tree = SuffixTree(test_str)
    print >> z, 'len(s) =', len(s)

print >> z, 'time =', time() - t
z.close()
============================================================

Output:
============================================================
len(s) = 1000
len(s) = 1000
len(s) = 10000
time = 0.641000032425
============================================================

0.64s > 0.48s (of my algo)
I.e. the code can't help in my very special and narrow case.

But of course it is worthy to be studied and memoized.
E.g.:
from huge string "s" we built its Suffix Tree.
Now we are given arbitrary string "ss".
Task: to find the largest its prefix occured in "s",
traversing its Suffix Tree.







More information about the Python-list mailing list