Faster way to do this?

Skip Montanaro skip at pobox.com
Wed May 21 09:06:19 EDT 2003


    >> Another, quite a bit faster method. Would generating a list of all
    >> possible words, then looking for them as dictionary keys be any
    >> faster?

    Greg> This will work, but be careful if your string length gets very
    Greg> long.  At 6 letters, you've got 720 permutations which would go
    Greg> rather quickly and not each too much memory.  At 8 letters, it
    Greg> jumps to 40,320 permutations and at 10, it's already above 3.5
    Greg> million.

In private email with Freddie (I think he got confused by my cc to him in my
original post) I sent him a slight modification of my original script
(appended below).  It records all prefixes of the words it places in the
dictionary, greatly increasing the startup time (from about 1 second to
about 10 seconds), but also pruning the permutation tree quite a bit.
Runtime looks like this on my Powerbook:

    % python findwords.py somnambulistexplos
    loaded 832166 words into dictionary in 11.02 seconds
    found 3749 words in 10.43 seconds

Warning: I wrote this script around 2am.  I'm sure it could use further
improvement, and while I believe it accurately computed the hidden words
correctly, I'm not motivated to exhaustively check each of the 3749 words it
found in "somnambulistexplos". ;-)

Skip

import sys
import time

WORD_FILE = "/usr/share/dict/words"

start = time.time()

dictionary = {"p:": True}
wf = open(WORD_FILE, 'r')
for line in wf:
    word = line.strip()
    dictionary[word] = True
    for i in range(1,len(word)):
        dictionary["p:"+word[0:i]] = True
print "loaded %d words into dictionary in %.2f seconds" % (len(dictionary),
                                                           time.time()-start)

def findwords(dictionary, letters, prefix="", seen={}):
    if "p:"+prefix not in dictionary:
        return seen
    for i in range(len(letters)):
        newprefix = prefix+letters[i]
        if newprefix in dictionary and newprefix not in seen:
            seen[newprefix] = True
        findwords(dictionary, letters[0:i]+letters[i+1:], newprefix, seen)
    return seen

if __name__ == "__main__":
    start = time.time()
    seen = findwords(dictionary, sys.argv[1].lower())
    print "found %d words in %.2f seconds" % (len(seen), time.time()-start)






More information about the Python-list mailing list