[Tutor] My Spring project! [A dictionary that allows substrings for key lookup]

Danny Yoo dyoo@hkn.eecs.berkeley.edu
Mon, 18 Feb 2002 14:26:39 -0800 (PST)

Hi everyone,

I've heard so much good stuff about making extension libraries in SWIG
that I finally put my foot down and tried to actually apply it toward a
real project.

I've written a wrapper around Dan Gusfield's 'strmat' Suffix Tree library,
and I think it's all quite crazy.  *grin* If anyone's interested, I've
collected my notes and code here:


As a warning: there are definitely memory leaks in the wrapper.  Also,
parts of the 'strmat' library haven't been wrapped yet because they look a
little... well, fragile.  But it's still very interesting, and perhaps
even useful!

As an example, I've written a small SubstringDict class that allows one to
look up entries by any substring of their dictionary keys:

dyoo@coffeetable:~/pystrmat-0.6$ python SubstringDict.py
Reading the dictionary.  Please wait.
45392 words now indexed.
Please enter a word; I'll look for all entries in the dictionary
that have the word as a substring.
word? py
59 words found: 
['espy', 'copy', 'swampy', 'skimpy', 'sleepy', 'photocopy',
'photocopying', 'Skippy', 'canopy', 'preoccupy', 'anisotropy', 'lumpy',
'puppy', 'occupy', 'occupying', 'drippy', 'pygmies', 'pygmy', 'pyramid',
'papyrus', 'floppy', 'physiotherapy', 'Harpy', 'pyramids', 'jumpy',
'snoopy', 'droopy', 'copying', 'copyright', 'copyrightable',
'copyrighted', 'copyrights', 'copywriter', 'unhappy', 'microscopy',
'pyre', 'peppy', 'creepy', 'spy', 'spyglass', 'spying', 'philanthropy',
'sloppy', 'Cappy', 'python', 'entropy', 'psychotherapy', 'shipyard',
'soapy', 'poppy', 'happy', 'hardcopy', 'crappy', 'choppy', 'capybara',
'snappy', 'syrupy', 'spectroscopy', 'therapy']
word? moon
14 words found: 
['mooned', 'mooning', 'moonlight', 'moonlighter', 'moonlighting',
'moonlit', 'moons', 'moonshine', 'honeymoon', 'honeymooned',
'honeymooner', 'honeymooners', 'honeymooning', 'honeymoons']
word? tutor
9 words found: 
['statutorily', 'statutoriness', 'statutory', 'tutor', 'tutored',
'tutorial', 'tutorials', 'tutoring', 'tutors']

This may look mundane, but what's neat about this is that when it looks up
a substring key, it isn't scanning through every word in the dictionary;
the suffix tree index leads it to the matching entries.
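
To give a feel for how a lookup can avoid a full scan, here is a rough
pure-Python sketch.  It is not the pystrmat code and does not use a suffix
tree: as a simpler stand-in it keeps a sorted list of every suffix of every
key and binary-searches it with the standard bisect module.  The class
body, the internal names (_values, _suffixes), and the choice to return a
list of values ordered by key are all assumptions made for illustration.

```python
import bisect


class SubstringDict:
    """Illustrative stand-in: sorted suffix list, NOT the strmat suffix tree."""

    def __init__(self):
        self._values = {}     # full key -> stored value
        self._suffixes = []   # sorted list of (suffix, full key) pairs

    def __setitem__(self, key, value):
        # Index every suffix of a new key; any substring of the key is a
        # prefix of exactly one or more of these suffixes.
        if key not in self._values:
            for i in range(len(key)):
                bisect.insort(self._suffixes, (key[i:], key))
        self._values[key] = value

    def __getitem__(self, substring):
        # Binary search for the first suffix >= substring.  A 1-tuple sorts
        # before any 2-tuple that shares the same first element.
        i = bisect.bisect_left(self._suffixes, (substring,))
        found = set()
        # Suffixes sharing a prefix sit next to each other in sorted order,
        # so only the matching run is visited, not the whole index.
        while (i < len(self._suffixes)
               and self._suffixes[i][0].startswith(substring)):
            found.add(self._suffixes[i][1])
            i += 1
        return [self._values[k] for k in sorted(found)]


d = SubstringDict()
for word in ["tutor", "statutory", "moon", "honeymoon", "python"]:
    d[word] = word
print(d["tutor"])   # ['statutory', 'tutor']
```

The trade-off in this sketch is that building the index is slow (each
insort shifts list elements) and it stores every suffix as a separate
string, which a real suffix tree avoids.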

I still have to do some more optimization and code cleanup, but I hope
that this is useful for people.  I remember hearing a question about
this on Python-list a while back, so I'll forward this message to the main
list as well.