How can one index text documents for efficient similar-word search (for example, finding words close in spelling to a query word)? Are there existing modules for this? Which principles do search engines use for it?
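To make the question concrete, here is a minimal sketch of one common idea, a character trigram index, written under my own assumptions. The function names (`trigrams`, `build_index`, `similar_words`), the `$` padding, and the 0.3 similarity threshold are all illustrative choices, not taken from any particular module or search engine.

```python
from collections import defaultdict


def trigrams(word):
    """Return the set of character trigrams of a word, padded with '$'."""
    padded = f"$${word.lower()}$$"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def build_index(words):
    """Map each trigram to the set of vocabulary words that contain it."""
    index = defaultdict(set)
    for word in set(words):
        for gram in trigrams(word):
            index[gram].add(word)
    return index


def similar_words(query, index, threshold=0.3):
    """Return indexed words ranked by Jaccard similarity of trigram sets.

    The threshold is an arbitrary illustrative value (assumption).
    """
    query_grams = trigrams(query)
    # Only words sharing at least one trigram with the query are candidates,
    # so the full vocabulary never has to be scanned.
    candidates = set()
    for gram in query_grams:
        candidates |= index.get(gram, set())
    scored = []
    for word in candidates:
        grams = trigrams(word)
        score = len(query_grams & grams) / len(query_grams | grams)
        if score >= threshold:
            scored.append((score, word))
    # Highest score first; ties broken alphabetically.
    return [word for _, word in sorted(scored, key=lambda p: (-p[0], p[1]))]
```

With this sketch, one would build the index once from the vocabulary of the documents and then call `similar_words` per query word. Whether real search engines work along these lines, or use something like edit-distance automata instead, is part of what I am asking.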