What you want is called "natural language processing" (NLP), and a great many research papers have been written on the topic. Search for those keywords in your favorite research paper index, such as Google Scholar.