[Tutor] How to find a word in a string

Alan Gauld alan.gauld at yahoo.co.uk
Tue May 4 04:53:27 EDT 2021


On 04/05/2021 02:09, Phil wrote:
> This is a bit trickier than I had at first thought, for example:

If you generalize the problem to parsing strings as actual text it
becomes a lot more difficult than you might think. Programming
languages are not natural language aware so they have no concept
of punctuation, words, phrases or sentences etc. For that you
need a natural language toolkit. Python has a library for this,
NLTK, but it has its own steepish learning curve and you have
to decide when your needs require its assistance.

However, the point is that without such a library you need to
do a lot of work to handle punctuation sensibly and reliably.

> test = 'Do this, then do that.'
> 
> if 'this' in test.lower().split():
>      print('found')
> else:
>      print('not found')

> rather than only 'is' as a word. I also thought about stripping all
> punctuation but that seems to be unnecessarily complicated.

One way or another you need to deal with the punctuation
since Python can't. There are multiple options and I see
others have covered the most likely choices - regex and
translate - but there is a whole world of special cases
waiting to catch you out.  Parsing natural language is
horrible.
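
As a rough sketch of those two options (not a complete solution), the
following applies each to the test string quoted above. The function
names has_word_translate and has_word_regex are made up for
illustration; str.maketrans, str.translate, string.punctuation and the
re module are standard library. It assumes plain ASCII punctuation only.

```python
import re
import string


def has_word_translate(text, word):
    """Delete ASCII punctuation, then compare whole whitespace-separated tokens."""
    # Three-argument maketrans: every character in the third string is deleted.
    table = str.maketrans('', '', string.punctuation)
    return word.lower() in text.lower().translate(table).split()


def has_word_regex(text, word):
    """Search for the word between regex word boundaries, ignoring case."""
    # \b matches at a word boundary, so 'is' will not match inside 'this'.
    pattern = r'\b' + re.escape(word) + r'\b'
    return re.search(pattern, text, re.IGNORECASE) is not None


test = 'Do this, then do that.'

print(has_word_translate(test, 'this'))   # True
print(has_word_translate(test, 'is'))     # False
print(has_word_regex(test, 'that'))       # True
print(has_word_regex(test, 'is'))         # False

# One of the special cases: the two approaches disagree on apostrophes.
print(has_word_translate("I don't know", 'don'))  # False ("don't" becomes "dont")
print(has_word_regex("I don't know", 'don'))      # True (the apostrophe is a boundary)
```

Neither answer to the last case is obviously right, which is the kind
of decision you end up making by hand without a language toolkit.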

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos



