[Tutor] How to find a word in a string
Alan Gauld
alan.gauld at yahoo.co.uk
Tue May 4 04:53:27 EDT 2021
On 04/05/2021 02:09, Phil wrote:
> This is a bit trickier than I had at first thought, for example:
If you generalize the problem to parsing strings as actual text it
becomes a lot more difficult than you might think. Programming
languages are not natural language aware so they have no concept
of punctuation, words, phrases or sentences etc. For that you
need a natural language toolkit. Python has a library for this,
NLTK, but it has its own steepish learning curve and you have
to decide when your needs require its assistance.
However, the point is that without such a library you need to
do a lot of work to handle punctuation sensibly and reliably.
> test = 'Do this, then do that.'
>
> if 'this' in test.lower().split():
>     print('found')
> else:
>     print('not found')
> rather than only 'is' as a word. I also thought about stripping all
> punctuation but that seems to be unnecessarily complicated.
One way or another you need to deal with the punctuation
yourself, since split() only breaks on whitespace and leaves
'this,' with the comma attached. There are multiple options and I see
others have covered the most likely choices - regex and
translate - but there is a whole world of special cases
waiting to catch you out. Parsing natural language is
horrible.
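
For what it's worth, here is a rough sketch of those two choices
applied to the test string from the question. It assumes only ASCII
punctuation matters, and has_word is just a name made up for this
illustration, not anything from the standard library:

```python
import re
import string

test = 'Do this, then do that.'

# Option 1: delete ASCII punctuation with str.translate, then split
# on whitespace as before.
strip_punct = str.maketrans('', '', string.punctuation)
words_translate = test.lower().translate(strip_punct).split()

# Option 2: use a regex to pull out runs of word characters.
words_regex = re.findall(r'\w+', test.lower())


def has_word(word, text):
    """Return True if word occurs in text as a whole word (hypothetical helper)."""
    # \b matches at a word boundary, so 'is' will not match inside 'this'.
    pattern = r'\b' + re.escape(word) + r'\b'
    return re.search(pattern, text, re.IGNORECASE) is not None


print(words_translate)
print(words_regex)
print(has_word('this', test))
print(has_word('is', test))
```

Both options give the same word list for this string, but they
disagree as soon as an apostrophe turns up: translate turns "don't"
into "dont" while the regex splits it into "don" and "t". That is
exactly the kind of special case you have to decide about yourself.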
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos