[Tutor] Textparsing, a better way?
Zak Arntson
zak@harlekin-maus.com
Tue May 6 15:03:01 2003
> At 09:41 2003-05-06 -0700, Zak Arntson wrote:
>> I'm working on my text adventure text parser (think Zork), and have
>> created the following code to turn a sentence into a list of words and
>> punctuation. E.g.: "Sailor, throw me the bottle. Get bottle" ->
>> ['sailor',',','throw','me','the','bottle','.','get','bottle']
>
> Is this what you want? (I threw in support for ?, ! and - as well.)
>
> >>> import re
> >>> t = "Sailor, throw me the bottle. Get bottle"
> >>> b = re.compile(r'\S+?\b|[\.,:;\-\?!]')
> >>> b.findall(t.lower())
> ['sailor', ',', 'throw', 'me', 'the', 'bottle', '.', 'get', 'bottle']
Oh man. I completely missed the findall method, even _after_ going through
the help documentation and dir(). Thanks a ton!
And thanks for throwing in the extra functionality, to boot! And here I was
lamenting the absence of a 'get a list of matches' function. :)
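
For anyone who wants to run the quoted session as a script, here is a
minimal sketch that wraps the same pattern in a function. The names
TOKEN_RE and tokenize are made up for illustration and do not come from
the thread.

```python
import re

# Same pattern as in the quoted session: a lazy run of non-whitespace
# that stops at the first word boundary, or a single punctuation mark.
TOKEN_RE = re.compile(r'\S+?\b|[\.,:;\-\?!]')

def tokenize(sentence):
    """Return the lower-cased words and punctuation marks of a sentence."""
    # findall returns every non-overlapping match as a list of strings.
    return TOKEN_RE.findall(sentence.lower())

print(tokenize("Sailor, throw me the bottle. Get bottle"))
print(tokenize("Stop! Who goes there?"))
```

The second call exercises the added '!' and '?' support; each mark comes
back as its own list item.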
Note, for anyone who'll be following my text adventure code in the future:
I need '-' to be part of a word (like 'fixed-width' or 'blue-green'), so
I'm going to change the above expression to: r'[\S\-]+?\b|[\.,:;\-\?!]'
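
One caveat worth checking before relying on that expression: '-' is
already a non-whitespace character, so [\S\-] matches exactly what \S
matches, and \b still falls between a letter and a hyphen. The lazy match
therefore still stops at the hyphen. The sketch below shows that, followed
by one possible alternative that keeps hyphenated words whole. The
alternative pattern and the names QUOTED_RE, HYPHEN_RE and
tokenize_hyphenated are assumptions for illustration, not something from
this thread.

```python
import re

# The expression from the note above; it behaves the same as the
# original because [\S\-] and \S match the same characters.
QUOTED_RE = re.compile(r'[\S\-]+?\b|[\.,:;\-\?!]')
print(QUOTED_RE.findall("fixed-width"))   # ['fixed', '-', 'width']

# Assumed alternative: a word is one or more runs of word characters
# joined by single hyphens; anything else must be a listed punctuation mark.
HYPHEN_RE = re.compile(r'\w+(?:-\w+)*|[.,:;\-?!]')

def tokenize_hyphenated(sentence):
    """Split a sentence into lower-cased tokens, keeping 'blue-green' whole."""
    return HYPHEN_RE.findall(sentence.lower())

print(tokenize_hyphenated("Get the blue-green bottle."))
# ['get', 'the', 'blue-green', 'bottle', '.']
```

A free-standing hyphen ("wait - go") still comes back as its own token.
Note that \w also accepts digits and underscores, and that characters
outside \w and the listed punctuation (an apostrophe, say) are silently
skipped by this version, which differs from the original pattern.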
Thanks again!
--
Zak Arntson
www.harlekin-maus.com - Games - Lots of 'em