[Tutor] Textparsing, a better way?
Zak Arntson
zak@harlekin-maus.com
Tue May 6 12:42:03 2003
I'm working on the text parser for my text adventure (think Zork), and have
created the following code to turn a sentence into a list of words and
punctuation. E.g.: "Sailor, throw me the bottle. Get bottle" ->
['sailor',',','throw','me','the','bottle','.','get','bottle']
Here's my current code, but I can't help thinking there's room for
improvement. Any suggestions/comments? I couldn't find a way to have a
regular expression produce a list of all of its matches. I'd love to do
something like re.compile (r'(\w+)|([\.,:;])') and have that drive
something that builds a list of every occurring match of that regexp.
###
import re

def textparse (rawSentence):
    sentence = []
    reWord = re.compile (r'([\.,:;])')
    # first get rid of whitespace
    for chunk in re.compile (r'\s').split (rawSentence.strip ().lower ()):
        # now separate punctuation from words
        for word in reWord.split (chunk):
            if word:
                sentence.append (word)
    return sentence
###
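The "list of all matches" behaviour wished for above looks like what
re.findall in the standard re module provides. Below is a minimal sketch,
not a definitive replacement: the name textparse_findall is made up for
illustration, and the pattern drops the capturing groups on purpose, since
with groups re.findall returns tuples rather than plain strings.

```python
import re

# Hypothetical name, for illustration only.
# A token is either a run of word characters or one punctuation mark.
# No capturing groups, so findall returns a flat list of strings.
TOKEN = re.compile(r"\w+|[.,:;]")

def textparse_findall(rawSentence):
    # findall skips whatever matches neither alternative (whitespace here),
    # so no separate whitespace split is needed.
    return TOKEN.findall(rawSentence.strip().lower())

print(textparse_findall("Sailor, throw me the bottle. Get bottle"))
```

One difference from the split-based version: any character that matches
neither alternative is silently dropped, so "don't" comes back as 'don'
and 't'. Swapping \w+ for [^\s.,:;]+ should keep such words whole, if that
matters for the parser.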
--
Zak Arntson
www.harlekin-maus.com - Games - Lots of 'em