[Tutor] re module- puzzling results when matching money
Alex Kleider
akleider at sonic.net
Sat Aug 3 20:15:28 CEST 2013
#!/usr/bin/env python
"""
I've been puzzling over the re module and have a couple of questions
regarding the behaviour of this script.
I've provided two possible patterns (re_US_money):
the one surrounded by the 'word boundary' meta sequence does not seem
to work, while the other one does. I can't understand why the addition
of the word boundary defeats the match.
I also don't understand why the split method includes the matched text.
Splitting only works as I would have expected if no groupings are used.
If I've set this up as intended, the full body of this e-mail should be
executable as a script.
Comments appreciated.
alex kleider
"""
# file : tutor.py (Python 2.7, NOT Python 3)
print 'Running "tutor.py" on an Ubuntu Linux machine. *********'
import re
target = \
"""Cost is $4.50. With a $.30 discount:
Price is $4.15.
The price could be less, say $4 or $4.
Let's see how this plays out: $4.50.60
"""
# Choose one of the following two alternatives:
re_US_money =\
r"((?P<sign>\$)(?P<dollars>\d{0,})(?:\.(?P<cents>\d{2})){0,1})"
# The above provides matches.
# The following does NOT.
# re_US_money =\
# r"\b((?P<sign>\$)(?P<dollars>\d{0,})(?:\.(?P<cents>\d{2})){0,1})\b"
pat_object = re.compile(re_US_money)
match_object = pat_object.search(target)
if match_object:
    print "'match_object.group()' and 'match_object.span()' yield:"
    print match_object.group(), match_object.span()
    print
else:
    print "NO MATCH FOUND!!!"
    print
print "Now will use 'finditer()':"
print
iterator = pat_object.finditer(target)
i = 1
for iter in iterator:
    print
    print "iter #%d: "%(i, ),
    print iter.group()
    print "'groups()' yields: '%s'."%(iter.groups(), )
    print iter.span()
    i += 1
    sign = iter.group("sign")
    dollars = iter.group("dollars")
    cents = iter.group("cents")
    print sign,
    print " ",
    if dollars:
        print dollars,
    else:
        print "00",
    print " ",
    if cents:
        print cents,
    else:
        print "00",
    print
t = target
sub_target = pat_object.sub("<insert value here>", t)
print
print "Printing substitution: "
print sub_target
split_target = pat_object.split(target)
print "Result of splitting on the target: "
print split_target
# End of script.
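
The two behaviours asked about in the script can be reproduced on much shorter strings. The sketch below is not part of the original script: the sample strings and variable names are made up for illustration, and the (?<!\w) lookbehind is only one possible replacement for the leading \b. It relies on two documented properties of the re module: \b matches only between a word character and a non-word character (a space followed by '$' is two non-word characters, so there is no boundary there), and re.split() returns the text of every capturing group along with the pieces. It is written to run unchanged on Python 2.7 and Python 3.

```python
import re

text = "Cost is $4.50 or $4"

# Leading \b: needs a word character on exactly one side of the position.
# Every '$' here is preceded by a space, so the pattern never matches.
with_boundary = re.findall(r"\b\$\d*(?:\.\d{2})?\b", text)
print(with_boundary)      # []

# Possible alternative: require that '$' is NOT preceded by a word character.
with_lookbehind = re.findall(r"(?<!\w)\$\d*(?:\.\d{2})?\b", text)
print(with_lookbehind)    # ['$4.50', '$4']

# split() with a capturing group: the captured text is kept in the result.
capturing = re.split(r"(\$\d+)", "a $4 b")
print(capturing)          # ['a ', '$4', ' b']

# split() without capturing groups: only the surrounding pieces come back.
non_capturing = re.split(r"\$\d+", "a $4 b")
print(non_capturing)      # ['a ', ' b']
```

Since re_US_money in the script has four capturing groups (the outer group plus sign, dollars and cents), split() adds four items per match, with None for any group that did not take part in the match. Rewriting the groups as non-capturing (?:...) should give the plain split that was expected.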