[Tutor] re module- puzzling results when matching money

Dominik George nik at naturalnet.de
Sat Aug 3 22:38:56 CEST 2013


Hi,

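\b does not match any characters itself; it is a zero-width assertion that succeeds only at a word boundary, i.e. between a \w character and a \W character (or between a \w character and the start or end of the string). \w is [A-Za-z0-9_], so $ is a non-word character. In your target every $ is preceded by a space, which is also a non-word character, so there is no word boundary in front of the $ and the leading \b makes the whole pattern fail.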
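
A small sketch of this, in case it helps. It uses a simplified pattern without the named groups and a shortened target string (both are my own simplifications, not your original script), and it is written so that it should run under Python 2.7 and Python 3 alike. The (?<!\w) lookbehind at the end is only one possible replacement for the leading \b, assuming you want to reject a $ that directly follows a word character.

```python
import re

target = "Cost is $4.50. With a $.30 discount."

# Without \b, both amounts are found.
plain = re.compile(r"\$\d*(?:\.\d{2})?")
plain_hits = plain.findall(target)
print(plain_hits)

# With a leading \b nothing is found: '$' is a non-word character and
# the space in front of it is one too, so there is no boundary there.
bounded = re.compile(r"\b\$\d*(?:\.\d{2})?")
bounded_hits = bounded.findall(target)
print(bounded_hits)

# \b before '$' succeeds only if a word character precedes the '$'.
glued_hits = bounded.findall("USD$4.50")
print(glued_hits)

# Possible alternative (an assumption about what you want): require
# that the '$' is NOT preceded by a word character.
lookbehind = re.compile(r"(?<!\w)\$\d*(?:\.\d{2})?")
lookbehind_hits = lookbehind.findall(target)
print(lookbehind_hits)
```

The trailing \b in your pattern is less of a problem, because a digit followed by a space or a full stop is a genuine word boundary.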

-nik



Alex Kleider <akleider at sonic.net> wrote:
>#!/usr/bin/env python
>
>"""
>I've been puzzling over the re module and have a couple of questions
>regarding the behaviour of this script.
>
>I've provided two possible patterns (re_US_money):
>the one surrounded by the 'word boundary' meta sequence seems not to work
>while the other one does. I can't understand why the addition of the word
>boundary defeats the match.
>
>I also don't understand why the split method includes the matched text.
>Splitting only works as I would have expected if no groupings are used.
>
>If I've set this up as intended, the full body of this e-mail should be
>executable as a script.
>
>Comments appreciated.
>alex kleider
>"""
>
># file :  tutor.py (Python 2.7, NOT Python 3)
>print 'Running "tutor.py" on an Ubuntu Linux machine. *********'
>
>import re
>
>target = \
>"""Cost is $4.50. With a $.30 discount:
>Price is $4.15.
>The price could be less, say $4 or $4.
>Let's see how this plays out:  $4.50.60
>"""
>
># Choose one of the following two alternatives:
>re_US_money =\
>r"((?P<sign>\$)(?P<dollars>\d{0,})(?:\.(?P<cents>\d{2})){0,1})"
># The above provides matches.
># The following does NOT.
># re_US_money =\
># r"\b((?P<sign>\$)(?P<dollars>\d{0,})(?:\.(?P<cents>\d{2})){0,1})\b"
>
>pat_object = re.compile(re_US_money)
>match_object = pat_object.search(target)
>if match_object:
>     print "'match_object.group()' and 'match_object.span()' yield:"
>     print match_object.group(), match_object.span()
>     print
>else:
>     print "NO MATCH FOUND!!!"
>print
>print "Now will use 'finditer()':"
>
>print
>iterator = pat_object.finditer(target)
>i = 1
>for iter in iterator:
>     print
>     print "iter #%d: "%(i, ),
>     print iter.group()
>     print "'groups()' yields: '%s'."%(iter.groups(), )
>     print iter.span()
>     i += 1
>     sign = iter.group("sign")
>     dollars = iter.group("dollars")
>     cents = iter.group("cents")
>     print sign,
>     print "  ",
>     if dollars:
>         print dollars,
>     else:
>         print "00",
>     print "  ",
>     if cents:
>         print cents,
>     else:
>         print "00",
>
>print
>
>t = target
>sub_target = pat_object.sub("<insert value here>", t)
>print
>print "Printing substitution: "
>print sub_target
>split_target = pat_object.split(target)
>print "Result of splitting on the target: "
>print split_target
>
># End of script.
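
On the split question: as far as I know this is documented behaviour of re.split. If the pattern contains capturing groups, the text of every group is returned as part of the result list as well. Your pattern has four capturing groups (the outer one plus sign, dollars and cents), so each match contributes four extra items, with None for a group that did not take part in the match. The sketch below uses a shortened string and simplified patterns of my own to show the difference; non-capturing (?:...) groups avoid the effect.

```python
import re

text = "Cost is $4.50 today"

# One capturing group: the matched text shows up in the result.
with_group = re.split(r"(\$\d+\.\d{2})", text)
print(with_group)

# No groups: only the pieces between the matches are returned.
without_group = re.split(r"\$\d+\.\d{2}", text)
print(without_group)

# Non-capturing groups behave like no groups at all.
non_capturing = re.split(r"(?:\$\d+)(?:\.\d{2})?", text)
print(non_capturing)
```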
