[Tutor] re module- puzzling results when matching money
Dominik George
nik at naturalnet.de
Sat Aug 3 22:38:56 CEST 2013
Hi,
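\b does not match any characters. It is a zero-width assertion that matches only at a position between a word character (\w) and a non-word character (\W), or at the start or end of the string when a word character sits there. \w is [A-Za-z0-9_], so both "$" and the space in front of it are non-word characters. There is no word boundary between them, which means the leading \b can never match directly before "$" in your target, and the whole pattern fails.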
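Here is a minimal sketch of that behaviour, using a shortened copy of your target and a simplified pattern without the named groups. The lookbehind variant is only one possible workaround that I am assuming fits your use case; it is not taken from your script. The snippet should run unchanged on Python 2.7 and Python 3.

```python
import re

# Shortened copy of the target string from the script.
target = "Cost is $4.50. With a $.30 discount"

# \b is zero-width: it matches only between a \w and a \W character.
# " " and "$" are both \W, so r"\b\$" finds nothing in this target.
with_boundary = re.findall(r"\b\$\d*(?:\.\d{2})?", target)

# Assumed alternative: a negative lookbehind that only requires that "$"
# is not directly preceded by a word character.
with_lookbehind = re.findall(r"(?<!\w)\$\d*(?:\.\d{2})?", target)

# \b does match before "$" when a word character precedes it.
after_word_char = re.search(r"\b\$", "US$5")

print(with_boundary)
print(with_lookbehind)
print(after_word_char is not None)
```

The trailing \b in your pattern is less of a problem, because an amount normally ends in a digit followed by a space or punctuation, which is a real word boundary.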
-nik
Alex Kleider <akleider at sonic.net> wrote:
>#!/usr/bin/env python
>
>"""
>I've been puzzling over the re module and have a couple of questions
>regarding the behaviour of this script.
>
>I've provided two possible patterns (re_US_money):
>the one surrounded by the 'word boundary' meta sequence seems not to work
>while the other one does. I can't understand why the addition of the word
>boundary defeats the match.
>
>I also don't understand why the split method includes the matched text.
>Splitting only works as I would have expected if no groupings are used.
>
>If I've set this up as intended, the full body of this e-mail should be
>executable as a script.
>
>Comments appreciated.
>alex kleider
>"""
>
># file : tutor.py (Python 2.7, NOT Python 3)
>print 'Running "tutor.py" on an Ubuntu Linux machine. *********'
>
>import re
>
>target = \
>"""Cost is $4.50. With a $.30 discount:
>Price is $4.15.
>The price could be less, say $4 or $4.
>Let's see how this plays out: $4.50.60
>"""
>
># Choose one of the following two alternatives:
>re_US_money =\
>r"((?P<sign>\$)(?P<dollars>\d{0,})(?:\.(?P<cents>\d{2})){0,1})"
># The above provides matches.
># The following does NOT.
># re_US_money =\
># r"\b((?P<sign>\$)(?P<dollars>\d{0,})(?:\.(?P<cents>\d{2})){0,1})\b"
>
>pat_object = re.compile(re_US_money)
>match_object = pat_object.search(target)
>if match_object:
>    print "'match_object.group()' and 'match_object.span()' yield:"
>    print match_object.group(), match_object.span()
>    print
>else:
>    print "NO MATCH FOUND!!!"
>print
>print "Now will use 'finditer()':"
>
>print
>iterator = pat_object.finditer(target)
>i = 1
>for iter in iterator:
>    print
>    print "iter #%d: "%(i, ),
>    print iter.group()
>    print "'groups()' yields: '%s'."%(iter.groups(), )
>    print iter.span()
>    i += 1
>    sign = iter.group("sign")
>    dollars = iter.group("dollars")
>    cents = iter.group("cents")
>    print sign,
>    print " ",
>    if dollars:
>        print dollars,
>    else:
>        print "00",
>    print " ",
>    if cents:
>        print cents,
>    else:
>        print "00",
>
>print
>
>t = target
>sub_target = pat_object.sub("<insert value here>", t)
>print
>print "Printing substitution: "
>print sub_target
>split_target = pat_object.split(target)
>print "Result of splitting on the target: "
>print split_target
>
># End of script.
>_______________________________________________
>Tutor maillist - Tutor at python.org
>To unsubscribe or change subscription options:
>http://mail.python.org/mailman/listinfo/tutor
--
This message was sent from my Android mobile phone with K-9 Mail.