Regular expressions, help?

Chris Angelico rosuav at gmail.com
Thu Apr 19 11:21:43 EDT 2012


On Fri, Apr 20, 2012 at 1:07 AM, Sania <fantasyblue82 at gmail.com> wrote:
> Azrazer what you suggested works but I need to make sure that it
> catches numbers like 6,370 as well as 637. And I tried tweaking the
> regex around from the one you said in your reply but it didn't work
> (probably would have if I was more adept). But thanks!

Okay. Here's a general principle when working with regular
expressions: first match a negated character set to skip what you
don't want, then match the positive set to capture what you do. For
instance:

death toll[^0-9,]*([0-9,]+)

Note the parallel between what's inside the grouping parentheses and
what's before them. You're telling the regex engine to skip
everything that's not a digit or comma, then consume everything that
is a digit or comma. (I'm simplifying this by working with ASCII
only. YMMV if you need to handle other definitions of "digit"; the
same principle applies.)
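
As a minimal sketch of how that pattern might be used from Python
(the helper name find_toll and the sample sentences are made up for
illustration, not taken from your data), the captured text can be
stripped of commas and converted to an int:

```python
import re

# Negated set first (skip non-digits/non-commas), then the positive
# set inside the capturing group.
TOLL_RE = re.compile(r"death toll[^0-9,]*([0-9,]+)")

def find_toll(text):
    """Return the number after 'death toll' as an int, or None."""
    match = TOLL_RE.search(text)
    if match is None:
        return None
    # Drop thousands separators so "6,370" becomes "6370".
    digits = match.group(1).replace(",", "")
    # The group can be commas only (e.g. "death toll, they say").
    return int(digits) if digits else None

print(find_toll("The death toll has risen to 6,370 people."))  # 6370
print(find_toll("Officials put the death toll at 637."))       # 637
```

Note that a comma directly after "death toll" is itself captured by
the group, which is why the helper checks for an empty digit string
before converting.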

The other option is to use dot, but non-greedily. This accomplishes
the same thing, provided the number is on the same line as the
phrase (by default, dot does not match a newline):

death toll.*?([0-9,]+)
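
A rough side-by-side comparison in Python, in case it helps. The
sample strings and the capture helper are hypothetical; the one real
difference it shows is how the two patterns treat a line break
between the phrase and the number:

```python
import re

NEGATED = re.compile(r"death toll[^0-9,]*([0-9,]+)")
LAZY = re.compile(r"death toll.*?([0-9,]+)")
# re.DOTALL lets dot match newlines too.
LAZY_DOTALL = re.compile(r"death toll.*?([0-9,]+)", re.DOTALL)

def capture(pattern, text):
    """Return the first captured group, or None if there is no match."""
    m = pattern.search(text)
    return m.group(1) if m else None

one_line = "The death toll stands at 6,370 tonight."
wrapped = "The death toll\nstands at 637 tonight."

print(capture(NEGATED, one_line))     # 6,370
print(capture(LAZY, one_line))        # 6,370
print(capture(NEGATED, wrapped))      # 637
print(capture(LAZY, wrapped))         # None
print(capture(LAZY_DOTALL, wrapped))  # 637
```

The negated set happily crosses the newline because "\n" is neither
a digit nor a comma; the dot version needs re.DOTALL to do the same.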

I strongly recommend you pick up a hefty document on regular
expressions and get to know them thoroughly. It's an investment of
time, but you'll be working with less magic and more tools. In fact,
I'd recommend that for anyone who's doing more than the most trivial
work with regexps.

ChrisA


