[Tutor] Regular expressions: findall vs search

Alexander Q. redacted@example.com
Tue Jul 10 21:26:42 CEST 2012


I'm a bit confused about extracting data using re.search or re.findall.

Say I have the following code:

    tuples = re.findall(
        r'blahblah(\d+)yattayattayatta(\w+)moreblahblahblah(\w+)over',
        text)

So I'm searching 'text' for that pattern, and I intend to extract the parts
matched inside the parentheses. And it works: the variable "tuples", which
holds the return value of re.findall, is a list of tuples, each tuple
having 3 elements (which is what I wanted, since the pattern has 3 sets of
parentheses).
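
To make the behaviour described above concrete, here is a minimal sketch.
The pattern is the one from the code above; the sample string is invented
purely for illustration, since the real 'text' is not shown in this message.

```python
import re

# Hypothetical sample input: two whitespace-separated records that fit the pattern.
text = ("blahblah42yattayattayattafoomoreblahblahblahbarover "
        "blahblah7yattayattayattaspammoreblahblahblaheggsover")

tuples = re.findall(
    r'blahblah(\d+)yattayattayatta(\w+)moreblahblahblah(\w+)over',
    text)

# One tuple per match, one element per set of parentheses.
print(tuples)  # [('42', 'foo', 'bar'), ('7', 'spam', 'eggs')]
```

Each match contributes one tuple, and the literal "blahblah" and
"yattayattayatta" parts do not appear in the result.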

My question is: how does Python know to return just the parts in the
parentheses and not the "blahblah", the "yattayattayatta", etc.? The
re.search function returns a match object covering the whole match, and if
I want just the parenthesized parts, I call .group(1), .group(2) or
.group(3) on the result, depending on which set of parentheses I want. Does
the re.findall function by default ignore anything outside of the
parentheses and return only the parenthesized groups, gathered into one
tuple per match (i.e., the first element in "tuples" would be, as it is, a
tuple of 3 elements corresponding respectively to the 1st, 2nd, and 3rd
sets of parentheses)?
Thank you for reading.
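
The difference asked about can be checked with a short experiment. This is
only a sketch: the sample string is made up, and the variable names are
chosen for illustration. It relies on the behaviour documented for the re
module: when a pattern contains groups, findall returns the groups, and
when it contains none, findall returns the whole matches.

```python
import re

# Hypothetical sample input with a single record.
text = "blahblah42yattayattayattafoomoreblahblahblahbarover"
pattern = r'blahblah(\d+)yattayattayatta(\w+)moreblahblahblah(\w+)over'

# re.search returns a match object (or None if nothing matches).
match = re.search(pattern, text)
whole = match.group(0)   # the entire matched text
parts = match.groups()   # only the parenthesized groups, as a tuple

# re.findall with groups in the pattern returns just the groups.
found = re.findall(pattern, text)

# re.findall with no groups in the pattern returns the whole matches.
no_groups = re.findall(r'blahblah\d+', text)

# re.finditer yields match objects, so the whole match stays available.
whole_matches = [m.group(0) for m in re.finditer(pattern, text)]

print(whole)          # blahblah42yattayattayattafoomoreblahblahblahbarover
print(parts)          # ('42', 'foo', 'bar')
print(found)          # [('42', 'foo', 'bar')]
print(no_groups)      # ['blahblah42']
print(whole_matches)  # ['blahblah42yattayattayattafoomoreblahblahblahbarover']
```

So match.groups() from re.search lines up with one element of the list that
re.findall returns, and re.finditer is an option when both the whole match
and the groups are needed.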

-Alex

