On Mon, Aug 1, 2011 at 10:56 AM, Christopher King wrote:
> On Sun, Jul 31, 2011 at 8:41 PM, Devin Jeanpierre wrote:
>> Could you elaborate on the change? I don't understand your modification.
>> The regex is a different one than the original, as well.
>
> What do you mean by elaborate on the change? You mean explain? I guess I
> could do it in more detail.
By "elaborate on the change", I expect Devin meant a more accurate description of the problem you're trying to solve, without the confusing and irrelevant noise about named groups. Specifically:
>>> import re
>>> match = re.search('^([a-z])*$', 'abcz')
>>> match.groups()
('z',)
You're asking for '*' and '+' to change the group numbers based on the number of matches that actually occur. This is untenable, which should become clear as soon as another group is placed after the looping constructs:
>>> match = re.search('^([a-y])*(.*)$', 'abcz')
>>> match.groups()
('c', 'z')
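A quick way to check this, offered as a sketch rather than anything definitive: a compiled pattern object reports its group count through its `groups` attribute before it has seen any input, and every match it produces has exactly that many slots, however many times the loop runs. The sample strings below are only illustrative.

```python
import re

pattern = re.compile('^([a-y])*(.*)$')

# The group count belongs to the compiled pattern, not to any match.
print(pattern.groups)

for text in ('z', 'abcz', 'abcdefz'):
    match = pattern.search(text)
    # Always two slots: group 1 keeps only the last loop iteration
    # (or None if the loop never ran), group 2 keeps the remainder.
    print(text, match.groups())
```

If the group numbers depended on the number of iterations, the second group would have no stable index to be retrieved by.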
Group names/numbers are assigned when the regex is compiled. They cannot be affected by runtime information based on the string being processed. The way to handle this (while still using the re module to do the parsing) is multi-level parsing:
>>> match = re.search('^([a-z]*)$', 'abcz')
>>> relevant = match.group(0)
>>> pattern = re.compile('([a-z])')
>>> for match in pattern.finditer(relevant):
...     print(match.groups())
...
('a',)
('b',)
('c',)
('z',)
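If this comes up often, the two levels can be wrapped in a small function. The following is only a minimal sketch: `repeated_items` and its `outer` and `inner` parameters are hypothetical names made up for illustration, not part of the re module.

```python
import re

def repeated_items(text, outer=r'^([a-z]*)$', inner=r'[a-z]'):
    """Hypothetical helper: check the overall shape, then split it up.

    `outer` must contain one group capturing the repeated region;
    `inner` matches a single item within that region.
    """
    match = re.search(outer, text)
    if match is None:
        return None  # the string as a whole did not have the right shape
    # Second level: enumerate every item inside the captured region.
    return [m.group(0) for m in re.finditer(inner, match.group(1))]

print(repeated_items('abcz'))  # ['a', 'b', 'c', 'z']
print(repeated_items('abc1'))  # None
```

The outer pattern does the validation and the inner one does the iteration, so neither needs a number of groups that varies with the input.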
There's no reason to try to embed the functionality of finditer() into the regex itself (and it's utterly impractical to do so anyway).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia