
Dear Idealists, I notice that when you use grouping in the re module, it only tells you what the group matched last: Dumb Real Python Code:
Cool Improved Python Code: Now we see everything that it matched. The problem with this, as with all ideas, is backward compatibility, so it would have to be an addition instead.
Notice how I added an extra P. I also made matching it in the text more adaptable. Please consider this idea. Sincerely, Me

Could you elaborate on the change? I don't understand your modification. The regex is a different one than the original, as well. I do agree that remembering all the groups would be nice, at least if it could be done reasonably. Devin On Sun, Jul 31, 2011 at 8:36 PM, Christopher King <g.nius.ck@gmail.com> wrote:

On Sun, Jul 31, 2011 at 8:41 PM, Devin Jeanpierre <jeanpierreda@gmail.com> wrote:
Could you elaborate on the change? I don't understand your modification. The regex is a different one than the original, as well.
What do you mean by "elaborate on the change"? You mean explain? I guess I could do it in more detail. What would happen is that if you do something like match = re.search('^(?PP<tag>[a-z])*$', 'abc'), then match.groupdict() would return {'tag.0': 'a', 'tag.1': 'b', 'tag.2': 'c', 'tag.-1': 'c', 'tag.-2': 'b', 'tag.-3': 'a'}. Notice the PP. This means that it will save every time the group matches. It does this by adding a decimal point and an index after the tag. It also supports negative indexing in case you want the last time it matched. All of these can be used with the old backreference syntax, e.g. (?P=tag.-2). Also, are there any forbidden characters in a tag? It would be good to use one so it won't clash with current tags.
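
For comparison, here is a minimal sketch of what the standard re module does today with the single-P form of the same pattern. The PP syntax above is only a proposal and is not accepted by re; the variable names here are illustrative.

```python
import re

# Standard named group (single P). The group sits inside a '*' loop,
# so it is matched once per letter of 'abc'.
match = re.search(r'^(?P<tag>[a-z])*$', 'abc')

# Only the final repetition of the group is kept.
last_only = match.groupdict()
print(last_only)  # {'tag': 'c'}
```

The earlier repetitions ('a' and 'b') are not recoverable from the match object, which is the limitation the proposal is about.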

On Mon, Aug 1, 2011 at 10:56 AM, Christopher King <g.nius.ck@gmail.com> wrote:
By elaborate on the change, I expect Devin meant a more accurate description of the problem you're trying to solve without the confusing and irrelevant noise about named groups. Specifically:
You're asking for '*' and '+' to change the group numbers based on the number of matches that actually occur. This is untenable, which should become clear as soon as another group is placed after the looping constructs:
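
A minimal sketch of this point (an illustration with a made-up pattern, not the example from the original message): the second group keeps the same number however many times the looped group before it matches.

```python
import re

# Two groups: (a) inside a '*' loop, followed by (b).
pattern = re.compile(r'(a)*(b)')

# The group count is known before any string has been seen.
print(pattern.groups)  # 2

short = pattern.match('ab')      # (a) matches once
longer = pattern.match('aaaab')  # (a) matches four times

# (b) is group 2 in both cases; it does not shift to group 5.
print(short.group(2), longer.group(2))  # b b
```

If each repetition of (a) were given its own number, the number of (b) would depend on the input string, and code written against the pattern could not refer to it reliably.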
Group names/numbers are assigned when the regex is compiled. They cannot be affected by runtime information based on the string being processed. The way to handle this (while still using the re module to do the parsing) is multi-level parsing:
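
One way multi-level parsing could look, as a sketch under assumptions: the patterns, the input format (letters followed by digits), and the parse() helper are all hypothetical, chosen only to show the two levels.

```python
import re

# Level 1: capture the whole repeated region as a single group.
OUTER = re.compile(r'^(?P<letters>[a-z]*)(?P<digits>[0-9]+)$')
# Level 2: split that region into its individual repetitions.
INNER = re.compile(r'[a-z]')

def parse(text):
    """Return (list of letters, digit string), or None if text does not match."""
    outer = OUTER.match(text)
    if outer is None:
        return None
    letters = [m.group(0) for m in INNER.finditer(outer.group('letters'))]
    return letters, outer.group('digits')

print(parse('abc42'))  # (['a', 'b', 'c'], '42')
```

The outer pattern has a fixed set of groups, and the variable-length part is handled by finditer() on the captured substring.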
There's no reason to try to embed the functionality of finditer() into the regex itself (and it's utterly impractical to do so anyway). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 01.08.2011 02:36, Christopher King wrote:
The "regex" module by Matthew Barnett already supports this: https://code.google.com/p/mrab-regex-hg/ Georg

On Mon, Aug 1, 2011 at 3:12 PM, Georg Brandl <g.brandl@gmx.net> wrote:
The "regex" module by Matthew Barnett already supports this:
The PyPI page is more helpful, since it has the docs: http://pypi.python.org/pypi/regex (the relevant section is the captures() API under "Repeated captures") So clearly it sets up the additional storage under the hood when the pattern is compiled. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
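
A short sketch of the captures() API mentioned above, assuming the third-party regex package is installed (it is not part of the standard library). The all_captures() helper and the fallback to None are illustrative additions, not part of that package.

```python
try:
    import regex  # third-party package from PyPI
except ImportError:
    regex = None  # the sketch degrades gracefully if it is missing

def all_captures(pattern, text, group):
    """Return every capture of `group`, or None if regex is unavailable."""
    if regex is None:
        return None
    match = regex.search(pattern, text)
    if match is None:
        return []
    # captures() returns all repetitions; group() returns only the last.
    return match.captures(group)

result = all_captures(r'^(?P<tag>[a-z])*$', 'abc', 'tag')
print(result)
```

With the package installed, this gives every repetition of the named group without any change to the pattern syntax.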

participants (6)
- Christopher King
- Devin Jeanpierre
- Georg Brandl
- Greg Ewing
- Nick Coghlan
- Tim Lesher