<div dir="ltr">Oh. Yes, that is being discussed about once a year too. It seems Matthew isn't very interested in helping out with the port, and there are some concerns about backwards compatibility with the `re` module. I think it needs a champion!<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Oct 27, 2017 at 8:50 AM, Tim Peters <span dir="ltr"><<a href="mailto:tim.peters@gmail.com" target="_blank">tim.peters@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Note that Matthew Barnett's `regex` module already supports \G, and a<br>
great many other features that weren't around 15 years ago ;-) either:<br>
<br>
<a href="https://pypi.python.org/pypi/regex/" rel="noreferrer" target="_blank">https://pypi.python.org/pypi/<wbr>regex/</a><br>
<br>
I haven't followed this in detail. I'm just surprised once per year<br>
that it hasn't been folded into the core ;-)<br>
<br>
[nothing new below]<br>
<div><div class="h5"><br>
On Fri, Oct 27, 2017 at 10:35 AM, Guido van Rossum <<a href="mailto:guido@python.org">guido@python.org</a>> wrote:<br>
> The "why" question is not very interesting -- it probably wasn't in PCRE and<br>
> nobody was familiar with it when we moved off PCRE (maybe it wasn't even in<br>
> Perl at the time -- it was ~15 years ago).<br>
><br>
> I didn't understand your description of \G so I googled it and found a<br>
> helpful StackOverflow article:<br>
> <a href="https://stackoverflow.com/questions/21971701/when-is-g-useful-application-in-a-regex" rel="noreferrer" target="_blank">https://stackoverflow.com/<wbr>questions/21971701/when-is-g-<wbr>useful-application-in-a-regex</a>.<br>
> From this I understand that when using e.g. findall() it forces successive<br>
> matches to be adjacent.<br>
><br>
> In general this seems to be a unique property of \G: it preserves *state*<br>
> from one match to the next. This will make it somewhat difficult to<br>
> implement -- e.g. that state should probably be thread-local in case<br>
> multiple threads use the same compiled regex. It's also unclear when that<br>
> state should be reset. (Only when you compile the regex? Each time you pass<br>
> it a different source string?)<br>
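<br>
One way to read the "state" question above is that the position can be carried by the caller instead of by the compiled regex. A minimal sketch, assuming that passing an explicit start offset to the existing `Pattern.match(string, pos)` method is an acceptable stand-in for \G; the helper name `adjacent_matches` is hypothetical and not part of `re`:<br>
<pre>
```python
import re

def adjacent_matches(pattern, text, pos=0):
    # Hypothetical helper (not part of re): yield matches that each start
    # exactly where the previous one ended, stopping at the first gap.
    while True:
        m = pattern.match(text, pos)  # anchored at pos, like \G would be
        if m is None:
            return
        yield m
        if m.end() == pos:
            return  # zero-width match: stop rather than loop forever
        pos = m.end()  # the only "state", held in a local variable

number = re.compile(r'\d+,?')
text = "12,34,x,56"

# findall() skips over the "x," and keeps going.
skipping = number.findall(text)
# The adjacent version stops at the first position that does not match.
adjacent = [m.group() for m in adjacent_matches(number, text)]
```
</pre>
Because the offset lives in a local variable, nothing is shared between threads, and the "reset" simply happens each time the helper is called.<br>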
><br>
> So I'm not sure it's reasonable to add. But I also don't see a reason why it<br>
> shouldn't be added -- presuming we can decide on good answers to the<br>
> questions above about the "scope" of the anchor.<br>
><br>
> I think it's okay to start a discussion on <a href="http://bugs.python.org" rel="noreferrer" target="_blank">bugs.python.org</a> about the precise<br>
> specification of \G for Python. OTOH I expect that most core devs won't find<br>
> this a very interesting problem (Python relies on regexes for parsing a lot<br>
> less than Perl does).<br>
><br>
> Good luck!<br>
><br>
> On Thu, Oct 26, 2017 at 11:03 PM, Ed Peschko <<a href="mailto:horos22@gmail.com">horos22@gmail.com</a>> wrote:<br>
>><br>
>> All,<br>
>><br>
>> Perl has a regex assertion (\G) that allows multiple-match regular<br>
>> expressions to be able to use the position of the last match. Perl's<br>
>> documentation puts it this way:<br>
>><br>
>> \G Match only at pos() (e.g. at the end-of-match position of prior<br>
>> m//g)<br>
>><br>
>> Anyways, this is exceedingly powerful for matching regularly<br>
>> structured free-form records, and I was really surprised when I found<br>
>> out that Python did not have it. For example, if findall supported<br>
>> this, it would be possible to write things like this (a quick and<br>
>> dirty ifconfig parser):<br>
>><br>
>> pat = re.compile(r'\G(\S+)(.*?\n)(?=\S+|\Z)', re.S)<br>
>><br>
>> val = """<br>
>> eth2 Link encap:Ethernet HWaddr xx<br>
>> inet addr: xx.xx.xx.xx Bcast:xx.xx.xx.xx Mask:xx.xx.xx.xx<br>
>> ...<br>
>> lo Link encap:Local Loopback<br>
>> inet addr:127.0.0.1 Mask:255.0.0.0<br>
>> """<br>
>> matches = re.findall(pat, val)<br>
>><br>
>> So - why doesn't Python have this? Is it something that simply was<br>
>> overlooked, or is there another method of doing the same thing with<br>
>> arbitrarily complex freeform records?<br>
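<br>
On the "another method" question: a sketch of the same parser with today's `re`, assuming the continuation lines of the ifconfig output are indented (the indentation in `sample` is an assumption, since the whitespace of the example above did not survive the mail formatting). The names `record`, `sample` and `parse_records` are illustrative only. Each `match()` call is anchored at the end of the previous match, which is the role \G would play:<br>
<pre>
```python
import re

# One record: the interface name, then everything up to the next line that
# starts with a non-space character (or the end of the input).
record = re.compile(r'(\S+)(.*?\n)(?=\S|\Z)', re.S)

# Assumed layout: continuation lines are indented, as ifconfig prints them.
sample = (
    "eth2 Link encap:Ethernet HWaddr xx\n"
    "     inet addr:xx.xx.xx.xx Bcast:xx.xx.xx.xx Mask:xx.xx.xx.xx\n"
    "lo   Link encap:Local Loopback\n"
    "     inet addr:127.0.0.1 Mask:255.0.0.0\n"
)

def parse_records(text):
    # Returns the (name, body) pairs plus the offset where matching stopped.
    records = []
    pos = 0
    while True:
        m = record.match(text, pos)  # must match exactly at pos
        if m is None:
            break
        records.append((m.group(1), m.group(2)))
        pos = m.end()
    return records, pos

records, stopped_at = parse_records(sample)
names = [name for name, _ in records]
```
</pre>
If the loop stops before `len(text)`, the returned offset shows where the input stopped looking like a record, which `findall()` on its own would silently skip over.<br>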
>><br>
>> thanks much..<br>
>> ______________________________<wbr>_________________<br>
>> Python-Dev mailing list<br>
>> <a href="mailto:Python-Dev@python.org">Python-Dev@python.org</a><br>
>> <a href="https://mail.python.org/mailman/listinfo/python-dev" rel="noreferrer" target="_blank">https://mail.python.org/<wbr>mailman/listinfo/python-dev</a><br>
>> Unsubscribe:<br>
>> <a href="https://mail.python.org/mailman/options/python-dev/guido%40python.org" rel="noreferrer" target="_blank">https://mail.python.org/<wbr>mailman/options/python-dev/<wbr>guido%40python.org</a><br>
><br>
><br>
><br>
><br>
> --<br>
> --Guido van Rossum (<a href="http://python.org/~guido" rel="noreferrer" target="_blank">python.org/~guido</a>)<br>
><br>
</div></div>
><br>
</blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature">--Guido van Rossum (<a href="http://python.org/~guido" target="_blank">python.org/~guido</a>)</div>
</div>