[Patches] [ python-Patches-572936 ] (?(id/name)yes|no) re implementation

SourceForge.net noreply@sourceforge.net
Fri, 13 Jun 2003 22:15:35 -0700


Patches item #572936, was opened at 2002-06-24 03:41
Message generated for change (Comment added) made by loewis
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=572936&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Gustavo Niemeyer (niemeyer)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: (?(id/name)yes|no) re implementation

Initial Comment:
This patch implements a regular expression feature that allows
some interesting patterns, in the same way as implemented in Perl.
For example, (?(1)yes|no) tries to match "yes" if group 1 took part
in the match, and "no" if it didn't. Without this feature, the regular
expression must be duplicated to get the same results. In addition to
Perl's feature, it will also accept a Python named group as argument.
   
Here's an example:   
   
(<)?\w+@\w+(\.\w+)+(?(1)>)   
  
This is a poor email matching regular expression, which will match   
with or without the "<>" symbols.   
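
For readers on a current Python, where the re module accepts this
syntax, the example can be tried as sketched below. The helper name
`accepts` and the sample addresses are made up for illustration, and
`fullmatch` is used so that leftover characters count as a failure.

```python
import re

# The pattern from the comment above, unchanged.
EMAIL = re.compile(r"(<)?\w+@\w+(\.\w+)+(?(1)>)")

# Same pattern, with a named group as the condition (hypothetical name "open").
EMAIL_NAMED = re.compile(r"(?P<open><)?\w+@\w+(\.\w+)+(?(open)>)")


def accepts(pattern, text):
    """Return True if the whole of text matches pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    samples = ("<user@host.com>", "user@host.com",
               "<user@host.com", "user@host.com>")
    for sample in samples:
        print(sample, accepts(EMAIL, sample), accepts(EMAIL_NAMED, sample))
```

The closing ">" is only required when the opening "<" was matched, so
the two unbalanced samples should be rejected.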
   

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-06-14 07:15

Message:
Logged In: YES 
user_id=21627

Please don't apply the patch before 2.3; this is in beta
now, so no new features are allowed (unless you get BDFL
permission, of course).

----------------------------------------------------------------------

Comment By: Gustavo Niemeyer (niemeyer)
Date: 2003-06-14 05:52

Message:
Logged In: YES 
user_id=7887

Martin, I've checked your concern about making "(X)|(?(1)Y)"
an error, and unfortunately the current framework doesn't
keep enough state information to catch this. Note that
no such check is made in very similar cases either, like
"(X)|\1", which does exactly the same thing as "(X)|(?(1)X)".

I'll be applying that patch as soon as I check it against
the current HEAD, and implement some tests (and before it
completes its first year of life 8-).

Thanks!


----------------------------------------------------------------------

Comment By: Gustavo Niemeyer (niemeyer)
Date: 2003-04-21 00:09

Message:
Logged In: YES 
user_id=7887

I see. I'll try to improve the patch with your suggestions
as soon as I get some time to work on it. Thanks for your
support.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-04-20 23:57

Message:
Logged In: YES 
user_id=21627

Exactly: My example makes no sense; it will always be false,
since the reference is to an alternative that cannot
simultaneously be taken. Therefore, I think this should be
an error.

----------------------------------------------------------------------

Comment By: Gustavo Niemeyer (niemeyer)
Date: 2003-04-20 10:12

Message:
Logged In: YES 
user_id=7887

About the test cases, they're missing indeed. I can write
some while applying the patch.

About being experimental, IIRC, it has been listed as
experimental in the Perl documentation for several years,
and will probably stay like this forever. :-) Anyway, IMO
this shouldn't affect our evaluation of the importance of
that feature for Python's sre.

About the semantic restriction, do you mean checking that the
backreference number is lower than the current group's? Should
be doable. OTOH, I don't understand your example. In
"(X)|(?(1)Y)", there's no sense in using (?(1), as it will
always be false.
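
The point about the condition always being false can be checked with
a short sketch. It assumes a Python whose re module compiles this
pattern without complaint; the helper name `matched_text` is made up
for illustration.

```python
import re

# Group 1 lives in the first alternative, the condition in the second,
# so whenever the second alternative is tried, group 1 is unset.
PATTERN = re.compile(r"(X)|(?(1)Y)")


def matched_text(text):
    """Return the text matched by a full match, or None if there is none."""
    m = PATTERN.fullmatch(text)
    return None if m is None else m.group(0)


if __name__ == "__main__":
    print(matched_text("X"))  # first alternative
    print(matched_text("Y"))  # the "Y" branch is never reachable
```

Because the condition can never hold, "Y" is never consumed by the
second alternative.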


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-04-20 08:23

Message:
Logged In: YES 
user_id=21627

I like the patch in principle, but I have a number of
additional concerns:
- there are no test cases
- the feature is declared experimental in perlre(1). Why?
- Shouldn't there be a semantic restriction that the back
  reference is only allowed if it points to a group that is
  known to precede it? I.e. is

  (X)|(?(1)Y)

  valid? If not, the restriction should at least be documented,
  but if possible, it should also be implemented.

----------------------------------------------------------------------

Comment By: Gustavo Niemeyer (niemeyer)
Date: 2003-04-19 23:30

Message:
Logged In: YES 
user_id=7887

That patch has been around for a long time. Should I work on it,
fixing that problem, and apply it? Do you agree with the
feature inclusion?

I remember that the main reason for implementing this is
that it is hard to achieve the same results without it.
You have to write the whole match twice inside an or'ed group
(e.g. "(<... match email ...>|... match email ...)").
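
A minimal sketch of that trade-off, assuming a Python where the
conditional syntax is available. The names `ADDR`, `DUPLICATED`,
`CONDITIONAL` and `both_agree` are made up for illustration, and the
address sub-pattern is the poor one from the initial comment.

```python
import re

# Illustrative address sub-pattern, taken from the initial comment.
ADDR = r"\w+@\w+(?:\.\w+)+"

# Without the feature: the address pattern appears twice.
DUPLICATED = re.compile(r"<" + ADDR + r">|" + ADDR)

# With the feature: the address pattern appears once.
CONDITIONAL = re.compile(r"(<)?" + ADDR + r"(?(1)>)")


def both_agree(text):
    """Return True if both patterns accept, or both reject, the whole text."""
    dup = DUPLICATED.fullmatch(text) is not None
    cond = CONDITIONAL.fullmatch(text) is not None
    return dup == cond


if __name__ == "__main__":
    for sample in ("<a@b.c>", "a@b.c", "<a@b.c", "a@b.c>"):
        print(sample, both_agree(sample))
```

The duplication grows with the size of the sub-pattern, which is the
cost the conditional form avoids.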


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-04-19 10:50

Message:
Logged In: YES 
user_id=21627

If you add new opcodes, you should also change SRE_MAGIC.

----------------------------------------------------------------------
