[Tutor] Re: Regular Expression Help

Andrei project5 at redrival.net
Tue Sep 30 07:56:29 EDT 2003


Jacob Abraham wrote:
> Dear Tutor,

Hi,

>    I am having some trouble with my regular
> expressions.
> Here is my problem.
> 1. Find the word 'Required_Text:'
> 2. Find All characters till
>     a.A new line character or
>     b.The word 'Required_'
> 
>    These were my failed attempts 

Here's a regex which matches the "Required_Text" up till the next newline or 
"Required_" (uses a positive lookahead assertion, that (?=...) stuff):

     "Required_Text: (.)*(?=\n|Required_)"

Running this on:

"""Some Text Required_Text: Some other text and
(23.32) numbers Required_Field: Some more text
Required_Text: Some other text and (23.32) numbersRequired_Field: Some more text"""

will return:

    0: "Required_Text: Some other text and"
    1: "Required_Text: Some other text and (23.32) numbers"
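
If you want to check this yourself, here is a minimal sketch using the standard re module. The variable names are mine, and I'm assuming the sample text is exactly as typed above, with no newline after the last line:

```python
import re

# Sample text from above; note there is no newline after the last line.
text = ("Some Text Required_Text: Some other text and\n"
        "(23.32) numbers Required_Field: Some more text\n"
        "Required_Text: Some other text and (23.32) numbersRequired_Field: Some more text")

# (?=...) only peeks ahead; whatever it matches is not part of the result.
pattern = re.compile(r"Required_Text: (.)*(?=\n|Required_)")

matches = [m.group(0) for m in pattern.finditer(text)]
for i, found in enumerate(matches):
    print("%d: %r" % (i, found))
```

I use finditer() rather than findall() here because findall() returns the contents of the groups instead of the whole match when the pattern contains groups.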

On this sample it suits the requirements you gave (stop at either newline or 
"Required_", whichever comes first), but it seems a bit weird that you don't 
want both matches to be the same.
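
One caveat, as far as I can tell: the * is greedy, so "whichever comes first" only holds when the line does not also end in a newline. The sketch below uses a hypothetical variation of the sample (the same line, but with a trailing newline) and compares the greedy regex with a non-greedy (.)*? version, which is my own variation and not something you asked for:

```python
import re

# Hypothetical variation of the sample: the line now ends with a newline.
line = ("Required_Text: Some other text and (23.32) "
        "numbersRequired_Field: Some more text\n")

# Greedy: grabs as much as possible, so it stops at the newline.
greedy = re.search(r"Required_Text: (.)*(?=\n|Required_)", line).group(0)

# Non-greedy: grabs as little as possible, so it stops at the first "Required_".
lazy = re.search(r"Required_Text: (.)*?(?=\n|Required_)", line).group(0)

print(repr(greedy))
print(repr(lazy))
```

If you really want the first of the two stop conditions in every case, the non-greedy form is probably the safer choice.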

You could also slightly modify the regex to:

     "Required_Text: ((.)*(?=\n|Required_))"

This forms a group (with index 1) containing the stuff that is required.
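
A small sketch of how you might pull that group out (the short test string and the names are just mine, for illustration):

```python
import re

# Shortened, made-up test string; only the first line matters here.
text = "Some Text Required_Text: Some other text and\nnext line"

pattern = re.compile(r"Required_Text: ((.)*(?=\n|Required_))")
m = pattern.search(text)

whole = m.group(0)    # the entire match, including "Required_Text: "
wanted = m.group(1)   # only the text after "Required_Text: "
last = m.group(2)     # the inner (.) keeps just the last character it matched

print(repr(whole))
print(repr(wanted))
print(repr(last))
```

Group 2 is a side effect of the inner (.) and is probably not useful to you; writing .* without the inner parentheses would avoid it.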

And you could build on that:

     "Required_Text: ((.)*(?=\n|Required_))(Required_){0,1}"

which returns, instead of the previous match 1:

     1: "Required_Text: Some other text and (23.32) numbersRequired_"
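
Again as a rough sketch (my own names, and assuming the sample text exactly as typed earlier, where the third line has no space before "Required_Field"):

```python
import re

text = ("Some Text Required_Text: Some other text and\n"
        "(23.32) numbers Required_Field: Some more text\n"
        "Required_Text: Some other text and (23.32) numbersRequired_Field: Some more text")

# The trailing (Required_){0,1} consumes the delimiter when it is there.
pattern = re.compile(r"Required_Text: ((.)*(?=\n|Required_))(Required_){0,1}")

results = [(m.group(0), m.group(1)) for m in pattern.finditer(text)]
for i, (whole, wanted) in enumerate(results):
    print("%d: %r (group 1: %r)" % (i, whole, wanted))
```

Match 0 is unchanged because it stops at a newline, so the optional group matches nothing there. Group 1 still holds just the wanted text in both cases.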

FWIW, Spe (http://spe.pycs.net) includes a plugin called Kiki (written by yours 
truly :)) which visualizes the results of regexes (groups and the like; at 
least, if he's already included my latest version in the download). You could 
also look into the Tkinter-based recon.py (not sure where you can get it, but 
Vaults of Parnassus or Google are your friends) if you'd rather avoid 
wxPython-based solutions, but it's not quite as featureful.

-- 
Yours,

Andrei

=====
Mail address in header catches spam. Real contact info (decode with rot13):
cebwrpg5 at bcrenznvy.pbz. Fcnz-serr! Cyrnfr qb abg hfr va choyvp cbfgf. V ernq gur 
yvfg, fb gurer'f ab arrq gb PP.




