Where can i reference the "regular expressions"

Andre Engels andreengels at gmail.com
Wed Mar 24 05:50:37 EDT 2010


On Wed, Mar 24, 2010 at 10:34 AM, John Smithury <joho.smithury at gmail.com> wrote:
> ==============source============
> <line>the</line>
> <line>is</line>
> <line>name</line>
> ==============source end=========
>
> First, get only the word (discarding the "<line>" and "</line>"); a
> regular expression can do that, right?
>
> the
> is
> name
> Second, get each character in each word and compose them in a format like {'t','h','e'}
> >>> for a in line
>
>
> Most important is learning "regular expressions" via this example.

Okay, then I'll go into that part.

import re
regex = re.compile(r"<line>([^<>]*)</line>")

[^<>] here means "any character except < or >".
* means any number (zero or more) of such characters.
The parentheses mark the part of the expression we are
interested in (the group).
The expression as a whole thus means:
first <line>, then the part we are interested in, which is an arbitrary
string of characters that are not < or >, then </line>.
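
As a small illustration of the pieces described above, here is a sketch
that tries the pattern on a few strings. The sample strings and the
variable names are made up for this illustration; only the pattern itself
comes from the message.

```python
import re

regex = re.compile(r"<line>([^<>]*)</line>")

# A string that fits the pattern: group(1) is the part inside the parentheses.
match = regex.search("<line>the</line>")
inside = match.group(1)

# '*' allows zero characters, so an empty element still matches.
empty = regex.search("<line></line>").group(1)

# A '<' between the tags is excluded by [^<>], so this does not match.
nested = regex.search("<line>a<b>c</line>")

print(inside, repr(empty), nested)
```

Here search() returns None when nothing matches, so the last case is
worth checking before calling group() on the result.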

To use this expression (assuming 'text' is the string you want to check):

result = regex.findall(text)

will find all occurrences of the regular expression and return a list
with the content of the group for each one.
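
Applied to the sample from the question, that could look like the sketch
below. The names 'text', 'words' and 'letters' are chosen here for
illustration, and splitting each word with list() is only one possible
reading of the second step in the question.

```python
import re

# The sample source from the question, as one multi-line string.
text = """<line>the</line>
<line>is</line>
<line>name</line>"""

regex = re.compile(r"<line>([^<>]*)</line>")

# With a single group, findall returns a list of the group contents.
words = regex.findall(text)
print(words)    # ['the', 'is', 'name']

# Second step: split each word into its characters.
letters = [list(word) for word in words]
print(letters)  # [['t', 'h', 'e'], ['i', 's'], ['n', 'a', 'm', 'e']]
```

If a set such as {'t','h','e'} is really what is wanted, set(word) could
replace list(word), at the cost of losing order and duplicate letters.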


-- 
André Engels, andreengels at gmail.com


