Perl Range operator paradigm

Magnus Lyckå magnus at thinkware.se
Wed Oct 24 12:58:41 EDT 2001


On Tue, 16 Oct 2001 12:27:26 +0200, Laurent Pierron <Laurent.Pierron at loria.fr> wrote:
> Hello Gurus,
> 
> In Perl there is a very powerful range operator '..', which, in scalar
> context, is acting like line-range (,) operator in sed and awk.
> 
> By example, this small program prints all the lines in the file included
> between <BODY>...</BODY> :
> 
> while (<>) {
>    if (/<BODY>/ .. /<\/BODY>/) {print; }
> }
> 
> How can I do, the same thing in Python ?

If I understand this correctly, the effect is much like a regular
expression: /something/ .. /otherthing/ selects everything from
the line matching something through the next line matching
otherthing, rather like a non-greedy match.

A Python regular expression acts on a string, and I don't think
it can be contorted into keeping state while it's being
fed a line at a time. But hey, we have virtual memory, right?
We can simply read the whole input into one string.
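
For comparison, line-at-a-time processing does work if the state is
kept in an ordinary variable instead of inside the regular expression.
This is only a minimal sketch, assuming the Perl snippet prints every
line from one that matches <BODY> through the next one that matches
</BODY>. The helper name between and the sample data are made up for
illustration.

```python
import re

def between(lines, start, stop):
    """Yield each line from one matching `start` through the next
    one matching `stop`, inclusive (hypothetical helper)."""
    inside = False
    for line in lines:
        # Switch on when the start pattern is seen.
        if not inside and start.search(line):
            inside = True
        if inside:
            yield line
            # Switch off after the line containing the stop pattern;
            # this is also checked on the line that switched us on.
            if stop.search(line):
                inside = False

# Made-up sample input standing in for the lines of a file.
sample = ["<HTML>\n", "<BODY>\n", "hello\n", "</BODY>\n", "</HTML>\n"]
selected = list(between(sample,
                        re.compile(r"<BODY>"),
                        re.compile(r"</BODY>")))
```

Passing sys.stdin instead of the sample list would process standard
input one line at a time, without reading it all into memory.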

Regular expressions are not a language element in Python but
a library, like so much else, so we start with

import re, sys

Apart from re (regular expressions) we need sys
to access stdin. (Perl's <> also reads files named on the
command line; sys.stdin only covers the standard-input case.)
Then we construct our regular expression:

bodymatch = re.compile(r'<body>(.*?)</body>', re.S | re.I)

I'm not sure this finds exactly what the Perl script finds.
I'm not including the <body> and </body> tags; that
would require you to write r'(<body>.*?</body>)'
instead. re.S means that dot (.) also matches
line breaks. re.I means ignore case, which is probably
what you want with HTML.
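
To see what the two patterns and the two flags do, here is a small
sketch on a made-up string. The sample HTML and the variable names
are only for illustration.

```python
import re

# Made-up sample input; note the upper-case tags and the line breaks.
html = "<html><BODY>\nHello\n</BODY></html>"

# The group excludes the tags.
inner = re.compile(r'<body>(.*?)</body>', re.S | re.I)
# The group includes the tags.
outer = re.compile(r'(<body>.*?</body>)', re.S | re.I)
# Without re.S the dot stops at line breaks.
no_dotall = re.compile(r'<body>(.*?)</body>', re.I)

inner_hits = inner.findall(html)
outer_hits = outer.findall(html)
no_dotall_hits = no_dotall.findall(html)
```

Here inner_hits holds the text between the tags, outer_hits holds the
same text with the tags around it, and no_dotall_hits is empty because
without re.S the dot cannot cross the line breaks.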

# Each match is the whole text between the tags, not a single line.
for body in bodymatch.findall(sys.stdin.read()):
    print(body)

So, instead of three lines of code, you get four, and it
looks more complicated (if you know Perl). On the
other hand, if you had started with a Python example
and asked "how do I do this in Perl", it might have turned
out the same way.

I don't know this Perl concept, and it may be that
these programs don't do quite the same thing, but I
think that's just a matter of details in the regular expression.



