[Tutor] Re: Need help with multi-line regex identification
Jorge Godoy
godoy at ieee.org
Wed Apr 21 07:13:02 EDT 2004
On Wed 21 Apr 2004 03:54, Tony Cappellini wrote:
> Could someone help point me in the right direction for this ?
In Perl we used to use multiline/extended matching for that.
Either we made the line terminator irrelevant and handled everything as if
it were on one line (something like that, but not exactly that), or we
used some extensions on grep (you can read Perl's documentation on regular
expressions for that: perldoc perlre). It ended up along the lines of
what's in this part of the docs:
---------------------------------------------------------------------
    m   Treat string as multiple lines. That is, change "^" and "$" from
        matching the start or end of the string to matching the start or
        end of any line anywhere within the string.

    s   Treat string as single line. That is, change "." to match any
        character whatsoever, even a newline, which normally it would not
        match.

        The "/s" and "/m" modifiers both override the $* setting. That
        is, no matter what $* contains, "/s" without "/m" will force "^"
        to match only at the beginning of the string and "$" to match
        only at the end (or just before a newline at the end) of the
        string. Together, as /ms, they let the "." match any character
        whatsoever, while still allowing "^" and "$" to match,
        respectively, just after and just before newlines within the
        string.
---------------------------------------------------------------------
In other words, the regexp became "/<something>/ms".
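Since you're working in Python: as far as I know, the re module has the same
two switches, spelled re.MULTILINE (like /m) and re.DOTALL (like /s). Here is
a minimal sketch of what each one changes. The sample text and variable names
are made up for illustration; they are not from your data.

```python
import re

# Made-up sample text, three lines long.
text = "alpha 1\nbeta 2\ngamma 3"

# Without re.MULTILINE, "^" matches only at the start of the whole string.
first_only = re.findall(r"^\w+", text)

# With re.MULTILINE (Perl's /m), "^" also matches right after each newline.
each_line = re.findall(r"^\w+", text, re.MULTILINE)

# Without re.DOTALL, "." stops at the first newline.
one_line = re.search(r"alpha.*", text).group()

# With re.DOTALL (Perl's /s), "." matches newlines too.
all_lines = re.search(r"alpha.*", text, re.DOTALL).group()

# Both together, like Perl's /ms: "." crosses lines while "^" still
# anchors at the start of an inner line.
block = re.search(r"^beta.*?^gamma", text, re.MULTILINE | re.DOTALL).group()

print(first_only)
print(each_line)
print(repr(one_line))
print(repr(all_lines))
print(repr(block))
```

The same flags can be written inline at the start of the pattern as "(?m)" and
"(?s)" if you prefer to keep them with the regexp itself.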
Even there, where regexps are highly recommended for lots of things, we run
into problems with this kind of matching, and there's a "more recommended"
approach: using a lexical parser.
So this is what I'm asking you: wouldn't it be a lot easier for you to
change the language, expand it, and also parse it if you had a parser for
it?
A quick search on Google ("python lexical parser") turned up this:
http://christophe.delord.free.fr/en/tpg/
There's also the parser-sig (whose page links to
http://www.python.org/topics/parsing.html) where you can get other options
if the above doesn't satisfy your needs.
As for Plex, there's even an example where the author handles
"comments" in code. It might interest you more. The docs are at
http://www.cosc.canterbury.ac.nz/~greg/python/Plex/version/doc/index.html
and the page where the parser-sig points to is at
http://www.cosc.canterbury.ac.nz/~greg/python/Plex/
Take a look at the other ones too... And use a parser. It will be better and
easier, IMHO.
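To give a rough idea of what a scanner buys you, here is a minimal sketch
using only the standard re module. It is not Plex's or TPG's API, and the
token names and the toy language (names, numbers, a few operators, /* ... */
comments) are assumptions made up for illustration, not your actual format.
The point is that a multi-line comment becomes just one more token type
instead of a special case in one big regexp.

```python
import re

# Hypothetical token set for a toy language; adjust to the real one.
# Order matters: COMMENT must come before OP so "/*" is not read as "/".
TOKEN_SPEC = [
    ("COMMENT", r"/\*.*?\*/"),    # block comment, may span several lines
    ("NUMBER",  r"\d+"),
    ("NAME",    r"[A-Za-z_]\w*"),
    ("OP",      r"[=+\-*/;]"),
    ("SKIP",    r"\s+"),          # whitespace, including newlines
    ("ERROR",   r"."),            # anything else is a scanning error
]

# One master pattern with a named group per token type. re.DOTALL lets
# the "." inside COMMENT cross line boundaries.
MASTER = re.compile(
    "|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC),
    re.DOTALL,
)


def tokenize(source):
    """Return a list of (kind, text) pairs, dropping whitespace and comments."""
    tokens = []
    for match in MASTER.finditer(source):
        kind = match.lastgroup
        if kind in ("SKIP", "COMMENT"):
            continue
        if kind == "ERROR":
            raise ValueError(
                "unexpected character %r at offset %d"
                % (match.group(), match.start())
            )
        tokens.append((kind, match.group()))
    return tokens


sample = "x = 1; /* a comment\n   over two lines */ y = x + 2;"
for token in tokenize(sample):
    print(token)
```

Adding a new construct to the language then means adding one line to the
token table, which is the kind of flexibility I had in mind above.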
Be seeing you,
--
Godoy. <godoy at ieee.org>