[Tutor] Re: Regex
Andrei
project5 at redrival.net
Mon Aug 25 21:01:58 EDT 2003
Thanks, it *almost* helps, but I'm not trying to harvest the links. The
issue is that I do *not* want to get URLs if they're in between <a>
tags, nor if they're an attribute to some tag (img, a, link, whatever).
Perhaps I should have explained my goal more clearly: I wish to take a
piece of text which may or may not contain HTML tags and turn any URL
in it which is NOT already a link into a link. E.g.:
go to <a href="http://home.com">http://home.com</a>. [1]
go <a href="http://home.com">home</a>. [2]
should remain unmodified, but
go to http://home.com [3]
should be turned into [1]. That negative lookbehind can do the job in
the large majority of cases (by not matching URLs that are preceded by
a single or double quote or by ">"), but not always, because the re
module requires a lookbehind to be fixed-length, so it can't look back
over an arbitrary stretch of text to an opening tag. I think one of the
parser modules might be able to help (?), but no matter how much I
try, I can't get the hang of them, while I do somewhat understand regexes.
Andrei
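
One regex-only way around the fixed-length lookbehind limit is to avoid
looking behind at all: let the pattern match whole <a>...</a> elements
and whole tags as alternatives, and capture a URL only when neither of
those matched first. What follows is a minimal sketch under stated
assumptions, not a tested solution to the original problem: the names
PATTERN and linkify are made up for illustration, only http:// and
https:// URLs are recognised, and tags are assumed to be simple (no ">"
inside attribute values, no comments or script blocks).

```python
import re

# Alternation order matters: whole <a>...</a> elements and any other tag are
# consumed first, so URLs inside them never reach the capturing group.
PATTERN = re.compile(
    r"""<a\b[^>]*>.*?</a>      # an entire anchor element, left untouched
      | <[^>]*>                # any other tag, including its attributes
      | (https?://[^\s<>"']+)  # a bare URL (group 1)
    """,
    re.IGNORECASE | re.DOTALL | re.VERBOSE,
)


def linkify(text):
    """Wrap bare URLs in <a> tags; leave existing tags and links alone."""

    def replace(match):
        url = match.group(1)
        if url is None:
            # An anchor element or some other tag matched: return it unchanged.
            return match.group(0)
        # Assumption: trailing punctuation belongs to the sentence, not the URL.
        stripped = url.rstrip(".,;:!?)")
        tail = url[len(stripped):]
        return '<a href="%s">%s</a>%s' % (stripped, stripped, tail)

    return PATTERN.sub(replace, text)


if __name__ == "__main__":
    for sample in (
        'go to <a href="http://home.com">http://home.com</a>. [1]',
        'go <a href="http://home.com">home</a>. [2]',
        'go to http://home.com [3]',
    ):
        print(linkify(sample))
```

Because re.sub scans left to right and never rescans text it has already
consumed, a URL inside an attribute or between <a> tags is swallowed by
one of the first two alternatives and passed through untouched. Markup
that breaks the simple-tag assumption would still need a real parser.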