Parsing html links
Fredrik Lundh
fredrik at effbot.org
Wed Jan 24 06:02:21 EST 2001
"Joonas" wrote:
> How can I make a script in Python that changes all
> 'http://myaddress.spam ' into '<a
> href="http://myaddress.spam">http://myaddress.spam</a>'
recently seen on comp.lang.python:
Subject: Re: URL replacement in text
Date: Tue, 16 Jan 2001
Steve Holden wrote:
> Note that this won't cope with some of the more pathological URLs (such as
> those with CGI arguments [http://system/cgi?arg1=val1] or those which link
> to a named anchor in the target page [http://system/pageref#target-name]).
import re

# match http, ftp and gopher URLs, including query strings and fragments
links = re.compile(r"(?i)(?:http|ftp|gopher):[\w.~/%#?&=-]+")

def fixlink(m):
    # wrap the matched URL in an anchor element
    href = m.group(0)
    return "<a href='%s'>%s</a>" % (href, href)

sometext = """
here's another link: http://www.pythonware.com?FOO=1&bar=10#BAR
"""

print(links.sub(fixlink, sometext))
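
As a quick check, here is a minimal sketch (written for Python 3, which postdates this post) that runs the same pattern over the two "pathological" URLs from Steve's note. It also tries a variant of the callback that HTML-escapes the URL before inserting it, so the "&" in a query string comes out as "&amp;". The name fixlink_escaped and the use of html.escape are assumptions added for illustration; they are not part of the original code.

```python
import re
from html import escape

# Same pattern as in the post above.
links = re.compile(r"(?i)(?:http|ftp|gopher):[\w.~/%#?&=-]+")

def fixlink_escaped(m):
    # Hypothetical variant of fixlink: escape &, <, > and quotes so the
    # URL is safe both inside the href attribute and as element text.
    href = escape(m.group(0), quote=True)
    return "<a href='%s'>%s</a>" % (href, href)

# The two URL shapes mentioned in the quoted note.
note = "[http://system/cgi?arg1=val1] or [http://system/pageref#target-name]"
found = links.findall(note)
print(found)

sometext = "here's another link: http://www.pythonware.com?FOO=1&bar=10#BAR"
result = links.sub(fixlink_escaped, sometext)
print(result)
```

The pattern stops at the closing bracket because "]" is not in the character class. It will still swallow a trailing "." or "-" that directly follows a URL, so text where links end a sentence may need extra handling.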
Cheers /F