Algorithm for finding and replacing URLs with hyperlinks

sismex01 at hebmex.com
Wed Feb 19 14:56:51 EST 2003


> From: Morten W. Petersen [mailto:morten at nidelven-it.no]
> Sent: Wednesday, February 19, 2003 1:48 PM
> 
> Hi!
> 
> Does anyone have a snippet of code that'll replace for example 
> http://www.python.org with <a href="http://www.python.org"
>  >http://www.python.org</a>?
> 
> Thanks,
> 
> Morten
> 

First, you need a (simple?) regex to find URL-like strings.
This'll do in a pinch, just add protocols:

>>> import re
>>> protocols = ['http', 'ftp', 'mailto', 'gopher']  # extend as needed
>>> rxURL = re.compile(r'((?:%s):\S+)' % '|'.join(protocols), re.I)
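
A quick way to see what such a pattern picks up is findall().
This is only a rough sketch: it assumes the four protocols
listed above, and the sample string is made up. Note that
\S+ also swallows any punctuation stuck to the end of a URL.

```python
import re

# Protocols taken from the list above; extend as needed.
protocols = ['http', 'ftp', 'mailto', 'gopher']
rx_url = re.compile(r'((?:%s):\S+)' % '|'.join(protocols), re.I)

# Hypothetical sample text, only for illustration.
sample = 'See http://www.python.org, or write to mailto:someone@example.com.'

# With a single group in the pattern, findall() returns the group's text.
found = rx_url.findall(sample)
print(found)
```

The trailing comma and period end up inside the matches, so
you may want to trim them before building the link.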

Then, you need to attack your text with this regular
expression, replacing all instances of the found
text with its replacement.  A little function
can give you the result you want.  Note that the
callback receives a match object, not the URL string:

>>> def replacement(match):
...     url = match.group(1)
...     return '<a href="%s">%s</a>' % (url, url)

Now, we use this little function as a callback to
the regex's .subn() method, which returns a
(new_text, count) tuple.  Assume 'TEXT' is
your original text:

>>> TEXT2 = rxURL.subn(replacement, TEXT)[0]

and voila.  I think that's how it's done.
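
Putting the pieces together, something along these lines
should work as a self-contained version. It's a sketch, not
a battle-tested linkifier: the linkify() name, the trailing
punctuation trimming, and the html.escape() call are my own
additions, and it assumes the surrounding text is already
safe to emit as HTML.

```python
import re
from html import escape

# Same idea as the pattern above: a protocol, a colon, then non-whitespace.
rx_url = re.compile(r'((?:http|ftp|mailto|gopher):\S+)', re.I)

def linkify(text):
    """Wrap each URL-like string in text in an <a> tag."""
    def replacement(match):
        url = match.group(1)
        # Assumption: trailing punctuation is not part of the URL.
        stripped = url.rstrip('.,;:!?)')
        tail = url[len(stripped):]
        # Escape &, <, > and quotes so the URL is safe inside the attribute.
        safe = escape(stripped, quote=True)
        return '<a href="%s">%s</a>%s' % (safe, safe, tail)
    # sub() returns just the new string; subn() would also return the count.
    return rx_url.sub(replacement, text)

result = linkify('Visit http://www.python.org.')
print(result)
```

Using .sub() instead of .subn() avoids having to index the
result when you don't care how many replacements were made.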

HTH

-gustavo




