Parsing HTML links

Joonas keisari_ at hotmail.com
Wed Jan 24 05:48:23 EST 2001


How can I make a script in Python that changes all
'http://myaddress.spam ' into '<a
href="http://myaddress.spam">http://myaddress.spam</a>'?

I have made a script that works, but it only changes the first address.

If it helps, in Perl it would be:
$msg =~ s/(\s)(http:\/\/\S+)(\s)/$1<a href=\"$2\">$2<\/a>$3/ig;


###############link.py################
#! /usr/local/bin/python
import sys, regex, string

link_pattern = regex.compile('\(.*[\n ]\)\(http:\/\/[a-zA-Z0-9_/\.]+\)\([\n ]\).*')

line = """this is the first line
http://myaddr.spam
and python's http://www.python.org
the last line"""

line = " "+line+" "
if link_pattern.match(line) >= 0:
  regs = link_pattern.regs   # start/stop indexes
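  a,  b  = regs[2]           # bounds of the address (osoite)
  a2, b2 = regs[1]           # first gap: everything up to the address
  a3, b3 = regs[3]           # second gap: whitespace after the address
  osoite   = line[a:b]       # take the slice holding the address
  alku     = line[:b2]       # alku = start: text before the address
  loppu    = line[a3:]       # loppu = end: text after the address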
  line = '"%s<a href="%s">%s</a>%s"' % (alku,osoite,osoite,loppu)

print line
#############link.py######################
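
For comparison, here is a minimal sketch of the "change every address" behaviour, using the standard `re` module instead of `regex`. It is only one way to do it, and it makes some assumptions: an address is taken to be "http://" followed by any run of non-whitespace characters, it must start the text or follow whitespace, and nothing is HTML-escaped. The names `LINK_PATTERN` and `linkify` are made up for this example, and `print(...)` is written as a function call, so it assumes a newer Python than the script above.

```python
import re

# Assumption: an address is "http://" plus any run of non-whitespace.
# (?<!\S) only allows a match at the start of the text or after
# whitespace, without consuming that whitespace.
LINK_PATTERN = re.compile(r'(?<!\S)(http://\S+)', re.IGNORECASE)

def linkify(text):
    """Wrap every http:// address found in text in an <a href="..."> tag."""
    # sub() replaces all non-overlapping matches, not just the first one.
    return LINK_PATTERN.sub(r'<a href="\1">\1</a>', text)

line = """this is the first line
http://myaddr.spam
and python's http://www.python.org
the last line"""

print(linkify(line))
```

Because the pattern does not consume the surrounding whitespace, the padding step `line = " "+line+" "` should not be needed here, and two addresses separated by a single space are both converted.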




Joonas.


