regular expression
courtneyb
courtneyb at big-c.com
Thu Feb 24 10:12:33 EST 2000
I am writing a web interface. I need to fetch a web page and extract
the content between the <pre> and </pre> tags. I figured I would use
regular expressions; the code is below:
#!/usr/bin/python
import httplib
import re

def getwebpage(HOST, PAGE):
    h = httplib.HTTP(HOST)
    h.putrequest('GET', PAGE)
    h.putheader('Accept', 'text/html')
    h.putheader('Accept', 'text/plain')
    h.endheaders()
    errcode, errmsg, headers = h.getreply()
    f = h.getfile()
    data = f.read()  # Get the raw HTML
    f.close()
    if errcode == 200:
        # Non-greedy, case-insensitive; re.S lets '.' match newlines too
        pattern = re.compile(r'<pre>(.*?)<\/pre>', re.S|re.I)
        result = pattern.search(data)
        return result
    else:
        # errcode is an integer, so convert it before concatenating
        return 'There was an error. CODE:' + str(errcode)

print getwebpage('cristal.inria.fr',
                 '/bin/ecdldb?m=T;v=University+of+Kentucky;notab=on')
After running this I get <re.MatchObject instance at 80f0ff0> instead
of the text.
How do I return the content between the pre tags?
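
For reference, search() returns a match object rather than a string, and
the text captured by the parentheses is available through the object's
group() method. Below is a minimal sketch of that step on its own. The
helper name extract_pre and the sample HTML are made up for illustration,
and it works on a literal string so it runs without a network connection.

```python
import re

# Same idea as the pattern in the post; the backslash before "/" is optional.
PRE_PATTERN = re.compile(r'<pre>(.*?)</pre>', re.S | re.I)

def extract_pre(html):
    """Return the text inside the first <pre>...</pre> pair, or None."""
    match = PRE_PATTERN.search(html)
    if match is None:
        # search() returns None when the page has no <pre> block
        return None
    # group(1) is the text captured by the first parenthesised group
    return match.group(1)

# Hypothetical sample page, standing in for the downloaded data
sample = "<html><body><PRE>line one\nline two</PRE></body></html>"
print(extract_pre(sample))
```

In getwebpage, the equivalent change would presumably be to return
result.group(1) after checking that result is not None.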
--
*************************************
courtneyb
courtneyb at big-c.com