regular expression

courtneyb courtneyb at big-c.com
Thu Feb 24 10:12:33 EST 2000


I am writing a web interface.  I need to fetch a web page and extract 
the content between the <pre> and </pre> tags.  I figured I would use 
regular expressions.  Below is the code:

#!/usr/bin/python

import httplib
import re

def getwebpage(HOST, PAGE):
        h = httplib.HTTP(HOST)
        h.putrequest('GET', PAGE)
        h.putheader('Accept', 'text/html')
        h.putheader('Accept', 'text/plain')
        h.endheaders()
        errcode, errmsg, headers = h.getreply()
        f = h.getfile()
        data = f.read() # Get the raw HTML
        f.close()
        if errcode == 200:
                pattern = re.compile(r'<pre>(.*?)<\/pre>',re.S|re.I)
                result = pattern.search(data)
                return result
        else:
                # errcode is an integer, so convert it before concatenating
                return 'There was an error. CODE:' + str(errcode)
                
                
print getwebpage('cristal.inria.fr', '/bin/ecdldb?m=T;v=University+of+Kentucky;notab=on')


After running this I get...<re.MatchObject instance at 80f0ff0>

How do I return the content between the pre tags?
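
For what it's worth, here is a minimal sketch of one way to get at the 
text.  search() returns a match object (or None when nothing matches), 
and calling group(1) on that object gives back whatever the first 
parenthesised group captured.  The helper name extract_pre and the 
sample HTML are made up for illustration; they are not part of the 
script above, and no page is actually fetched.

```python
import re

# Same idea as the pattern in the post: re.S lets '.' match newlines,
# re.I makes the tag names case-insensitive.
PRE_PATTERN = re.compile(r'<pre>(.*?)</pre>', re.S | re.I)

def extract_pre(html):
    """Return the text inside the first <pre>...</pre> pair, or None."""
    match = PRE_PATTERN.search(html)
    if match is None:
        # No <pre> block found; calling group() on None would fail.
        return None
    # group(0) is the whole match including the tags;
    # group(1) is only what the parentheses captured.
    return match.group(1)

# Hypothetical sample input, standing in for the downloaded page.
sample = "<html><body><PRE>line one\nline two</PRE></body></html>"
content = extract_pre(sample)
```

In the posted function this would presumably amount to returning 
result.group(1) instead of result, after checking that result is not 
None.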

-- 
*************************************
courtneyb
courtneyb at big-c.com


