Help with parsing web page

wes weston wweston at att.net
Tue Jun 15 18:32:12 CEST 2004


RiGGa wrote:
> Hi,
> 
> I want to parse a web page in Python and have it write certain values out to
> a MySQL database.  I really don't know where to start with parsing the HTML
> code (I can work out the database part).  I have had a look at htmllib
> but I need more info.  Can anyone point me in the right direction?  A
> tutorial or something would be great.
> 
> Many thanks
> 
> RiGga
> 

RiGga,
    Here is something that is, hopefully, not too simple. Frequently, you can
strip out the HTML tags, and the resulting list of text pieces will have a
label followed by the piece of data you want to save.
    Do you need MySQL code?
wes



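def RemoveLessThanGreaterThanSectionsTokenize( s ):
     """Drop every <...> section of s; return the text pieces in between."""
     state  = 0    # 0 = grabbing text, 1 = skipping the inside of a tag
     text   = ""   # text collected since the last tag
     pieces = []   # finished text pieces, in page order
     for ch in s:
         #grabbing good chars state
         if state == 0:
             if ch == '<':
                 state = 1
                 if len(text) > 0:
                     pieces.append(text)
                     text = ""
             else:
                 text += ch
         #dumping bad chars state
         elif state == 1: # looking for '>'
             if ch == '>':
                 state = 0
     # keep any text that follows the last tag
     if len(text) > 0:
         pieces.append(text)
     return pieces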
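
As a rough sketch of the "label followed by the piece of data" idea, here is
one way it could look with the standard library parser instead of a
hand-rolled state machine. It assumes Python 3, where html.parser takes the
place of htmllib. The names TextCollector and value_after_label, and the page
fragment itself, are made up for illustration.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the non-blank text found between tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.pieces = []

    def handle_data(self, data):
        # Called by HTMLParser for text outside of tags.
        text = data.strip()
        if text:
            self.pieces.append(text)


def value_after_label(pieces, label):
    """Return the piece that directly follows label, or None if there is none."""
    for i, piece in enumerate(pieces[:-1]):
        if piece == label:
            return pieces[i + 1]
    return None


# Made-up page fragment, for illustration only.
page = ("<table><tr><td>Price:</td><td>42.50</td></tr>"
        "<tr><td>Stock:</td><td>7</td></tr></table>")

collector = TextCollector()
collector.feed(page)
collector.close()
pieces = collector.pieces

price = value_after_label(pieces, "Price:")
print(pieces)
print(price)
```

Whitespace-only pieces are dropped here so that a label and its value end up
next to each other in the list; whether that holds for a real page depends on
how the page is laid out.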



