Help with parsing web page
wes weston
wweston at att.net
Tue Jun 15 12:32:12 EDT 2004
RiGGa wrote:
> Hi,
>
> I want to parse a web page in Python and have it write certain values out to
> a MySQL database. I really don't know where to start with parsing the HTML
> code (I can work out the database part). I have had a look at htmllib
> but I need more info. Can anyone point me in the right direction? A
> tutorial or something would be great.
>
> Many thanks
>
> RiGga
>
RiGga,
If you want something that is, hopefully, not too simple: frequently, you can
strip out the HTML tags, and the resulting list will have a label followed by
the piece of data you want to save.
Do you need MySQL code?
wes
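def RemoveLessThanGreaterThanSectionsTokenize( s ):
    """Strip <...> sections from s and return the text chunks between them."""
    state = 0      # 0 = grabbing text, 1 = skipping a tag
    text = ""      # renamed from str/list to avoid shadowing the builtins
    tokens = []
    for ch in s:
        # grabbing good chars state
        if state == 0:  # s is expected to start with '<'
            if ch == '<':
                state = 1
                if len(text) > 0:
                    tokens.append(text)
                    text = ""
            else:
                text += ch
        # dumping bad chars state
        elif state == 1:  # looking for '>'
            if ch == '>':
                state = 0
    # keep any text that follows the last tag
    if len(text) > 0:
        tokens.append(text)
    return tokens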
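
The "label followed by the piece of data" idea can also be sketched with the standard library's parser. This is only a rough illustration, not the method from the post: it assumes Python 3 and its html.parser module (not the htmllib module mentioned in the question), and the names TextCollector, label_value_pairs, and the sample page are made up for the example.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the non-blank text chunks found between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called with the text between tags; skip whitespace-only runs.
        text = data.strip()
        if text:
            self.chunks.append(text)


def label_value_pairs(html, labels):
    """Map each wanted label to the text chunk that follows it."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    chunks = collector.chunks
    found = {}
    # Stop one short of the end so chunks[i + 1] always exists.
    for i, chunk in enumerate(chunks[:-1]):
        if chunk in labels:
            found[chunk] = chunks[i + 1]
    return found


# Hypothetical sample page, for illustration only.
page = ("<table><tr><td>Price:</td><td>12.50</td></tr>"
        "<tr><td>Stock:</td><td>7</td></tr></table>")
print(label_value_pairs(page, {"Price:", "Stock:"}))
```

The resulting dictionary is then ready to be handed to whatever database code writes the values out. A real page will probably need more care (labels that repeat, values split across nested tags), so treat this as a starting point.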