[Tutor] Newbie question - syntax - BeautifulSoup

Tommy Kaas tommy.kaas at kaasogmulvad.dk
Wed Jul 28 18:17:50 CEST 2010


I have just begun a struggle learning Python. I have read most of "Beginning
Python - from Novice to Professional" - and some of it I even understood :-)

This is my first question to the list, and I'm sure it won't be the last.

I'm especially interested in learning web scraping techniques, and here:
http://stackoverflow.com/questions/2081586/web-scraping-with-python I found
a small example:

 

import urllib2
from BeautifulSoup import BeautifulSoup

soup = BeautifulSoup(urllib2.urlopen('http://www.timeanddate.com/worldclock/astronomy.html?n=78').read())

for row in soup('table', {'class' : 'spad'})[0].tbody('tr'):
    tds = row('td')
    print tds[0].string, tds[1].string
    # will print date and sunrise

The example works fine, and I can change it a bit and it still works. But I
simply don't understand how I am supposed to read the fourth line, the part
after "for row in soup". I can clearly see that it defines the area I want to
scrape, but how is the syntax built? And almost as important: where should I
have found that information myself? I have tried to read the help file of
BeautifulSoup, but found nothing there.
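
For reference, here is a sketch of one way to read that line. In
BeautifulSoup 3, calling a soup or tag object like a function is shorthand
for its findAll() method, and an attribute such as .tbody returns the first
matching tag found inside it. The toy class below is only a hypothetical
stand-in: MiniTag and its sample data are made up for illustration and are
not part of BeautifulSoup. It mimics that behaviour with Python's __call__
and __getattr__ hooks, and is written so it should run under Python 2 and 3:

```python
# Hypothetical stand-in for a BeautifulSoup tag (NOT the real library).
# It only shows how obj(...) and obj.name can be mapped onto findAll().

class MiniTag(object):
    def __init__(self, name, attrs=None, children=None, string=None):
        self.name = name
        self.attrs = attrs or {}
        self.children = children or []
        self.string = string

    def findAll(self, name, attrs=None):
        """Return every nested tag with this name and these attributes."""
        wanted = attrs or {}
        found = []
        for child in self.children:
            if child.name == name and all(
                    child.attrs.get(key) == value
                    for key, value in wanted.items()):
                found.append(child)
            found.extend(child.findAll(name, attrs))
        return found

    def __call__(self, name, attrs=None):
        # tag('tr') is simply shorthand for tag.findAll('tr').
        return self.findAll(name, attrs)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails:
        # tag.tbody means "the first nested tag named tbody".
        if name.startswith('__'):
            raise AttributeError(name)
        matches = self.findAll(name)
        if not matches:
            raise AttributeError(name)
        return matches[0]


# Made-up sample data shaped like the table in the example.
rows = [
    MiniTag('tr', children=[MiniTag('td', string='1 Jul'),
                            MiniTag('td', string='04:30')]),
    MiniTag('tr', children=[MiniTag('td', string='2 Jul'),
                            MiniTag('td', string='04:31')]),
]
table = MiniTag('table', {'class': 'spad'},
                [MiniTag('tbody', children=rows)])
soup = MiniTag('html', children=[table])

# The short form from the example ...
short_form = soup('table', {'class': 'spad'})[0].tbody('tr')
# ... and the same thing spelled out step by step.
first_table = soup.findAll('table', {'class': 'spad'})[0]
long_form = first_table.findAll('tbody')[0].findAll('tr')

pairs = [(row('td')[0].string, row('td')[1].string) for row in short_form]
print(pairs)
```

Read this way, the fourth line says: find all tables with class "spad",
take the first one, go to its tbody, and loop over every tr inside it.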

 

Thanks in advance.

Tommy Kaas
Journalist
Kaas & Mulvad
Copenhagen, Denmark
