Beautiful Soup iterator question....

cjl cjlesh at gmail.com
Fri Apr 20 20:36:08 CEST 2007


P:

I am screen-scraping a table. The table has an unknown number of rows,
but each row has exactly 8 cells.  I would like to extract the data
from the cells, but the first three cells in each row have their data
nested inside other tags.

So I have the following code:

for row in table.findAll("tr"):
    for cell in row.findAll("td"):
        print cell.contents[0]

This code prints out all the data, but of course the first three cells
still contain their unwanted tags.

I would like to do something like this:

for cell1, cell2, cell3, cell4, cell5, cell6, cell7, cell8 in row.findAll("td"):

Then treat each cell differently.

I can't figure this out. Can anyone point me in the right direction?
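
To make the goal concrete, here is a minimal sketch of the pattern I am
after. It assumes the unpacking belongs in an assignment inside the row
loop rather than in the for statement itself, since row.findAll("td")
returns a list of the row's cells. To keep the example self-contained it
uses the standard library's xml.etree.ElementTree as a stand-in for
Beautiful Soup, and the sample table, cell_text, and extract_rows are
hypothetical names made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real table would come from the scraped page.
SAMPLE = """<table>
<tr><th>h1</th><th>h2</th></tr>
<tr><td><b>a1</b></td><td><i>a2</i></td><td><a href="#">a3</a></td><td>a4</td><td>a5</td><td>a6</td><td>a7</td><td>a8</td></tr>
<tr><td><b>b1</b></td><td><i>b2</i></td><td><a href="#">b3</a></td><td>b4</td><td>b5</td><td>b6</td><td>b7</td><td>b8</td></tr>
</table>"""


def cell_text(cell):
    """Return all text inside a cell, including text nested in child tags."""
    return "".join(cell.itertext()).strip()


def extract_rows(table):
    """Return one 8-tuple of strings per data row of the table."""
    rows = []
    for row in table.findall("tr"):
        cells = row.findall("td")
        if len(cells) != 8:
            # Skip header rows or malformed rows instead of failing to unpack.
            continue
        # Unpack the list of cells for this row into eight names.
        c1, c2, c3, c4, c5, c6, c7, c8 = cells
        # The first three cells have nested tags, so collect their inner text.
        nested = tuple(cell_text(c) for c in (c1, c2, c3))
        # The remaining cells hold plain text directly.
        plain = tuple((c.text or "").strip() for c in (c4, c5, c6, c7, c8))
        rows.append(nested + plain)
    return rows


table = ET.fromstring(SAMPLE)
rows = extract_rows(table)
for r in rows:
    print(r)
```

With Beautiful Soup the same shape should apply: inside the row loop,
assign the result of row.findAll("td") to eight names, then handle each
cell as needed. The length check guards against rows that do not have
exactly 8 cells.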

-CJL



