[Tutor] BeautifulSoup - getting cells without new line characters
jonasmg at softhome.net
Fri Mar 31 18:29:33 CEST 2006
Kent Johnson writes:
> jonasmg at softhome.net wrote:
>> Kent Johnson writes:
>>
>>
>>>jonasmg at softhome.net wrote:
>>>
>>>> From a table, I want to get the cells and then choose only some of them.
>>>>
>>>><table>
>>>><tr>
>>>><td>WY</td>
>>>><td>Wyo.</td>
>>>></tr>
>>>>...
>>>></table>
>>>>
>>>>Using:
>>>>
>>>>for row in table('tr'): print row.contents
>>>>
>>>> ['\n', <td>WY</td>, '\n', <td>Wyo.</td>, '\n']
>>>> [...]
>>>>
>>>>I get a newline character between each cell.
>>>>
>>>>Is it possible to get them without those '\n'?
>>>
>>>Well, the newlines are in your data, so you need to strip them or ignore
>>>them somewhere.
>>
>> For each row, I only want to get some positions (e.g.
>> row.contents[0], row.contents[2])
>
> It sounds like you should just work with row('td') instead of
> row.contents. That will give you a list of just the <td> elements.
>
> Kent
You're right, but the problem is that some cells contain anchors.
Sorry, I forgot to mention that.

Using:

for row in table('tr'):
    cellText = [cell.string for cell in row('td')]
    print cellText

I get null values for the cells that contain anchors.
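
For reference, here is a minimal sketch of one way to pull the text out of such cells. It assumes the current bs4 package on Python 3, not the BeautifulSoup release used in this thread, and the sample table with its anchor (href="#wy") is made up for illustration. get_text() gathers every string nested inside a cell, so text wrapped in an <a> tag is included:

```python
from bs4 import BeautifulSoup

# Hypothetical sample data: the first cell wraps its text in an anchor.
html = """
<table>
<tr>
<td><a href="#wy">WY</a></td>
<td>Wyo.</td>
</tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

rows = []
for row in soup.find("table").find_all("tr"):
    # find_all("td") returns only the <td> elements, so the newline
    # strings between cells never appear in the result.
    # get_text(strip=True) joins all nested strings and trims whitespace.
    rows.append([cell.get_text(strip=True) for cell in row.find_all("td")])

print(rows)
```

Individual positions can then be picked from each inner list by index, for example rows[0][0] for the first cell of the first row.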