Try using lxml; it's very useful.<br><br><div class="gmail_quote">On Sat, Dec 11, 2010 at 11:24 AM, Martin Kaspar <span dir="ltr"><<a href="mailto:martin.kaspar@campus-24.com">martin.kaspar@campus-24.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">Hello community,<br>
<br>
I am new to Python and to Beautiful Soup as well!<br>
It is said to be a great tool for parsing and extracting content, so<br>
here I am.<br>
<br>
I want to extract the content of a <td> tag from a table in an HTML<br>
document. For example, I have this table:<br>
<br>
<table class="bp_ergebnis_tab_info"><br>
<tr><br>
<td><br>
This is a sample text<br>
</td><br>
<br>
<td><br>
This is the second sample text<br>
</td><br>
</tr><br>
</table><br>
<br>
How can I use Beautiful Soup to extract the text "This is a sample text"?<br>
<br>
Should I use<br>
soup.findAll('table', attrs={'class': 'bp_ergebnis_tab_info'}) to get<br>
the whole table?<br>
<br>
See the target <a href="http://www.schulministerium.nrw.de/BP/SchuleSuchen?action=799.601437941842&SchulAdresseMapDO=142323" target="_blank">http://www.schulministerium.nrw.de/BP/SchuleSuchen?action=799.601437941842&SchulAdresseMapDO=142323</a><br>
<br>
So, what do we have to do?<br>
<br>
The first thing is to find the table.<br>
<br>
I do this with find rather than findAll, because find returns the first<br>
match directly (rather than a list of all matches, in which case we'd<br>
have to add an extra [0] to take the first element of the list):<br>
<br>
<br>
table = soup.find('table', attrs={'class': 'bp_ergebnis_tab_info'})<br>
<br>
Then use find again, this time on the table we just found, to get its first td:<br>
<br>
first_td = table.find('td')<br>
<br>
Then we have to use renderContents() to extract the textual contents:<br>
<br>
text = first_td.renderContents()<br>
<br>
... and the job is done (though we may also want to use strip() to<br>
remove leading and trailing whitespace):<br>
<br>
trimmed_text = text.strip()<br>
<br>
This should give us:<br>
<br>
<br>
print trimmed_text<br>
This is a sample text<br>
<br>
as desired.<br>
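<br>
For comparison, the same three steps (find the table, find its first td, strip the text) could be sketched as follows.<br>
This is only a sketch under assumptions that are not in the post: it uses Python 3 and the newer bs4 package with get_text() instead of renderContents(), and the helper name first_cell_text is made up for illustration.<br>

```python
from bs4 import BeautifulSoup  # assumption: Beautiful Soup 4 (bs4) is installed

# The sample table from the post, used here as test input.
SAMPLE_HTML = """
<table class="bp_ergebnis_tab_info">
<tr>
<td>
This is a sample text
</td>
<td>
This is the second sample text
</td>
</tr>
</table>
"""


def first_cell_text(html, table_class="bp_ergebnis_tab_info"):
    """Return the stripped text of the first <td> in the matching table, or None.

    Hypothetical helper, not part of Beautiful Soup itself.
    """
    soup = BeautifulSoup(html, "html.parser")
    # Step 1: find the first table carrying the given class.
    table = soup.find("table", attrs={"class": table_class})
    if table is None:
        return None
    # Step 2: search inside that table only, not the whole document.
    first_td = table.find("td")
    if first_td is None:
        return None
    # Step 3: take the text and remove leading/trailing whitespace.
    return first_td.get_text().strip()


if __name__ == "__main__":
    print(first_cell_text(SAMPLE_HTML))
```

Returning None when the table or cell is missing avoids an AttributeError on pages that do not contain the expected table.<br>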
<br>
<br>
What do you think about the code? I would love to hear from you!<br>
<br>
greetings<br>
matze<br>
</blockquote></div><br><br clear="all"><br>-- <br>Nitin Pawar<br><br>