Parsing HTML with BeautifulSoup

Johann Spies jspies at sun.ac.za
Fri Dec 11 08:04:38 CET 2009


Gabriel Genellina wrote:
> On Thu, 10 Dec 2009 06:15:19 -0300, Johann Spies <jspies at sun.ac.za> 
> wrote:
>
>> How do I get BeautifulSoup to render (taking the above line as
>> example)
>>
>> sunetint for <img src=icons/group.png>&nbsp;<a
>> href=#OBJ_sunetint>sunetint</A><BR>
>>
>> and still provide the text parts in the <td>s as plain text?
>
> Hard to tell if we don't see what's inside those <td>'s - please 
> provide at least a few rows of the original HTML table.
>
Thanks for your reply. 

Here are a few lines:

<!------- Rule 1 ------->
<tr style="background-color: #ffffff"><td class=normal>2</td><td><img 
src=icons/usrgroup.png>&nbsp;All Users at Any<br><td><im$
</td><td><img src=icons/any.png>&nbsp;Any<br></td><td><img 
src=icons/clientencrypt.png>&nbsp;clientencrypt</td><td><img src$
&nbsp;</td><td>&nbsp;</td></tr>

<!------- Rule 2 ------->
<tr style="background-color: #eeeeee"><td class=normal>3</td><td><img 
src=icons/any.png>&nbsp;Any<br><td><img src=icons/any$
&nbsp;</td><td>&nbsp;</td></tr>

<!------- Rule 3 ------->
<tr style="background-color: #ffffff"><td class=normal>4</td><td><img 
src=icons/group.png>&nbsp;<a href=#OBJ_Rainwall_Group$
<td><img src=icons/group.png>&nbsp;<a href=#OBJ_Rainwall_Group 
 >Rainwall_Group</A> <BR>
</td><td><img src=icons/udp.png>&nbsp;<a href=#SVC_RainWall_Stop 
 >RainWall_Stop</a><br></td><td><img src=icons/drop.png>&nb$
&nbsp;</td><td>&nbsp;</td></tr>

<!------- Rule 4 ------->
<tr style="background-color: #eeeeee"><td class=normal>5</td><td><img 
src=icons/host.png>&nbsp;<a href=#OBJ_Rainwall_Broadc$
<img src=icons/group.png>&nbsp;<a href=#OBJ_Rainwall_Group 
 >Rainwall_Group</A> <BR>
<td><img src=icons/group.png>&nbsp;<a href=#OBJ_Rainwall_Group 
 >Rainwall_Group</A> <BR>
<img src=icons/host.png>&nbsp;<a href=#OBJ_Rainwall_Broadcast 
 >Rainwall_Broadcast</A> <BR>
</td><td><img src=icons/udp.png>&nbsp;<a href=#SVC_RainWall_Daemon 
 >RainWall_Daemon</a><br></td><td><img src=icons/accept.p$
&nbsp;</td><td>&nbsp;</td></tr>
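
To make the goal concrete, here is a minimal sketch of the plain-text output wanted per cell. It assumes the bs4 package (BeautifulSoup 4) with the standard library "html.parser" backend, which may not match the BeautifulSoup version in use here. SAMPLE is a hypothetical, well-formed stand-in row, because the rows above are truncated at the `$` markers, and rows_as_text is only an illustrative name.

```python
from bs4 import BeautifulSoup

# Hypothetical, well-formed stand-in for one table row; the real rows
# quoted in this message are truncated and cannot be parsed reliably.
SAMPLE = """
<table>
<tr style="background-color: #ffffff"><td class=normal>2</td>
<td><img src=icons/group.png>&nbsp;<a href=#OBJ_sunetint>sunetint</A><BR></td>
<td><img src=icons/any.png>&nbsp;Any<br></td>
<td>&nbsp;</td></tr>
</table>
"""


def rows_as_text(html):
    """Return one list of plain-text cell values per <tr>."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        # get_text(" ", strip=True) drops the tags (<img>, <a>, <br>) and
        # trims whitespace, including the non-breaking space from &nbsp;.
        rows.append([td.get_text(" ", strip=True) for td in tr.find_all("td")])
    return rows


if __name__ == "__main__":
    for row in rows_as_text(SAMPLE):
        print(row)
```

With the stand-in row this should print ['2', 'sunetint', 'Any', '']. Rows with unclosed <td> tags, as in the real data, may nest differently depending on the parser backend, so the result there is not guaranteed to be the same.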

Regards
Johann

-- 
Johann Spies          Telephone: 021-808 4599
Information Technology, Universiteit van Stellenbosch

     "Lo, children are an heritage of the LORD: and the  
      fruit of the womb is his reward."        Psalms 127:3 




