Parsing HTML with BeautifulSoup

Johann Spies jspies at sun.ac.za
Mon Dec 14 01:58:34 EST 2009


On Sun, Dec 13, 2009 at 07:58:55AM -0300, Gabriel Genellina wrote:

> this code should serve as a starting point:

Thank you very much!

> cell.findAll(text=True) returns a list of all text nodes inside a
> <td> cell; I preprocess all \n and &nbsp; in each text node, and
> join them all. lines is a list of lists (each entry one cell), as
> expected by the csv module used to write the output file.

I struggled a bit to find documentation for the text=True argument.
Most of the BeautifulSoup documentation I saw consisted of examples
that did not explain what the options do.  Thanks for your
explanation.
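
For the archive, here is a minimal sketch of the approach described in the
quoted text, not the original code from the thread. It assumes Beautiful
Soup 4 (the bs4 package) is installed, where the call is spelled
find_all(string=True); older code writes the same thing as
findAll(text=True). The sample table and the helper name cell_text are
made up for illustration.

```python
import csv
import io

from bs4 import BeautifulSoup  # assumption: Beautiful Soup 4 is installed

# Hypothetical sample table, not taken from the thread.
html = """
<table>
  <tr><td>Name</td><td>Office&nbsp;phone</td></tr>
  <tr><td>Alice
Smith</td><td><b>021</b> 555</td></tr>
</table>
"""


def cell_text(cell):
    """Join all text nodes of one <td>, replacing newlines and &nbsp;."""
    # string=True matches every text node inside the cell, including
    # text nested in child tags such as <b>.
    parts = cell.find_all(string=True)
    # The parser decodes &nbsp; to the character U+00A0.
    cleaned = [p.replace("\n", " ").replace("\xa0", " ") for p in parts]
    return "".join(cleaned).strip()


soup = BeautifulSoup(html, "html.parser")

# A list of lists (one inner list per row, one entry per cell), which is
# the shape csv.writer.writerows() expects.
lines = [[cell_text(td) for td in tr.find_all("td")]
         for tr in soup.find_all("tr")]

# Written to an in-memory buffer here; a real script would open a file
# with newline="" instead.
out = io.StringIO()
csv.writer(out).writerows(lines)
```

The text node inside <b> ends up in the same cell as the text that follows
it, because all the text nodes of a cell are joined before the row is
written.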

As far as I can see, no documentation was installed with the
Debian package.

Regards
Johann
-- 
Johann Spies          Telefoon: 021-808 4599
Informasietegnologie, Universiteit van Stellenbosch

     "But I will hope continually, and will yet praise thee 
      more and more."                  Psalms 71:14 


