Parsing html with Beautifulsoup
Johann Spies
jspies at sun.ac.za
Mon Dec 14 01:58:34 EST 2009
On Sun, Dec 13, 2009 at 07:58:55AM -0300, Gabriel Genellina wrote:
> this code should serve as a starting point:
Thank you very much!
> cell.findAll(text=True) returns a list of all text nodes inside a
> <td> cell; I preprocess all \n and &nbsp; in each text node, and
> join them all. lines is a list of lists (each entry one cell), as
> expected by the csv module used to write the output file.
I have struggled a bit to find the documentation for (text=True).
Most of the documentation for BeautifulSoup that I saw contained only
some examples, without explaining what the options do. Thanks for your
explanation.
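
For anyone else searching for what text=True does, here is a minimal sketch of the approach described in the quoted text. It is not Gabriel's original code. It assumes the newer bs4 package (the 2009-era package was imported with `from BeautifulSoup import BeautifulSoup`), and the sample table and the helper names cell_text and table_rows are made up for illustration.

```python
import csv
import io

# Assumption: the bs4 package is installed; older releases used
# `from BeautifulSoup import BeautifulSoup` instead.
from bs4 import BeautifulSoup

# Hypothetical sample data, not taken from the original thread.
HTML = """
<table>
  <tr><td>Name</td><td>Phone&nbsp;number</td></tr>
  <tr><td>Alice
Example</td><td><b>021</b>-555 0100</td></tr>
</table>
"""


def cell_text(cell):
    """Join every text node found inside one <td> cell."""
    # text=True matches all text nodes under the cell, at any depth,
    # so text inside nested tags such as <b> is included too.
    parts = cell.findAll(text=True)
    # Replace newlines and non-breaking spaces (the parsed form of
    # &nbsp;) with plain spaces before joining.
    cleaned = [p.replace("\n", " ").replace("\xa0", " ") for p in parts]
    return "".join(cleaned).strip()


def table_rows(html):
    """Return a list of rows, each row a list of cell strings."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        [cell_text(td) for td in tr.findAll("td")]
        for tr in soup.findAll("tr")
    ]


rows = table_rows(HTML)

# The csv module expects exactly this list-of-lists shape.
buf = io.StringIO()
csv.writer(buf).writerows(rows)
print(buf.getvalue())
```

Newer bs4 releases spell the same call find_all(string=True); the older findAll(text=True) form is used here only to match the thread.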
As far as I can see, no documentation was installed with the
Debian package.
Regards
Johann
--
Johann Spies Telefoon: 021-808 4599
Informasietegnologie, Universiteit van Stellenbosch
"But I will hope continually, and will yet praise thee
more and more." Psalms 71:14