[Tutor] Beautiful Soup, inserting a node?

Kent Johnson kent37 at tds.net
Fri Dec 2 11:55:23 CET 2005


Bob Tanner wrote:
> Kent Johnson wrote:
> 
> 
>>>Is there a way to insert a node with Beautiful Soup?
>>
>>BS doesn't really seem to be set up to support this. The Tags in a soup
>>are kept in a linked
> 
> 
> What would be the appropriate technology to use?

Fredrik Lundh's elementtidy uses the Tidy library to read (and clean up) HTML into an ElementTree representation, which you can then modify and write back out.
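
As a rough sketch of the modify-and-write step only: this assumes the page is already well-formed XHTML (the cleanup that elementtidy/Tidy would do is skipped here) and uses the ElementTree API from the standard library's xml.etree.ElementTree module. The sample markup and the inserted paragraph are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, already well-formed input. Real-world HTML would need a
# cleanup step (such as Tidy) before ElementTree could parse it.
xhtml = "<html><body><h1>Title</h1><p>First</p></body></html>"

root = ET.fromstring(xhtml)
body = root.find("body")

# Build the new node and insert it as the second child of <body>.
new_p = ET.Element("p")
new_p.text = "Inserted"
body.insert(1, new_p)

# Serialize the modified tree back to text.
result = ET.tostring(root, encoding="unicode")
print(result)
```

Element.insert(index, element) places the new node at a given position among the parent's children; append() would add it at the end instead.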

Here is a recipe that claims to be useful for modifying HTML pages:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/286269

These articles have more suggestions:
http://www.xml.com/pub/a/2004/09/08/pyxml.html
http://www.boddie.org.uk/python/HTML.html

Note that any process that converts HTML to an XML representation and then writes it back out will not, in general, preserve the exact text of the original HTML; it will convert it to XHTML. This may or may not be desirable. If exact fidelity to the original document is important, the cookbook recipe is probably the only one of these that will work.
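
A small illustration of the fidelity point, using the standard library's xml.etree.ElementTree as a stand-in for whichever parser you pick (the sample fragment is invented): even with no modification at all, parsing and re-serializing changes details such as attribute quoting and how empty tags are spelled.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment with single-quoted attributes and a compact empty tag.
original = "<p class='note'>Line one<br/>Line two</p>"

# Parse and immediately write back out, changing nothing in the tree.
round_tripped = ET.tostring(ET.fromstring(original), encoding="unicode")

print(original)
print(round_tripped)
print("identical text:", original == round_tripped)
```

The two strings describe the same tree but are not the same text, which matters if anything downstream compares or diffs the files.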

Depending on what kind of modifications you are making, a simple regex-based text processing approach might also work: don't parse the text as HTML, just look for the part you need to change and make the change directly.
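
A minimal sketch of that text-only approach, assuming the change is "add a paragraph right after the opening body tag" (the sample page and the inserted text are hypothetical). Everything outside the matched tag is left byte-for-byte as it was, sloppy markup included.

```python
import re

# Hypothetical page with upper-case tags, an unquoted attribute and
# unclosed <P> elements, i.e. markup an XML parser would reject.
html = "<HTML><BODY bgcolor=white>\n<P>Hello\n</BODY></HTML>"

# Match the opening body tag, whatever its case or attributes.
pattern = re.compile(r"(<body\b[^>]*>)", re.IGNORECASE)

# Re-emit the matched tag unchanged and add the new text right after it.
modified = pattern.sub(lambda m: m.group(1) + "\n<P>Inserted", html, count=1)
print(modified)
```

This only works when the spot to change can be found reliably with a pattern; it knows nothing about nesting, comments or scripts.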

Kent
-- 
http://www.kentsjohnson.com


