Is there a HTML parser who can reconstruct the original html EXACTLY?
paul at boddie.org.uk
Wed Jan 23 15:07:43 CET 2008
On 23 Jan, 14:20, kliu <ios... at gmail.com> wrote:
> Thank you for your reply, but what I really need is the mapping between
> each DOM node and the corresponding original source segment.
At the risk of promoting unfashionable DOM technologies, you can at
least serialise fragments of the DOM in libxml2dom:
import libxml2dom
# html=1: treat the fetched content as HTML rather than XML
d = libxml2dom.parseURI("http://www.diveintopython.org/", html=1)
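To illustrate what serialising a fragment gives you, here is a minimal sketch that uses the standard library's xml.dom.minidom in place of libxml2dom, so it runs without extra packages. It assumes well-formed markup (minidom will not accept typical tag-soup HTML), and the names SOURCE and serialise_fragments are made up for the example. Note that the result is a re-serialisation and not the original text: the single-quoted attribute comes back double-quoted.

```python
from xml.dom import minidom

# Hypothetical input; it must be well-formed XML for minidom to accept it.
SOURCE = "<html><body><p class='intro'>Hello,   <b>world</b></p></body></html>"


def serialise_fragments(source, tag_name):
    """Return the re-serialised markup of every element named tag_name."""
    doc = minidom.parseString(source)
    # toxml() rebuilds markup from the DOM tree; it does not replay the input.
    return [node.toxml() for node in doc.getElementsByTagName(tag_name)]


fragments = serialise_fragments(SOURCE, "p")
print(fragments[0])
# The fragment differs from the source text (attribute quoting changed),
# so a plain substring search for it in SOURCE fails.
print(fragments[0] in SOURCE)
```

So a serialised fragment can tell you what a node contains, but it cannot be relied upon to locate that node's exact segment in the original source.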
Storage and retrieval of the original line and offset information may
be supported by libxml2, but such information isn't exposed by