New subject: [lxml-dev] CDATA and lxml

11 Apr 2008

      Silfheed wrote:
...
So first off I know that CDATA is generally hated and just shouldn't
be done, but I'm simply required to parse it and spit it back out.
Parsing is pretty easy with lxml, but it's the spitting back out
that's giving me issues.  The fact that lxml strips all the CDATA
stuff off isn't really a big issue either, so long as I can create
CDATA blocks later with <>&'s showing up instead of &lt;&gt;&amp;.
I've scoured through the lxml docs, but probably not hard enough, so
anyone know the page I'm looking for or have a quick how to?

There's nothing in the docs because lxml doesn't allow you to create CDATA
sections. You're not the first one asking that, but so far, no one really had
a take on this.

It's not as trivial as it sounds. Removing the CDATA sections in the parser is
not just for fun. It simplifies the internal tree traversal and text aggregation,
so this would be affected if we allowed CDATA content in addition to normal
text content. It's not that hard, it's just that it hasn't been done so far.

Stefan
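
The behaviour discussed above can be reproduced with a small sketch. It is
an illustration only, not code from the thread: the sample document and the
variable names are made up, and the fallback to the standard library's
ElementTree (which treats this case the same way) is an assumption added so
the snippet runs without lxml installed.

```python
# Parse a document containing a CDATA section, then serialize it again.
try:
    from lxml import etree
except ImportError:
    # Assumption: stdlib ElementTree is used only as a stand-in here.
    import xml.etree.ElementTree as etree

# Hypothetical sample input.
source = "<root><![CDATA[a < b & c]]></root>"

root = etree.fromstring(source)

# With the default parser the CDATA wrapper is dropped and its content
# becomes ordinary text on the element.
text = root.text

# On output the text is escaped with entities rather than wrapped in CDATA.
serialized = etree.tostring(root)

print(text)
print(serialized)
```

The round trip keeps the character data but not the CDATA markup, which is
the situation the original question describes.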

Re: [lxml-dev] CDATA and lxml

Stefan Behnel
