[issue20714] Allow for ]]> in CDATA in minidom

Artur R. Czechowski report at bugs.python.org
Fri Feb 21 11:14:53 CET 2014


Artur R. Czechowski added the comment:

Eric, I'm not sure what exactly your concern is, but I'll try to address two issues I can see.

First: both strings <![CDATA[]]]]> and <![CDATA[>]]> are correct and valid examples of CDATA usage as per the specification [1].

Second: is it allowed to have two occurrences of CDATA inside one element? The same specification says only that ‟CDATA sections may occur anywhere character data may occur”. It says nothing about whether multiple occurrences are allowed or disallowed.
Wikipedia suggests in [2] that it is OK, giving the same example of embedding ]]> inside CDATA. There are no hints on the Talk page that this solution fails for anyone.
Another example [3] states explicitly that: ‟the [...] application shouldn't care about the difference between abc and <![CDATA[abc]]> and <![CDATA[a]]><![CDATA[bc]]>”.
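
The equivalence quoted from [3] can be checked against minidom itself. The following is only a minimal sketch, not part of the patch; the helper name text_content is made up for illustration, and it assumes the root element contains nothing but text and CDATA nodes:

```python
from xml.dom import minidom


def text_content(xml):
    """Concatenate the character data of all children of the root element.

    Hypothetical helper: assumes the root holds only text/CDATA nodes.
    """
    root = minidom.parseString(xml).documentElement
    return "".join(child.data for child in root.childNodes)


variants = [
    "<x>abc</x>",
    "<x><![CDATA[abc]]></x>",
    "<x><![CDATA[a]]><![CDATA[bc]]></x>",
]

for xml in variants:
    # Each variant should yield the same character data.
    print(repr(text_content(xml)))
```

The same helper applied to <foo><![CDATA[]]]]><![CDATA[>]]></foo> should give back the three characters ]]>, regardless of how many nodes the parser creates.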

Last but not least: using the following schema:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="foo"/>
</xs:schema>

the following XML file:

<?xml version="1.0" ?>
<foo>
<![CDATA[]]]]><![CDATA[>]]>
</foo>

validates correctly with xmllint:
$ xmllint -noout --schema schema.xsd t.xml
t.xml validates

I hope this resolves your concerns.

PS. I noticed I missed one ] in the provided patch. There should be four of them in the second parameter of replace.
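
For illustration, the replacement with four ] characters could look roughly like the sketch below. This is not the patch itself; to_cdata and read_back are hypothetical helper names, and the code only shows the string substitution plus a round trip through minidom's parser:

```python
from xml.dom import minidom


def to_cdata(text):
    """Wrap text in CDATA, splitting every ']]>' across two sections.

    Hypothetical helper; the second argument of replace() contains
    four ']' characters, as described in the PS above.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def read_back(text):
    """Serialize text inside <foo>, parse it again, and return the data."""
    doc = minidom.parseString("<foo>" + to_cdata(text) + "</foo>")
    return "".join(child.data for child in doc.documentElement.childNodes)


print(to_cdata("]]>"))
print(repr(read_back("a]]>b")))
```

The first ]] closes out the current section's content, the next ]]> ends that section, and the > is carried into a fresh section, so the parser never sees ]]> as character data.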

[1] http://www.w3.org/TR/REC-xml/#sec-cdata-sect
[2] http://en.wikipedia.org/wiki/CDATA#Nesting
[3] http://oxygenxml.com/archives/xsl-list/200502/msg00787.html

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue20714>
_______________________________________

