[lxml-dev] Simple doctypes not in docinfo.doctype
Hello list,

I've hit a snag with lxml and a DOCTYPE declaration. I don't know if I'm to blame here, but I would appreciate help either way. I've tried this with an old (1.3.2) and a newer (2.0.6) lxml version.

(This example is roughly based on the code at http://codespeak.net/lxml/tutorial.html)

    >>> from lxml import etree
    >>> from StringIO import StringIO
    >>> tree = etree.parse(StringIO("""<!DOCTYPE TS><TS></TS>"""))
    >>> tree.docinfo.doctype
    ''

From my understanding this DOCTYPE declaration is valid (and occurring in the wild in Qt .ts files). My real issue is round-trip problems in a reading-writing cycle where the DOCTYPE is lost, but I guess not being able to use .docinfo.doctype is already a problem.
Any help will be appreciated.

Keep well
Friedel

--
Recently on my blog: http://translate.org.za/blogs/friedel/en/content/vrot-mango
Hi, F Wolff wrote:
I've tried this with an old (1.3.2) and newer (2.0.6) lxml version.
(this example is roughly based on the code at http://codespeak.net/lxml/tutorial.html)

    >>> from lxml import etree
    >>> from StringIO import StringIO
    >>> tree = etree.parse(StringIO("""<!DOCTYPE TS><TS></TS>"""))
    >>> tree.docinfo.doctype
    ''

From my understanding this DOCTYPE declaration is valid (and occurring in the wild in Qt .ts files). My real issue is round-trip problems in a reading-writing cycle where the DOCTYPE is lost, but I guess not being able to use .docinfo.doctype is already a problem.
I agree that better handling is desirable here. Could you file a bug report so that this doesn't get lost (and so that you get notified of any further development)?

https://bugs.launchpad.net/lxml

If you want to give it a try yourself, the DOCTYPE writing code is in src/lxml/serializer.pxi, function _writeDtdToBuffer(); the docinfo code is in lxml.etree.pyx, class DocInfo. Patches and test cases (src/lxml/tests/test_etree.py) are welcome.

Thanks,
Stefan
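
To make the requested behaviour concrete, here is a minimal pure-Python sketch of how a DOCTYPE string could be assembled from a root name, a public ID and a system URL, including the bare `<!DOCTYPE TS>` case reported above. The function name `format_doctype` and its parameters are hypothetical and chosen for illustration; this is not the code in class DocInfo or in _writeDtdToBuffer(), only an outline of the cases a test would need to cover.

```python
def format_doctype(root_name, public_id=None, system_url=None):
    """Return a DOCTYPE declaration string, or '' if there is no root name.

    Hypothetical helper for illustration; not part of lxml.
    """
    if not root_name:
        # No DOCTYPE was declared at all.
        return ""
    if public_id and system_url:
        return '<!DOCTYPE %s PUBLIC "%s" "%s">' % (root_name, public_id, system_url)
    if public_id:
        # Public ID without a system URL: seen in HTML doctypes,
        # but not well-formed in XML, which requires a system literal.
        return '<!DOCTYPE %s PUBLIC "%s">' % (root_name, public_id)
    if system_url:
        return '<!DOCTYPE %s SYSTEM "%s">' % (root_name, system_url)
    # The case from this thread: a root name and nothing else.
    return "<!DOCTYPE %s>" % root_name


if __name__ == "__main__":
    print(format_doctype("TS"))
    print(format_doctype("TS", system_url="ts.dtd"))
```

The last branch is the one that matters here: a declaration with neither a public ID nor a system URL should still yield a non-empty string rather than `''`.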
Hi, Stefan Behnel wrote:
I agree that better handling is desirable here. Could you file a bug report so that this doesn't get lost?
OK, I fixed it anyway. Here's a patch.

Stefan
On Sat, 2008-10-25 at 20:16 +0200, Stefan Behnel wrote:
Ok, I fixed it anyway. Here's a patch.
Stefan
Thank you Stefan! I haven't even gotten round to the bug report yet, and you already have it fixed! At the time I implemented a workaround, but I hope to test this issue with your proper fix soon.

Thank you again.
Friedel

--
Recently on my blog: http://translate.org.za/blogs/friedel/en/content/its-easyer-with-kulula
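
The workaround itself is not shown in this thread. One possible approach, sketched here purely as an assumption, is to capture the DOCTYPE from the raw text before parsing and prepend it again when writing. The sketch uses the standard library's xml.etree.ElementTree as a stand-in for lxml so that it runs without extra dependencies; the function name `roundtrip_keeping_doctype` and the regular expression are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Matches a DOCTYPE declaration without an internal subset, e.g. <!DOCTYPE TS>.
# Assumption: no '>' inside quoted identifiers, and no DOCTYPE-like text in comments.
_DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+[^>\[]+>")


def roundtrip_keeping_doctype(xml_text):
    """Parse xml_text, serialize it again, and re-attach the original DOCTYPE."""
    match = _DOCTYPE_RE.search(xml_text)
    doctype = match.group(0) if match else ""

    root = ET.fromstring(xml_text)
    # ... modify the tree here ...
    body = ET.tostring(root, encoding="unicode")

    # Only prepend a declaration if the input actually had one.
    return doctype + "\n" + body if doctype else body


if __name__ == "__main__":
    print(roundtrip_keeping_doctype("<!DOCTYPE TS><TS><context/></TS>"))
```

A regular expression is a blunt tool for this: it ignores internal subsets and can be fooled by unusual input, so it is only reasonable for simple declarations such as the one in Qt .ts files.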
participants (2)
- F Wolff
- Stefan Behnel