an odd result with addprevious
I see my mistake: I assumed that the text string was part of the <l> element, but it’s the “tail” of the <pb/> element, and the <l> element appears, quite properly, as an empty element. This is a part of lxml that I need to wrap my head around.

I am trying to move <pb/> elements in a TEI document from a position at the top of a block to a position between the block elements. So I thought up a toy problem, using the following text:

    <root>
    <l>Mary had a little lamb, little lamb,</l>
    <l><pb/>little lamb, Mary had a little lamb</l>
    <l>whose fleece was white as snow.</l>
    <l>And everywhere that Mary went</l>
    <l>Mary went, Mary went, everywhere</l>
    <l>that Mary went</l>
    <l> The lamb was sure to go.</l>
    </root>

I used the following code:

    from lxml import etree

    Mary = '/users/martin/dropbox/allerlei/Mary.xml'
    tree = etree.parse(Mary)
    for pb in tree.iter('pb'):
        parent = pb.getparent()
        print(parent.tag)
        # make pb the preceding sibling of its parent; pb's tail text travels with it
        parent.addprevious(pb)
    print(etree.tostring(tree, encoding='unicode'))

This produces the following output:

    l
    <root>
    <l>Mary had a little lamb, little lamb,</l>
    <pb/>little lamb, Mary had a little lamb<l/>
    <l>whose fleece was white as snow.</l>
    <l>And everywhere that Mary went</l>
    <l>Mary went, Mary went, everywhere</l>
    <l>that Mary went</l>
    <l> The lamb was sure to go.</l>
    </root>

The output gets most things right and moves the pb element from child of l to child of root. But the output doesn’t show the opening tag for the l element, so the document looks as if it were not well-formed.

Oddly enough, this code works as intended in a TEI text where all text is wrapped in <w> elements. Since that is the kind of text I’m working with, I’m happy enough. But I am curious about the dropping of the opening tag of the l element. Is that a bug, am I doing something wrong, or do I just not know enough about “mixed XML”?
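For the mixed-content case, one possible workaround is to detach the tail from the <pb/> before moving it, so that the text stays inside the <l>. The following is only a sketch under assumptions: the helper name move_before_parent is made up for illustration, and the toy text is inlined as a string rather than read from the Mary.xml file.

```python
from lxml import etree

# Shortened copy of the toy document, inlined for the example.
XML = """<root>
<l>Mary had a little lamb, little lamb,</l>
<l><pb/>little lamb, Mary had a little lamb</l>
<l>whose fleece was white as snow.</l>
</root>"""


def move_before_parent(elem):
    """Hypothetical helper: move elem in front of its parent,
    leaving elem's tail text behind inside the parent."""
    parent = elem.getparent()
    tail = elem.tail or ""
    prev = elem.getprevious()
    if prev is not None:
        # The text now follows the previous sibling, so it becomes its tail.
        prev.tail = (prev.tail or "") + tail
    else:
        # elem was the first child, so the text becomes the parent's text.
        parent.text = (parent.text or "") + tail
    elem.tail = None  # nothing left to travel with elem
    parent.addprevious(elem)


root = etree.fromstring(XML)
# Collect the matches first so the tree is not modified while iterating.
for pb in list(root.iter("pb")):
    move_before_parent(pb)

result = etree.tostring(root, encoding="unicode")
print(result)
```

With the tail handed over first, the <pb/> should end up directly in front of an <l> that still contains its text. In a text where everything is wrapped in <w> elements, the <pb/> has no meaningful tail, which would explain why the original loop already behaves as intended there.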
participants (1)
-
Martin Mueller