
Hi Holger,

Can you elaborate a bit on what makes lxml.objectify less suitable for these cases?

In most cases, the XML is being processed by code that knows exactly what the schema of the document is, and hence using lxml.objectify is a perfect match. However, in a few cases I'm writing code that's trying to generically process XML across documents whose schemas are structurally similar but differ in detail. E.g., it might be processing a document that looks like:

    <X>
      <A/>
      <A/>
      <M/>
      <M/>
      <M/>
      <M/>
    </X>

or:

    <X>
      <A/>
      <A/>
      <N/>
      <N/>
      <N/>
      <N/>
    </X>

I.e., both are rooted at <X> and start with some number of <A> sub-elements, followed by some number of *either* <M> or <N> sub-elements. The processing code doesn't care about the structure of the <M> or <N> elements; in fact, the documents it's processing could have any number of any single type of sub-element following the <A> sub-elements, and the code would not care about their details. However, the code wants to return a list of objectified versions of those elements. Maybe there's a way to do all this entirely in "objectified mode", but I haven't figured it out.

Interesting question. I don't know of any obvious conversion method, but I'd say just go for serialization & re-parsing. It's an area where lxml usually shines speed-wise. Unless your performance/memory measurements tell you this is not the way to go for your use case, of course...

Yeah, I'm not too worried about the serialization/deserialization cost in my particular case. It just seemed a bit unclean and I was looking for something less unclean.

Thanks.
Nat