Hi Holger,

finally coming back to this.

jholg@gmx.de wrote:
I just noticed that annotate() does not add type information to empty string elements when parsed: I know this happens due to the .text of the node being None in the first case instead of '' in the second case (which is lxml/ElementTree/libxml2 behaviour that bites me time and again). Still, I'd prefer to have annotate() provide all data elements with type information; after all, the element in question is treated as a StringElement (the default empty_data_class) anyway.
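For reference, the None-versus-'' difference described above can be reproduced with the standard library's ElementTree alone. This is only a minimal sketch: the element name "s" and the variable names are made up for illustration, and it does not touch objectify or annotate() at all.

```python
import xml.etree.ElementTree as ET

# Case 1: an empty element that comes out of the parser carries no text.
parsed = ET.fromstring("<root><s></s></root>")
parsed_text = parsed.find("s").text

# Case 2: an element whose text was explicitly set to the empty string.
built = ET.Element("root")
ET.SubElement(built, "s").text = ""
built_text = built.find("s").text

# The two elements look alike but report different .text values.
print(repr(parsed_text), repr(built_text))
```

Any code that decides on a type by looking at .text therefore sees "no value" in the parsed case and an empty string in the other.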
Well, since the default class is configurable in the lookup class, I'd prefer to have it configurable in annotate(), too. So I would add an "empty_type" keyword argument that takes the Python type name as a string. Question: should this default to None or to "str"? I have no strong preference myself, though "str" would mean we annotate everything by default, so that sounds better.

Stefan
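P.S. To make the proposal a bit more concrete, here is a rough sketch on plain ElementTree rather than on the real objectify code. Everything in it is hypothetical (annotate_sketch, guess_type, and the bare "pytype" attribute are stand-ins I made up for illustration); only the "empty_type" keyword and the None-or-"str" default question come from the text above.

```python
import xml.etree.ElementTree as ET

# Stand-in attribute name for this sketch, not the attribute objectify uses.
PYTYPE_ATTR = "pytype"


def guess_type(text):
    """Return a Python type name for a text value (hypothetical helper)."""
    for type_name, convert in (("int", int), ("float", float)):
        try:
            convert(text)
            return type_name
        except ValueError:
            pass
    return "str"


def annotate_sketch(root, empty_type="str"):
    """Annotate leaf elements with a type name (hypothetical helper).

    Elements without text are annotated with empty_type, or left
    untouched when empty_type is None.
    """
    for elem in root.iter():
        if len(elem):
            # Has children, so it is a tree node rather than a data element.
            continue
        if elem.text is None:
            if empty_type is not None:
                elem.set(PYTYPE_ATTR, empty_type)
        else:
            elem.set(PYTYPE_ATTR, guess_type(elem.text))
    return root


doc = annotate_sketch(ET.fromstring("<root><n>42</n><s></s></root>"))
print(doc.find("n").get(PYTYPE_ATTR), doc.find("s").get(PYTYPE_ATTR))
```

With the "str" default, the empty element gets annotated like every other data element; passing empty_type=None would leave it without any type information, which is the behaviour reported above.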