[lxml-dev] should _setElementValue add type attributes?
Hi,

I discussed this with Stefan before and I'm anxious to know if this is the way to go (maybe as switchable behaviour), removing the need for a beast like the discussed PT() factory, as well as making type behaviour arguably more "straightforward", at the cost of auto-adding py:pytype attributes:

    # _setElementValue implementation that auto-adds type(RVAL).__name__ as
    # py:pytype
    cdef _setElementValue(_Element element, value):
        if value is None:
            cetree.setAttributeValue(
                element, XML_SCHEMA_INSTANCE_NIL_ATTR, "true")
        elif isinstance(value, _Element):
            _replaceElement(element, value)
        else:
            cetree.delAttributeFromNsName(
                element._c_node, _XML_SCHEMA_INSTANCE_NS, "nil")
            if not python._isString(value):
                pytype_name = type(value).__name__
                if isinstance(value, bool):
                    value = _lower_bool(value)
                else:
                    value = str(value)
            else:
                pytype_name = "str"
            cetree.setAttributeValue(element, PYTYPE_ATTRIBUTE, pytype_name)
            cetree.setNodeText(element._c_node, value)

I'm +1 for that. By making it switchable we could cater for those who don't care about the types that much but who do not want to see any non-explicitly created attributes.

Holger
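For readers without the Cython sources at hand, the proposal can be approximated in plain Python. This is only a sketch built on the standard library's ElementTree rather than on lxml internals: the function name `set_element_value` and the two attribute constants are assumptions made for illustration, and the branch that replaces one element with another is left out.

```python
import xml.etree.ElementTree as ET

# Attribute names in Clark notation; the namespace URIs are assumed here
# for illustration and are not taken from the message above.
PYTYPE_ATTR = "{http://codespeak.net/lxml/objectify/pytype}pytype"
XSI_NIL_ATTR = "{http://www.w3.org/2001/XMLSchema-instance}nil"


def set_element_value(element, value):
    """Store value as text and record type(value).__name__ as py:pytype."""
    if value is None:
        element.set(XSI_NIL_ATTR, "true")
        return
    # A real value replaces any earlier xsi:nil marker.
    element.attrib.pop(XSI_NIL_ATTR, None)
    if isinstance(value, str):
        pytype_name = "str"
    else:
        pytype_name = type(value).__name__
        # Booleans are written in lowercase; everything else goes through str().
        value = str(value).lower() if isinstance(value, bool) else str(value)
    element.set(PYTYPE_ATTR, pytype_name)
    element.text = value


flag = ET.Element("flag")
set_element_value(flag, True)   # text "true", hint "bool"
three = ET.Element("x")
set_element_value(three, "3")   # text "3", hint "str"
```

The point of the hint is visible in the second call: the text "3" alone would look like a number, but the recorded "str" keeps the information that a string was assigned.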
jholg@gmx.de wrote:
I discussed this with Stefan before and I'm anxious to know if this is the way to go (maybe as switchable behaviour), removing the need for a beast like the discussed PT() factory, as well as making type behaviour arguably more "straightforward", at the cost of auto-adding py:pytype attributes: [_setElementValue implementation that auto-adds type(RVAL).__name__ as py:pytype] I'm +1 for that.
Actually, you were the one who proposed it in the first place, so there's nothing to add to. :)
By making it switchable we could cater for those who don't care about the types that much but who do not want to see any non-explicitly created attributes.
I dislike the idea of adding a switch here. We already add pytype attributes in a couple of places, so people who do not like it will have to deannotate() their XML anyway (or not use objectify...).

I think that always adding a pytype will give us more predictable behaviour. On the other hand, we could just check if the pytype the type inference mechanism returns is the type of the value, and only add the attribute if that is not the case. What do you think? It would not work if you exchange annotated data with other machines that use different setups, but if you do that, you'd probably annotate everything by hand anyway.

Stefan
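The alternative floated here, annotating only when type inference would not give back the original type, might look roughly like the following. It is a toy model: `infer_type` is a hypothetical stand-in that knows only bool, int, float and str, not objectify's actual lookup mechanism, and `needs_annotation` is likewise an invented name.

```python
def infer_type(text):
    """Toy stand-in for type inference: bool, int, float, else str."""
    if text in ("true", "false"):
        return bool
    for candidate in (int, float):
        try:
            candidate(text)
            return candidate
        except ValueError:
            pass
    return str


def needs_annotation(value):
    """True if inference on the serialised text would not return type(value)."""
    if isinstance(value, bool):
        text = str(value).lower()
    else:
        text = str(value)
    return infer_type(text) is not type(value)


# The integer 3 round-trips through inference; the string "3" would not.
int_needs_hint = needs_annotation(3)
str_needs_hint = needs_annotation("3")
```

The result depends entirely on which types the inference step knows about, which is why this variant would break down when annotated data moves between machines with different setups.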
Hi,
I discussed this with Stefan before and I'm anxious to know if this is the way to go (maybe as switchable behaviour), removing the need for a beast like the discussed PT() factory, as well as making type behaviour arguably more "straightforward", at the cost of auto-adding py:pytype attributes: [_setElementValue implementation that auto-adds type(RVAL).__name__ as py:pytype] I'm +1 for that.
Actually, you were the one who proposed it in the first place, so there's nothing to add to. :)
Yes, but I admit I was unsure then if this muddies the API by making
    root = objectify.Element("root")
    root.x = "3"

behave differently from

    root = objectify.fromstring("""<root><x>3</x></root>""")
Kind of losing sort of a symmetry. But then again, we actually *do* have more information in the first case, namely the python type, so we should use it. Now I think that practicality beats purity here.
By making it switchable we could cater for those who don't care about the types that much but who do not want to see any non-explicitly created attributes.
I dislike the idea of adding a switch here. We already add pytype attributes in a couple of places, so people who do not like it will have to deannotate() their XML anyway (or not use objectify...).
Right, there's also TREE attributes and stuff.
I think that always adding a pytype will give us more predictable behaviour. On the other hand, we could just check if the pytype the type inference mechanism returns is the type of the value, and only add the attribute if that is not the case. What do you think? It would not work if you exchange annotated data with other machines that use different setups, but if you do that, you'd probably annotate everything by hand anyway.
I'd rather always add the pytype, then. I just think this is simpler. And if you want to exchange data with other machines, better xsiannotate() to fall back to XML standard types, or deannotate() and rely on type inference.

Holger
jholg@gmx.de wrote:
I admit I was unsure then if this muddies the API by making
    root = objectify.Element("root")
    root.x = "3"

behave differently from

    root = objectify.fromstring("""<root><x>3</x></root>""")
Kind of losing sort of a symmetry.
I can't see much of a symmetry there anyway. I'm more concerned about putting in "3" and getting back the number 3, than putting in "<value>3</value>" and getting back a number. The latter sounds natural to me.
I think that always adding a pytype will give us more predictable behaviour. On the other hand, we could just check if the pytype the type inference mechanism returns is the type of the value, and only add the attribute if that is not the case. What do you think? It would not work if you exchange annotated data with other machines that use different setups, but if you do that, you'd probably annotate everything by hand anyway.
I'd rather always add the pytype, then. I just think this is simpler. And if you want to exchange data with other machines, better xsiannotate() to fall back to XML standard types, or deannotate() and rely on type inference.
Sure. So be it. :) (... for lxml 2.0, that is)

Stefan
jholg@gmx.de wrote:
    root = objectify.Element("root")
    root.x = "3"

behave differently from

    root = objectify.fromstring("""<root><x>3</x></root>""")
Kind of losing sort of a symmetry.
What bothers me more (and where I do see a symmetry) is:
    >>> root = objectify.fromstring("<root><flag>true</flag></root>")

    # now
    >>> root.flag
    True
    >>> root.flag = "true"
    >>> root.flag
    True

    # then
    >>> root.flag
    True
    >>> root.flag = "true"
    >>> root.flag
    'true'
I'm not sure what to think about that. It would be wrong to special case it, but it kinda feels wrong the way it would work in the future... Stefan
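The difference between "now" and "then" comes down to whether a hint is present when the value is read back. A minimal sketch of the reading side, using ElementTree and an invented `read_value` helper that handles only booleans and strings (the attribute name and namespace URI are assumptions for this example):

```python
import xml.etree.ElementTree as ET

# Attribute name assumed for illustration.
PYTYPE_ATTR = "{http://codespeak.net/lxml/objectify/pytype}pytype"


def read_value(element):
    """Honour a py:pytype hint if present, otherwise guess from the text."""
    hint = element.get(PYTYPE_ATTR)
    text = element.text
    if hint == "str":
        return text
    if hint == "bool" or (hint is None and text in ("true", "false")):
        return text == "true"
    return text


# Parsed XML carries no hint, so the text "true" is inferred to be a boolean.
parsed = ET.fromstring("<root><flag>true</flag></root>").find("flag")

# An element assigned the string "true" would carry the hint "str".
assigned = ET.Element("flag")
assigned.text = "true"
assigned.set(PYTYPE_ATTR, "str")

parsed_result = read_value(parsed)
assigned_result = read_value(assigned)
```

Both elements hold the same text, yet the first reads back as a boolean and the second as a string, which is exactly the asymmetry in question.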
Hi Stefan,
jholg@gmx.de wrote:
    root = objectify.Element("root")
    root.x = "3"

behave differently from

    root = objectify.fromstring("""<root><x>3</x></root>""")
Kind of losing sort of a symmetry.
What bothers me more (and where I do see a symmetry) is:
    >>> root = objectify.fromstring("<root><flag>true</flag></root>")

    # now
    >>> root.flag
    True
    >>> root.flag = "true"
    >>> root.flag
    True

    # then
    >>> root.flag
    True
    >>> root.flag = "true"
    >>> root.flag
    'true'
I'm not sure what to think about that. It would be wrong to special case it, but it kinda feels wrong the way it would work in the future...
Hm, not for me (any more :). I think this is just the same case as having a literal 3 in the XML document. When parsing XML from a string or a file with no type information whatsoever, there are really only two things we can do:

1. Make strings of everything.
2. Use the type inference provided by the lookup mechanisms.

(1) does not make much sense, as we would not really need objectify at all (except for the syntactic sugar of its __setattr__ API). On the other hand, when setting elements by hand, i.e. in Python code, we know the (Python) type information well. For me, it begins to feel more natural to do:
    # then
    >>> root.flag = True  # real live python boolean object
    >>> root.flag
    True
    >>> root.flag.text
    "true"

instead of

    # now
    >>> root.flag = "true"
    >>> root.flag
    True

which is, in the end, pretty much the same as

    # now
    >>> root.three = "3"
    >>> root.three
    3
So, let's go for the auto-pytype-addition in _setElementValue, without special-casing, imo.

Holger
Hi Holger, jholg@gmx.de wrote:
For me, it begins to rather feel more natural to do:
    # then
    >>> root.flag = True  # real live python boolean object
    >>> root.flag
    True
    >>> root.flag.text
    "true"

instead of

    # now
    >>> root.flag = "true"
    >>> root.flag
    True

which is, in the end, pretty much the same as

    # now
    >>> root.three = "3"
    >>> root.three
    3
So, let's go for the auto-pytype-addition in _setElementValue, without special-casing, imo.
Fine, no special casing here.

One more thing, though: we shouldn't store Python type hints that were not registered, as their instantiation wouldn't work anyway. So I added a lookup before the attribute setter call.

So, the new rules are:

- what you put in comes back out (as long as the type is registered)
- for non-annotated XML data, type inference is used to determine the return type (which may be ambiguous in some cases).

Simple enough, I'd say.

Stefan
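The lookup before the attribute setter could be pictured like this. The sketch again uses ElementTree; `REGISTERED_TYPES` is a hypothetical stand-in for the real type registry, `set_value` is an invented name, and dropping a stale hint for unregistered types is an assumption of this example rather than something stated in the thread.

```python
import xml.etree.ElementTree as ET

# Attribute name assumed for illustration.
PYTYPE_ATTR = "{http://codespeak.net/lxml/objectify/pytype}pytype"

# Hypothetical registry of type names that can be instantiated on reading.
REGISTERED_TYPES = {"str", "int", "float", "bool"}


def set_value(element, value):
    """Store value as text; record a py:pytype hint only for registered types."""
    if isinstance(value, str):
        name, text = "str", value
    elif isinstance(value, bool):
        name, text = "bool", str(value).lower()
    else:
        name, text = type(value).__name__, str(value)
    if name in REGISTERED_TYPES:
        element.set(PYTYPE_ATTR, name)
    else:
        # Assumption: an unregistered type leaves no hint behind, so reading
        # falls back to type inference on the text.
        element.attrib.pop(PYTYPE_ATTR, None)
    element.text = text


known = ET.Element("v")
set_value(known, 7)         # "int" is registered, so the hint is stored
unknown = ET.Element("v")
set_value(unknown, 2 + 3j)  # "complex" is not registered, so no hint
```

With this check in place, a stored hint always names a type that can be looked up again on reading, which is what makes the first rule hold.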
participants (2)
- jholg@gmx.de
- Stefan Behnel