Is there a way of getting around endless repetitions of "is not None"?
This may be a primitive Python question rather than an lxml question, but I'll ask it anyhow.

If I want to find out whether the attribute of some element has a particular value, I can't just say

    if element.get('rend') == 'hi':

This may throw an "is None" exception. I can get around it by saying

    if element.get('rend') is not None and element.get('rend') == 'hi':

Is there some way of getting around these tedious "is not None" checks, or do I always have to spell them out? I know that computers are very literal animals.
On Thu, 22 Dec 2016 02:20:52 +0000
Martin Mueller
> This may be a primitive Python question rather than an lxml question, but I'll ask it anyhow.
> If I want to find out whether the attribute of some element has a particular value, I can't just say
>     if element.get('rend') == 'hi':
> This may throw an "is None" exception. I can get around it by saying
>     if element.get('rend') is not None and element.get('rend') == 'hi':
> Is there some way of getting around these tedious "is not None" explicitations or do I always have to say it. I know that computers are very literal animals.
What's the problem with using a dead-simple helper?

    def attr_of(el, name):
        v = el.get(name)
        if v is None:
            return ''
        return v
    ...
    if attr_of(element, 'rend') == 'hi':
        pass

or

    def attr_is(el, name, val):
        v = el.get(name)
        if v is None:
            return False
        return v == val
    ...
    if attr_is(element, 'rend', 'hi'):
        pass

I'm afraid you might have that OO-instilled fear of using non-methods on objects ;-) Fear not -- they are just fine.

Alternatively, you can use the still-supported 'attrib' dict-like object to access attributes on elements: it will throw a KeyError exception at you when you try to access an attribute which does not exist:

    if element.attrib['rend'] == 'hi':
        pass  # might blow up at you

Well, there's also another approach which is typically used (I think) to parse complex XML documents: custom Python XML element classes. They are detailed in [1], but the basic idea is that you write a set of classes representing elements in the documents you parse, make an instance of lxml.etree.XMLParser(), arm it with the knowledge of how to look your custom classes up, and then in those classes you make properties representing your attributes and child elements of interest, like in

    class Foo(etree.ElementBase):
        @property
        def rend(self):
            v = self.get('rend')
            if v is None:
                raise ValueError('Attribute not found: rend')
            return v

And then after parsing, your "element" variable will have type Foo (supposing you arranged for this properly), and you could just do

    if element.rend == 'hi':
        pass  # Will blow up with ValueError if 'rend' is missing

Of course, in a real-world implementation you'd use some generic code to access attributes and throw exceptions when they are not found -- to reduce the necessary boilerplate in the implementation.

P.S. Well, I'm personally dreaming of something like [2] for this lxml facility, so I could declare the necessary properties just by spelling out their names, types (element vs attribute) and whether they must exist or not.

1. http://lxml.de/element_classes.html
2. https://attrs.readthedocs.io/en/stable/
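The "generic code" mentioned above could, for instance, be a small descriptor that is declared once per attribute. The following is only a sketch under assumptions: the name RequiredAttr is made up for illustration, and the standard library's xml.etree.ElementTree.Element stands in for lxml's etree.ElementBase so that the example runs without lxml installed. With lxml you would derive from etree.ElementBase and register the class with the parser as described in [1].

```python
import xml.etree.ElementTree as ET


class RequiredAttr:
    """Hypothetical descriptor: read an XML attribute, raise if it is missing."""

    def __init__(self, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Accessed on the class itself, not on an instance.
            return self
        value = obj.get(self.name)
        if value is None:
            raise ValueError('Attribute not found: ' + self.name)
        return value


class Foo(ET.Element):  # stand-in for a class derived from etree.ElementBase
    rend = RequiredAttr('rend')


with_attr = Foo('hi', {'rend': 'hi'})
without_attr = Foo('hi')

print(with_attr.rend == 'hi')  # attribute present, plain comparison works

try:
    without_attr.rend
except ValueError as exc:
    print(exc)  # attribute missing, the descriptor raises
```

Each further attribute then costs one line in the class body instead of a whole property.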
Hi Martin,
> If I want to find out whether the attribute of some element has a particular value, I can't just say
>     if element.get('rend') == 'hi':
> This may throw an "is None" exception. I can get around it by saying
>     if element.get('rend') is not None and element.get('rend') == 'hi':
I don't follow. How does this raise an exception?
>>> from lxml import etree
>>> root = etree.Element('root')
>>> print(etree.tostring(root, pretty_print=True, encoding='unicode'))
<root/>

>>> root.get('root_does_not_have_this_attribute') == 'hi'
False
Maybe I misunderstand the question but the get() method actually returns None as a default value if an element doesn't actually carry the attribute of interest. You can also provide a custom default value if need be, just like for {}.get():
>>> root.get('root_does_not_have_this_attribute', 'the default You get')
'the default You get'
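To make the point concrete, here is a minimal sketch of both behaviours. It uses the standard library's xml.etree.ElementTree rather than lxml, purely so that it runs without lxml installed; the assumption is that get() behaves the same way in both, as described above. The variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

root = ET.Element('root')          # carries no attributes at all
hi = ET.Element('hi', rend='hi')   # carries rend="hi"

# A missing attribute yields None, and None == 'hi' is simply False.
missing_matches = root.get('rend') == 'hi'
present_matches = hi.get('rend') == 'hi'

# A default of your choosing replaces None, which helps when you want
# to call string methods on the result without a None check.
starts_with_h = root.get('rend', '').startswith('h')

print(missing_matches, present_matches, starts_with_h)
```

The None check only becomes necessary when you call a method or read an attribute on the returned value, not when you merely compare it.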
Holger

Landesbank Baden-Wuerttemberg
Anstalt des oeffentlichen Rechts
Hauptsitze: Stuttgart, Karlsruhe, Mannheim, Mainz
HRA 12704
Amtsgericht Stuttgart
Here is a real example:
if c.getnext().tag == tei + 'c':
AttributeError: 'NoneType' object has no attribute 'tag'
I fix it by saying
if c.getnext() is not None and c.getnext().tag == tei + 'c':
My question is whether there always has to be an explicit assertion that the element in question “is not None”.
On 12/22/16, 2:04 AM, "lxml on behalf of Holger Joukl"
> Here is a real example:
>     if c.getnext().tag == tei + 'c':
>     AttributeError: 'NoneType' object has no attribute 'tag'
> I fix it by saying
>     if c.getnext() is not None and c.getnext().tag == tei + 'c':
> My question is whether there always has to be an explicit assertion that the element in question "is not None".
I see. It's not about the .get() but the .getnext() method. I'd probably simply do
    next_elem = c.getnext()
    if next_elem is not None:
        if next_elem.tag == tei + 'c':
            # ...
or wrap that into a little helper function like Konstantin suggested (if you don't fear performance penalties of the function call, but that's entirely up to your use case and data size). You could also do
    # rather ask for forgiveness than for permission
    try:
        next_tag = c.getnext().tag
    except AttributeError:
        next_tag = None
    if next_tag == tei + 'c':
        # ...
(which I don't like very much in this case because it's so verbose here, unless your 'if' code path is really short and you can put it into the try-except completely without risking catching other potential AttributeErrors) or

    if getattr(c.getnext(), 'tag', None) == tei + 'c':
        ...

Holger
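For comparison, the helper-function variant and a one-line variant using an assignment expression (Python 3.8 or later) might look like the sketch below. Several things here are assumptions for illustration only: the Node class is a hypothetical stand-in for an lxml element so that the example runs without lxml, next_has_tag is a made-up helper name, and the value of tei is assumed to be the TEI namespace in Clark notation.

```python
class Node:
    """Hypothetical stand-in for an lxml element: has .tag and .getnext()."""

    def __init__(self, tag, following=None):
        self.tag = tag
        self._following = following

    def getnext(self):
        # Like lxml's getnext(): the following sibling, or None.
        return self._following


def next_has_tag(el, tag):
    """Return True if el has a following sibling whose tag equals tag."""
    nxt = el.getnext()
    return nxt is not None and nxt.tag == tag


tei = '{http://www.tei-c.org/ns/1.0}'  # assumed value of the 'tei' prefix
last = Node(tei + 'c')
first = Node(tei + 'w', last)

print(next_has_tag(first, tei + 'c'))  # 'first' is followed by a 'c' element
print(next_has_tag(last, tei + 'c'))   # 'last' has no following sibling

# Python 3.8+: call getnext() once and test the result in the same condition.
matched = False
if (nxt := first.getnext()) is not None and nxt.tag == tei + 'c':
    matched = True
print(matched)
```

Both variants call getnext() only once, unlike the original "is not None and ..." condition, which calls it twice.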
Thank you, Holger and Konstantin, for the very useful advice.
On 12/22/16, 10:01 AM, "lxml on behalf of Holger Joukl"
participants (3)
- Holger Joukl
- Konstantin Khomoutov
- Martin Mueller