On Thu, 22 Dec 2016 02:20:52 +0000 Martin Mueller <martinmueller@northwestern.edu> wrote:
This may be a primitive Python question rather than an lxml question, but I’ll ask it anyhow.
If I want to find out whether the attribute of some element has a particular value, I can’t just say
If element.get(‘rend’) == ‘hi ‘
This may throw an “is None” exception. I can get around it by saying
If element.get(‘rend’) is not None and element.get(‘rend’) == ‘hi’
Is there some way of getting around these tedious “is not None” explicitations or do I always have to say it. I know that computers are very literal animals.
What's the problem of using a dead-simple helper? def attr_of(el, name): v = el.get(name) if v is None: return '' ... if attr_of(element, 'rend') == 'hi': pass or def attr_is(el, name, val): v = el.get(name) if v is None: return False return v == val ... if attr_is(element, 'rend', 'hi'): pass I'm afraid you might have that OO-instilled fear of using non-methods on objects ;-) Fear not -- they are just fine. Alternatively you can use still supported 'attrib' dict-like object to access attributes on elements: it will throw a KeyError exception exception at you when you try to acces an attribute which does not exist: if element.attrib['rend'] == 'hi': pass # might blow up at you Well, there's also another approach which is typically used (I think) to parse complex XML documents: custom Python XML element classes. They are detailed in [1] but the basic idea is that you write a set of classes representing elements in the documents you parse, make an instance of lxml.etree.XMLParser(), arm it with the knowledge of how to look your custom classes up and then in those classes you make properties representing your attributes and child elements of interest, like in class Foo(etree.ElementBase): @property def rend(self): v = self.get('rend') if v is None: raise ValueError('Attribute not found: rend') return v And then after parsing your "element" variable will have (if you supposedly arranger for this properly) type Foo, and you could just do if element.rend == 'hi': pass # Will blow up with ValueError Of course, in a real-world implementation you'd use some generic code to access attributes and throw exceptions when they are not found -- to reduce the necessary boilerplate in the implementation. P.S. Well, I'm personally dreaming of something like [2] for this lxml's facility so I could just declare the necessary properties by just spelling their names, types (element vs attribute) and whether they must exist or not. 1. http://lxml.de/element_classes.html 2. https://attrs.readthedocs.io/en/stable/