lxml - The Python XML Toolkit

is there a way of getting around endless repetitions of "is not None" ?


Martin Mueller

22 Dec 2016

7:50 a.m.

This may be a primitive Python question rather than an lxml question, but I’ll ask it anyhow.

If I want to find out whether the attribute of some element has a particular value, I can’t just say

    if element.get('rend') == 'hi'

This may throw an “is None” exception. I can get around it by saying

    if element.get('rend') is not None and element.get('rend') == 'hi'

Is there some way of getting around these tedious “is not None” checks, or do I always have to spell them out? I know that computers are very literal animals.


Konstantin Khomoutov

22 Dec 2016

11:52 a.m.

On Thu, 22 Dec 2016 02:20:52 +0000 Martin Mueller wrote:

> This may be a primitive Python question rather than an lxml question, but I’ll ask it anyhow.
>
> If I want to find out whether the attribute of some element has a particular value, I can’t just say
>
>     if element.get('rend') == 'hi'
>
> This may throw an “is None” exception. I can get around it by saying
>
>     if element.get('rend') is not None and element.get('rend') == 'hi'
>
> Is there some way of getting around these tedious “is not None” explicitations or do I always have to say it. I know that computers are very literal animals.

What's the problem with using a dead-simple helper?

    def attr_of(el, name):
        v = el.get(name)
        if v is None:
            return ''
        return v

    ...

    if attr_of(element, 'rend') == 'hi':
        pass

or

    def attr_is(el, name, val):
        v = el.get(name)
        if v is None:
            return False
        return v == val

    ...

    if attr_is(element, 'rend', 'hi'):
        pass

I'm afraid you might have that OO-instilled fear of using non-methods on objects ;-) Fear not -- they are just fine.

Alternatively, you can use the still-supported 'attrib' dict-like object to access attributes on elements: it will throw a KeyError exception at you when you try to access an attribute which does not exist:

    if element.attrib['rend'] == 'hi':
        pass  # might blow up at you

Well, there's also another approach which is typically used (I think) to parse complex XML documents: custom Python XML element classes. They are detailed in [1], but the basic idea is that you write a set of classes representing elements in the documents you parse, make an instance of lxml.etree.XMLParser(), arm it with the knowledge of how to look your custom classes up, and then in those classes you make properties representing your attributes and child elements of interest, like in

    class Foo(etree.ElementBase):
        @property
        def rend(self):
            v = self.get('rend')
            if v is None:
                raise ValueError('Attribute not found: rend')
            return v

And then after parsing, your "element" variable will have (if you arranged for this properly) type Foo, and you could just do

    if element.rend == 'hi':
        pass  # Will blow up with ValueError if 'rend' is missing

Of course, in a real-world implementation you'd use some generic code to access attributes and throw exceptions when they are not found -- to reduce the necessary boilerplate in the implementation.

P.S. Well, I'm personally dreaming of something like [2] for this lxml facility, so I could declare the necessary properties by just spelling out their names, types (element vs attribute) and whether they must exist or not.

1. http://lxml.de/element_classes.html
2. https://attrs.readthedocs.io/en/stable/
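The "generic code" idea from the message above could look roughly like the sketch below. It is only an illustration, not something taken from the lxml documentation: `required_attr` is a hypothetical helper name, the sample document is made up, and `ElementDefaultClassLookup` is just one of the lookup schemes described in [1], chosen here because it is the shortest to set up.

```python
from lxml import etree


def required_attr(name):
    """Build a read-only property that raises if the XML attribute is absent.

    Hypothetical helper, not part of lxml.
    """
    def getter(self):
        value = self.get(name)
        if value is None:
            raise ValueError('Attribute not found: ' + name)
        return value
    return property(getter)


class Foo(etree.ElementBase):
    # One line per attribute instead of a hand-written property each time.
    rend = required_attr('rend')


# Tell the parser to instantiate Foo for every element it creates.
parser = etree.XMLParser()
parser.set_element_class_lookup(etree.ElementDefaultClassLookup(element=Foo))

# Made-up sample document: the first child has 'rend', the second does not.
root = etree.fromstring('<doc><hi rend="hi"/><plain/></doc>', parser)
first, second = root

first_rend = first.rend          # attribute present, returned as a string

try:
    second.rend                  # attribute absent, property raises
    missing_raised = False
except ValueError:
    missing_raised = True
```

With this arrangement the "is not None" test lives in exactly one place, and a missing attribute surfaces as a ValueError naming the attribute rather than as a silent None.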

Holger Joukl

1:34 p.m.

Hi Martin,

> If I want to find out whether the attribute of some element has a particular value, I can’t just say
>
>     if element.get('rend') == 'hi'
>
> This may throw an “is None” exception. I can get around it by saying
>
>     if element.get('rend') is not None and element.get('rend') == 'hi'

I don't follow. How does this raise an exception?

>>> from lxml import etree
>>> root = etree.Element('root')
>>> print etree.tostring(root, pretty_print=True)
<root/>

>>> root.get('root_does_not_have_this_attribute') == 'hi'
False

Maybe I misunderstand the question but the get() method actually returns None as a default value if an element doesn't actually carry the attribute of interest. You can also provide a custom default value if need be, just like for {}.get():

>>> root.get('root_does_not_have_this_attribute', 'the default You get')
'the default You get'

Holger Landesbank Baden-Wuerttemberg Anstalt des oeffentlichen Rechts Hauptsitze: Stuttgart, Karlsruhe, Mannheim, Mainz HRA 12704 Amtsgericht Stuttgart
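To make the distinction concrete, here is a small sketch of which operations on a missing attribute are harmless and which are not. It uses the standard library's xml.etree.ElementTree as a stand-in for lxml.etree, on the assumption that get() behaves the same way in both; the attribute name 'rend' and the default 'plain' are only illustrative.

```python
import xml.etree.ElementTree as ET  # stand-in for lxml.etree (assumption: same get() semantics)

root = ET.Element('root')

# get() on a missing attribute returns None rather than raising.
missing = root.get('rend')

# Comparing None with a string is an ordinary comparison: False, no exception.
equals_hi = root.get('rend') == 'hi'

# A second argument replaces None as the fallback value.
with_default = root.get('rend', 'plain')

# What does raise is attribute access on the None itself.
try:
    missing.tag
    raised = False
except AttributeError:
    raised = True
```

So an equality test against the result of get() needs no guard; a guard only becomes necessary once an attribute or method is looked up on a value that may be None.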

Martin Mueller

8:47 p.m.

Here is a real example:

    if c.getnext().tag == tei + 'c':
    AttributeError: 'NoneType' object has no attribute 'tag'

I fix it by saying

    if c.getnext() is not None and c.getnext().tag == tei + 'c':

My question is whether there always has to be an explicit assertion that the element in question “is not None”.

Holger Joukl

9:31 p.m.

> Here is a real example:
>
>     if c.getnext().tag == tei + 'c':
>     AttributeError: 'NoneType' object has no attribute 'tag'
>
> I fix it by saying
>
>     if c.getnext() is not None and c.getnext().tag == tei + 'c':
>
> My question is whether there always has to be an explicit assertion that the element in question “is not None”.

I see. It's not about the .get() but the .getnext() method. I'd probably simply do

    next_elem = c.getnext()
    if next_elem is not None:
        if next_elem.tag == tei + 'c':
            # ...

or wrap that into a little helper function like Konstantin suggested (if you don't fear performance penalties of the function call, but that's entirely up to your use case and data size). You could also do

    # rather ask for forgiveness than for permission
    try:
        next_tag = c.getnext().tag
    except AttributeError:
        next_tag = None
    if next_tag == tei + 'c':
        # ...

(which I don't like very much in this case because it's so verbose here, unless your 'if'-code path is really short and you can put it into the try-except completely without risking to catch other potential AttributeErrors) or

    if getattr(c.getnext(), 'tag', None) == tei + 'c':
        ...

Holger
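The helper-function route mentioned above might be sketched as follows. The names are illustrative only: `next_tag` is a hypothetical helper, the sample document is made up, and the value of `tei` is an assumption (the TEI namespace in Clark notation), since the thread never shows how it was defined. The last part uses an assignment expression, which needs Python 3.8 or later and so was not available when this thread was written.

```python
from lxml import etree

# Assumption: `tei` in the thread is the TEI namespace in Clark notation.
tei = '{http://www.tei-c.org/ns/1.0}'


def next_tag(el):
    """Return the tag of the next sibling, or None if el is the last sibling."""
    nxt = el.getnext()
    return None if nxt is None else nxt.tag


# Made-up sample: two <c> elements followed by a <pc>.
root = etree.fromstring(
    '<w xmlns="http://www.tei-c.org/ns/1.0"><c/><c/><pc/></w>')
first, second, last = root

# The None check is hidden in the helper; the comparison itself is safe.
results = [next_tag(child) == tei + 'c' for child in root]

# Python 3.8+: bind and test in one condition, calling getnext() only once.
matches = []
for child in root:
    if (nxt := child.getnext()) is not None and nxt.tag == tei + 'c':
        matches.append(child)
```

Both variants call getnext() once per element, unlike the original fix, which calls it twice. Neither removes the None check; they only move it to a single place.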

Martin Mueller

9:37 p.m.

Thank you, Holger and Konstantin, for very useful advice.
