[Tutor] XML parsing

Thu Mar 29 15:41:46 EDT 2018

Asif Iqbal wrote:

> On Thu, Mar 29, 2018 at 3:56 AM, Peter Otten <__peter__ at web.de> wrote:
> 
>> Asif Iqbal wrote:
>>
>> > I am trying to extract all the *template-name*s, but no success yet
>> >
>> > Here is a sample xml file
>> >
>> > <collection xmlns:y="http://tail-f.com/ns/rest">
>> >   <template-metadata xmlns="http://networks.com/nms">
>> >     <template-name>ALLFLEX-BLOOMINGTON</template-name>
>> >     <type>post-staging</type>
>> >     <device-type>full-mesh</device-type>
>> >     <provider-tenant>ALLFLEX</provider-tenant>
>> >     <subscription xmlns="http://networks.com/nms">
>> >       <solution-tier>advanced-plus</solution-tier>
>> >       <bandwidth>1000</bandwidth>
>> >       <is-analytics-enabled>true</is-analytics-enabled>
>> >       <is-primary>true</is-primary>
>> >     </subscription>
>> > ....
>> > </collection>
>> >
>> > with open('/tmp/template-metadata') as f:
>> >     import xml.etree.ElementTree as ET
>> >     root = ET.fromstring(f.read())
>> >
>> > print len(root)
>> > print root[0][0].text
>> > for l in root.findall('template-metadata'):
>> >     print l
>> >
>> >
>> > 392
>> > ALLFLEX-BLOOMINGTON
>> >
>> >
>> > It prints the length of the tree and the first element of the first
>> > child,
>> > but when I try to loop through to find all the 'template-name's
>> > it does not print anything.
>> >
>> > What am I doing wrong?
>>
>> You have to include the namespace:
>>
>> for l in root.findall('{http://networks.com/nms}template-metadata'):
>>
> 
> How do I extract the 'template-name' ?

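I hoped you'd get the idea: the xmlns="http://networks.com/nms" declaration
on template-metadata is inherited by every element inside it, so each of
those tags needs the namespace as well.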

> This is what I tried
> 
>  for l in root.findall('{http://networks.com/nms}template-metadata'):

Rinse and repeat:

>     print l.find('template-name').text

should be

    print l.find('{http://networks.com/nms}template-name').text
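
Put together, a minimal sketch of the fix might look like this. It uses a
trimmed copy of the sample document from the original post; the NMS constant
and the names list are just illustrative helpers, and print() is written so
the snippet runs under both Python 2.7 and Python 3.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document from the original post.
XML = '''<collection xmlns:y="http://tail-f.com/ns/rest">
  <template-metadata xmlns="http://networks.com/nms">
    <template-name>ALLFLEX-BLOOMINGTON</template-name>
    <type>post-staging</type>
  </template-metadata>
</collection>'''

# Illustrative helper: the namespace in ElementTree's "{uri}" notation.
NMS = '{http://networks.com/nms}'

root = ET.fromstring(XML)

# Both the parent and the child tag need the namespace.
names = [meta.find(NMS + 'template-name').text
         for meta in root.findall(NMS + 'template-metadata')]
print(names)
```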

> 
> I am following the doc
> https://docs.python.org/2/library/xml.etree.elementtree.html section
> 19.7.1.3 findall example
> 
> I get this error: AttributeError: 'NoneType' object has no attribute 'text'.
> I do not understand why l.find('template-name') is None.

Take the time to read

https://docs.python.org/2/library/xml.etree.elementtree.html#parsing-xml-with-namespaces
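
The approach described in that section is to pass a prefix-to-URI mapping to
find() and findall() instead of spelling out the full "{uri}" each time. A
rough sketch, again on a trimmed copy of your sample; the prefix 'nms' is an
arbitrary choice of mine and does not have to appear in the XML itself:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document from the original post.
XML = '''<collection xmlns:y="http://tail-f.com/ns/rest">
  <template-metadata xmlns="http://networks.com/nms">
    <template-name>ALLFLEX-BLOOMINGTON</template-name>
    <subscription xmlns="http://networks.com/nms">
      <bandwidth>1000</bandwidth>
    </subscription>
  </template-metadata>
</collection>'''

# The prefix is local to this dict; only the URI has to match the document.
ns = {'nms': 'http://networks.com/nms'}

root = ET.fromstring(XML)

names = []
bandwidths = []
for meta in root.findall('nms:template-metadata', ns):
    names.append(meta.find('nms:template-name', ns).text)
    # Every step of a path needs the prefix, not just the first one.
    bandwidths.append(meta.find('nms:subscription/nms:bandwidth', ns).text)

print(names)
print(bandwidths)
```

The dict keeps the URI in one place, which helps once you start pulling out
more than one element.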

> Here is complete code with output.
> 
> 
> import xml.etree.ElementTree as ET
> 
> xmlfile='''
> <collection xmlns:y="http://tail-f.com/ns/rest">
>   <template-metadata xmlns="http://networks.com/nms">
>     <template-name>ALLFLEX-BLOOMINGTON</template-name>
>     <type>post-staging</type>
>     <device-type>full-mesh</device-type>
>     <provider-tenant>ALLFLEX</provider-tenant>
>     <subscription xmlns="http://networks.com/nms">
>       <solution-tier>advanced-plus</solution-tier>
>       <bandwidth>1000</bandwidth>
>       <is-analytics-enabled>true</is-analytics-enabled>
>       <is-primary>true</is-primary>
>     </subscription></template-metadata></collection>'''
> 
> root = ET.fromstring(xmlfile)
> print root.tag
> print root[0][0].text
> for l in root.findall('{http://networks.com/nms}template-metadata'):
>     print l.find('template-name').text
> 
> collection
> ALLFLEX-BLOOMINGTON
> 
> 
> ---------------------------------------------------------------------------
> AttributeError                            Traceback (most recent call last)
> <ipython-input-18-73bd6770766a> in <module>()
>      19 print root[0][0].text
>      20 for l in root.findall('{http://networks.com/nms}template-metadata'):
> ---> 21     print l.find('template-name').text
>
> AttributeError: 'NoneType' object has no attribute 'text'
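
If you want to see for yourself why find() comes back with None, one way is
to print the tag names ElementTree actually stores. This is only a
diagnostic sketch on a trimmed copy of your sample; the variable names are
mine.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document from the original post.
XML = '''<collection xmlns:y="http://tail-f.com/ns/rest">
  <template-metadata xmlns="http://networks.com/nms">
    <template-name>ALLFLEX-BLOOMINGTON</template-name>
    <type>post-staging</type>
  </template-metadata>
</collection>'''

root = ET.fromstring(XML)

# iter() walks the whole tree; namespaced tags show up as "{uri}name".
tags = [elem.tag for elem in root.iter()]
for tag in tags:
    print(tag)

# No element is stored under the bare name, so find() returns None
# and the .text access is what raises AttributeError.
missing = root[0].find('template-name')
print(missing)
```

Only the root element has a bare tag name; everything below
template-metadata carries the namespace.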