[Tutor] XML parsing

Thu Mar 29 21:40:18 EDT 2018

On Thu, Mar 29, 2018 at 3:41 PM, Peter Otten <__peter__ at web.de> wrote:

> Asif Iqbal wrote:
>
> > On Thu, Mar 29, 2018 at 3:56 AM, Peter Otten <__peter__ at web.de> wrote:
> >
> >> Asif Iqbal wrote:
> >>
> >> > I am trying to extract all the *template-name*s, but no success yet
> >> >
> >> > Here is a sample xml file
> >> >
> >> > <collection xmlns:y="http://tail-f.com/ns/rest">
> >> >   <template-metadata xmlns="http://networks.com/nms">
> >> >     <template-name>ALLFLEX-BLOOMINGTON</template-name>
> >> >     <type>post-staging</type>
> >> >     <device-type>full-mesh</device-type>
> >> >     <provider-tenant>ALLFLEX</provider-tenant>
> >> >     <subscription xmlns="http://networks.com/nms">
> >> >       <solution-tier>advanced-plus</solution-tier>
> >> >       <bandwidth>1000</bandwidth>
> >> >       <is-analytics-enabled>true</is-analytics-enabled>
> >> >       <is-primary>true</is-primary>
> >> >     </subscription>
> >> > ....
> >> > </collection>
> >> >
> >> > with open('/tmp/template-metadata') as f:
> >> >     import xml.etree.ElementTree as ET
> >> >     root = ET.fromstring(f.read())
> >> >
> >> > print len(root)
> >> > print root[0][0].text
> >> > for l in root.findall('template-metadata'):
> >> >     print l
> >> >
> >> >
> >> > 392
> >> > ALLFLEX-BLOOMINGTON
> >> >
> >> >
> >> > It prints the length of the tree and the first element of the first
> >> > child,
> >> > but when I try to loop through to find all the 'template-name's
> >> > it does not print anything.
> >> >
> >> > What am I doing wrong?
> >>
> >> You have to include the namespace:
> >>
> >> for l in root.findall('{http://networks.com/nms}template-metadata'):
> >>
> >
> > How do I extract the 'template-name' ?
>
> I hoped you'd get the idea.
>
> > This is what I tried
> >
> >  for l in root.findall('{http://networks.com/nms}template-metadata'):
>
> Rinse and repeat:
>
> >     print l.find('template-name').text
>
> should be
>
>     print l.find('{http://networks.com/nms}template-name').text
>
> >
> > I am following the doc
> > https://docs.python.org/2/library/xml.etree.elementtree.html section
> > 19.7.1.3 findall example
> >
> > I get this error: AttributeError: 'NoneType' object has no attribute 'text'.
> > I do not understand why l.find('template-name') returns None.
>
> Take the time to read
>
> https://docs.python.org/2/library/xml.etree.elementtree.html#parsing-xml-with-namespaces

Thanks for the links and hints.

I got it working now

I used ns = {'nms': 'http://networks.com/nms'}

And then l.find('nms:template-name', ns)
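Put together, the namespace-map approach looks roughly like the sketch below. It is written for Python 3 (print as a function) and assumes a trimmed copy of the sample document from this thread; the prefix 'nms' is an arbitrary local label, and the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample from this thread (assumption: one record only).
XML = """\
<collection xmlns:y="http://tail-f.com/ns/rest">
  <template-metadata xmlns="http://networks.com/nms">
    <template-name>ALLFLEX-BLOOMINGTON</template-name>
    <type>post-staging</type>
  </template-metadata>
</collection>"""

# The prefix is a local alias; only the URI has to match the document.
ns = {'nms': 'http://networks.com/nms'}

root = ET.fromstring(XML)

# Both the outer and the inner lookup need the namespace, because the
# default xmlns on template-metadata is inherited by its children.
names = [
    meta.find('nms:template-name', ns).text
    for meta in root.findall('nms:template-metadata', ns)
]
print(names)
```

The same ns dictionary can be passed to find(), findall() and findtext(), so the long '{uri}tag' form never has to be spelled out by hand.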

I also want to extract the namespace URI, and I see that this gets it for me:

      str(root[0]).split('{')[1].split('}')[0]

Is there a better way to extract the namespace?
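One option, offered as a sketch rather than the definitive way: an element's tag attribute already holds the qualified name as a '{uri}localname' string, so it can be split directly instead of going through str(root[0]). The helper name split_tag is hypothetical and not part of ElementTree; the sample is a trimmed copy of the document from this thread, and the code is written for Python 3.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample from this thread (assumption: one record only).
XML = """\
<collection xmlns:y="http://tail-f.com/ns/rest">
  <template-metadata xmlns="http://networks.com/nms">
    <template-name>ALLFLEX-BLOOMINGTON</template-name>
  </template-metadata>
</collection>"""


def split_tag(tag):
    """Split '{uri}local' into (uri, local).

    Hypothetical helper; returns (None, tag) when the tag has no namespace.
    """
    if tag.startswith('{'):
        uri, local = tag[1:].split('}', 1)
        return uri, local
    return None, tag


root = ET.fromstring(XML)

# root[0] is the first template-metadata element, which is namespaced.
uri, local = split_tag(root[0].tag)
print(uri)
print(local)

# The root element itself carries no namespace, only a prefix declaration.
print(split_tag(root.tag))
```

Working on .tag avoids depending on the repr of an Element, which is meant for debugging and is not guaranteed to keep its format.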

>
>
> > Here is complete code with output.
> >
> >
> > import xml.etree.ElementTree as ET
> >
> > xmlfile='''
> > <collection xmlns:y="http://tail-f.com/ns/rest">
> >   <template-metadata xmlns="http://networks.com/nms">
> >     <template-name>ALLFLEX-BLOOMINGTON</template-name>
> >     <type>post-staging</type>
> >     <device-type>full-mesh</device-type>
> >     <provider-tenant>ALLFLEX</provider-tenant>
> >     <subscription xmlns="http://networks.com/nms">
> >       <solution-tier>advanced-plus</solution-tier>
> >       <bandwidth>1000</bandwidth>
> >       <is-analytics-enabled>true</is-analytics-enabled>
> >       <is-primary>true</is-primary>
> >     </subscription></template-metadata></collection>'''
> >
> > root = ET.fromstring(xmlfile)
> > print root.tag
> > print root[0][0].text
> > for l in root.findall('{http://networks.com/nms}template-metadata'):
> >     print l.find('template-name').text
> >
> > collection
> > ALLFLEX-BLOOMINGTON
> >
> >
> > ---------------------------------------------------------------------------
> > AttributeError                            Traceback (most recent call last)
> > <ipython-input-18-73bd6770766a> in <module>()
> >      19 print root[0][0].text
> >      20 for l in root.findall('{http://networks.com/nms}template-metadata'):
> > ---> 21     print l.find('template-name').text
> >
> > AttributeError: 'NoneType' object has no attribute 'text'
>
>

-- 
Asif Iqbal
PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?