How to extract tags with ":" in them?
Hi,
I cannot find out how to extract tags with ":" in them. Can anybody show me
the correct code to extract "arxiv:primary_category"? Thanks.
$ ./main.py <
Hi,
On Wed, 18 Apr 2018 00:58:22 -0500,
Peng Yu wrote:
[...]
$ ./main.py <
http://arxiv.org/schemas/atom" term="cs.NE" scheme="http://arxiv.org/schemas/atom"/>
</entry>
</feed>
EOF
$ cat main.py
#!/usr/bin/env python
# vim: set noexpandtab tabstop=2 shiftwidth=2 softtabstop=-1 fileencoding=utf-8:

import sys
from lxml import etree
tree = etree.parse(sys.stdin)
for e in tree.iterfind('{http://www.w3.org/2005/Atom}arxiv:primary_category'):
  print etree.tostring(e)
Your XPath is wrong, it's not a valid XPath. :)

As shown in my mail with the subject 'What "xmlns" does?', you need to
define a prefix:

  tree.iterfind(".//arxiv:primary_category",
                namespaces={'arxiv':"http://arxiv.org/schemas/atom"})

It doesn't matter what you use as a prefix. This works too:

  tree.iterfind(".//x:primary_category",
                namespaces={'x':"http://arxiv.org/schemas/atom"})

-- 
Gruß/Regards,
  Thomas Schraitle
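
For anyone who wants to try the prefix mapping described above, here is a minimal, self-contained sketch. The feed is a hypothetical, cut-down stand-in for the arXiv response (the original input is truncated in this thread), and the names FEED, ARXIV_NS, terms_by_prefix and terms_by_clark are made up for illustration. The sketch falls back to the standard library's xml.etree.ElementTree when lxml is not installed, on the assumption that both accept the iterfind() arguments used here. The second lookup writes the namespace URI directly into the path as {uri}localname, which should give the same result without any prefix mapping.

```python
try:
    from lxml import etree  # the library used in this thread
except ImportError:
    import xml.etree.ElementTree as etree  # stdlib fallback, same calls below

# Hypothetical, cut-down feed; the real arXiv response has many more elements.
FEED = b"""<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <arxiv:primary_category xmlns:arxiv="http://arxiv.org/schemas/atom"
        term="cs.NE" scheme="http://arxiv.org/schemas/atom"/>
  </entry>
</feed>"""

ARXIV_NS = "http://arxiv.org/schemas/atom"

root = etree.fromstring(FEED)

# Variant 1: map a prefix to the namespace URI. The prefix is only a local
# label for the lookup; it does not have to match the one in the document.
terms_by_prefix = [
    e.get("term")
    for e in root.iterfind(".//arxiv:primary_category",
                           namespaces={"arxiv": ARXIV_NS})
]

# Variant 2: put the namespace URI itself into the path as {uri}localname.
terms_by_clark = [
    e.get("term")
    for e in root.iterfind(".//{%s}primary_category" % ARXIV_NS)
]

print(terms_by_prefix)
print(terms_by_clark)
```

Note that the failing attempt in the question mixed the two forms: it combined a {uri} part (with the Atom namespace rather than the arXiv one) with an "arxiv:" prefix that had no mapping.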