How to extract tags with ":" in it?
Hi,

I cannot find how to extract tags with ":" in them. Can anybody show me the correct code to extract "arxiv:primary_category"? Thanks.

$ ./main.py <<EOF
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <arxiv:primary_category xmlns:arxiv="http://arxiv.org/schemas/atom" term="cs.NE" scheme="http://arxiv.org/schemas/atom"/>
  </entry>
</feed>
EOF
$ cat main.py
#!/usr/bin/env python
# vim: set noexpandtab tabstop=2 shiftwidth=2 softtabstop=-1 fileencoding=utf-8:

import sys
from lxml import etree
tree = etree.parse(sys.stdin)
for e in tree.iterfind('{http://www.w3.org/2005/Atom}arxiv:primary_category'):
  print etree.tostring(e)

-- 
Regards,
Peng
Hi,

On Wed, 18 Apr 2018 00:58:22 -0500, Peng Yu <pengyu.ut@gmail.com> wrote:
Your XPath is wrong, it's not a valid XPath. :)

As shown in my mail with the subject 'What "xmlns" does?', you need to define a prefix:

  tree.iterfind(".//arxiv:primary_category",
                namespaces={'arxiv':"http://arxiv.org/schemas/atom"})

It doesn't matter what you use as a prefix. This works too:

  tree.iterfind(".//x:primary_category",
                namespaces={'x':"http://arxiv.org/schemas/atom"})

-- 
Gruß/Regards,
Thomas Schraitle
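For anyone who wants to try the advice above end to end, here is a minimal sketch built on the sample feed from the question. It is an illustration only, not code from either poster: the `FEED` and `ARXIV_NS` names are made up for the example, the feed is embedded as a string instead of being read from stdin, and the fallback to the standard library's `xml.etree.ElementTree` (when lxml is not installed) is an assumption that holds for this particular `iterfind` usage. The second lookup uses the `{namespace-uri}localname` form, which avoids a prefix mapping altogether.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib ElementTree API is compatible for this usage.
    import xml.etree.ElementTree as etree

# Sample document from the question, embedded here instead of read from stdin.
FEED = b"""<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <arxiv:primary_category xmlns:arxiv="http://arxiv.org/schemas/atom"
        term="cs.NE" scheme="http://arxiv.org/schemas/atom"/>
  </entry>
</feed>"""

# Hypothetical constant name; the URI is the one declared in the feed.
ARXIV_NS = "http://arxiv.org/schemas/atom"

root = etree.fromstring(FEED)

# Prefix form: the prefix is only a local alias for the namespace URI,
# so any name works as long as it is mapped in `namespaces`.
by_prefix = [
    e.get("term")
    for e in root.iterfind(".//arxiv:primary_category",
                           namespaces={"arxiv": ARXIV_NS})
]

# Clark notation: put the namespace URI in braces, no prefix mapping needed.
by_clark = [
    e.get("term")
    for e in root.iterfind(".//{%s}primary_category" % ARXIV_NS)
]

print(by_prefix, by_clark)
```

Both lookups should find the same element. The original attempt failed because it combined the Atom namespace URI with the literal text "arxiv:primary_category", whereas the element actually lives in the arXiv namespace with the local name "primary_category".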