Baffled by documentation of namespace in XPath - lxml - The Python XML Toolkit

Nov. 29, 2012

I have always found it difficult to wrap my head around the details of
namespaces in XML processing, but I am completely baffled by the
discussion and examples of namespaces in the XPath section of the lxml
documentation.

Consider the following minimal TEI document:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_bare.rng" schematypens="http://relaxng.org/ns/structure/1.0"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>A minimal TEI document</title>
      </titleStmt>
      <publicationStmt>
        <p>unpublished</p>
      </publicationStmt>
      <sourceDesc>
        <p>born digital</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <front>
      <div><head>Preface</head>
      <p>This is the preface</p></div>

    </front>
    <body>
      <div><head>Chapter 1</head>
        <div><head>Subsection 1.1</head><p>The text of 1.1</p></div>
        <div><head>Subsection 1.2</head><p>The text of 1.2</p></div>
      </div>

    </body>
  </text>
</TEI>

If I process this with XQuery, I need a namespace declaration of
the type

	declare namespace tei = "http://www.tei-c.org/ns/1.0";

The mini-script

	xquery version "1.0";
	declare namespace tei = "http://www.tei-c.org/ns/1.0";

	let $text := doc('/users/martin/dropbox/learnpython.txt/bareguineapig.xml')
	return $text/tei:TEI//tei:front//tei:p

will return

	<?xml version="1.0" encoding="UTF-8"?>
	<p xmlns="http://www.tei-c.org/ns/1.0">This is the preface</p>

What is the lxml equivalent for this ordinary case of an XML document
with a default namespace?

The documentation gives the following XML fragment

<a:foo xmlns:a="http://codespeak.net/ns/test1"...
xmlns:b="http://codespeak.net/ns/test2">
<b:bar>Text</b:bar>
</a:foo>

I've never seen XML documents that use namespace prefixes in the closing
tags, and I can't figure out how to apply the information from the
documentation to my case.  I know how to process the document without a
namespace. For instance

	from lxml import etree

	f = '/users/martin/dropbox/learnpython.txt/bareguineapignonamespace.xml'
	tree = etree.parse(f)
	# Locate <front> in the namespace-free copy, then print each <p> below it.
	r = tree.xpath('/TEI/text/front')
	for element in r[0].iter('p'):
	    print(element.text)

will print "This is the preface". But what do I need to do to get the
same result when the TEI root element declares the TEI namespace?
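
For reference, here is a minimal sketch of what the namespace-aware version might look like. It assumes the `namespaces` keyword of lxml's `xpath()` method (the same one used in the documentation example quoted below) and Clark notation, `{uri}localname`, for `iter()`. The inline XML string is a trimmed stand-in for the TEI file above, and the prefix `tei` is an arbitrary choice, not something taken from the document.

```python
from lxml import etree

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Trimmed stand-in for the TEI document above (front matter only).
xml = b"""<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <front>
      <div><head>Preface</head>
      <p>This is the preface</p></div>
    </front>
  </text>
</TEI>"""

root = etree.fromstring(xml)

# XPath has no notion of a default namespace, so every step needs a prefix.
# The prefix is mapped to the namespace URI here, much like the
# "declare namespace" line in the XQuery version.
ns = {"tei": TEI_NS}
paragraphs = root.xpath("/tei:TEI/tei:text/tei:front//tei:p", namespaces=ns)
texts = [p.text for p in paragraphs]
print(texts)

# iter() does not take a prefix map; it expects the tag in Clark notation.
front = root.xpath("/tei:TEI/tei:text/tei:front", namespaces=ns)[0]
iter_texts = [p.text for p in front.iter("{%s}p" % TEI_NS)]
```

With a file on disk, `etree.parse(f)` followed by `tree.xpath(..., namespaces=ns)` should behave the same way.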

I also wonder whether there is an error in this code:

r = doc.xpath('/t:foo/b:bar',
              namespaces={'t': 'http://codespeak.net/ns/test1',
                          'b': 'http://codespeak.net/ns/test2'})

Shouldn't the 't' be 'a'? It doesn't seem to affect the way the code
works, though.
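
The observation about 't' and 'a' can be checked with a short sketch. Prefixes in an XPath expression are resolved through the `namespaces` dictionary passed to `xpath()`, not through the prefixes used in the document, so only the URIs have to agree. The variable names `NS1`, `NS2`, `with_t` and `with_a` are made up for this illustration, and the XML is the documentation fragment written on fewer lines.

```python
from lxml import etree

NS1 = "http://codespeak.net/ns/test1"
NS2 = "http://codespeak.net/ns/test2"

# The fragment from the documentation; the document itself uses 'a' and 'b'.
doc = etree.fromstring(
    b'<a:foo xmlns:a="http://codespeak.net/ns/test1" '
    b'xmlns:b="http://codespeak.net/ns/test2">'
    b'<b:bar>Text</b:bar>'
    b'</a:foo>'
)

# The same query twice, with two different prefixes bound to the same URI.
with_t = doc.xpath('/t:foo/b:bar', namespaces={'t': NS1, 'b': NS2})
with_a = doc.xpath('/a:foo/b:bar', namespaces={'a': NS1, 'b': NS2})

# Both queries select the single <b:bar> element.
print(with_t[0].tag, with_t[0].text)
print(with_a[0].tag, with_a[0].text)
```

Read this way, the 't' in the documentation example need not be a typo: it may simply show that the prefix in the query is independent of the prefix in the document.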

Martin Mueller

Professor of English and Classics
Northwestern University
