[XML-SIG] namespaces and sax questions
Roman Suzi
rnd@onego.ru
Fri, 7 Sep 2001 21:14:50 +0400 (MSD)
Hello!
I am trying to master XML, and I can't understand what a "qualified name" is
as understood by the sax.* modules of standard Python 2.1.1.
Here are my program, the XML example, and the result:
--- run.py ---
import xml.sax, xml.sax.handler
from xml.sax.xmlreader import InputSource
class ContentHandler(xml.sax.handler.ContentHandler):
    def startElementNS(self, name, qname, attrs):
        print "name=", name, "qname=", qname
        print "names:", attrs.getNames(),
        print "qnames:", attrs.getQNames()
    # def endElementNS(self, name, qname):
    #     print name, qname
    def startPrefixMapping(self, prefix, URI):
        print "START", prefix, URI
    def endPrefixMapping(self, prefix):
        print "END", prefix
input_source = InputSource()
input_source.setByteStream(open("W3CExample.xml", "r"))
xml_reader = xml.sax.make_parser()
xml_reader.setContentHandler(ContentHandler())
# the docs say this feature is ON by default, but it is not:
xml_reader.setFeature(xml.sax.handler.feature_namespaces, 1)
xml_reader.parse(input_source)
---
--- W3CExample.xml ---
<?xml version="1.0"?>
<!-- elements are in the HTML namespace, in this case by default -->
<html xmlns='http://www.w3.org/TR/REC-html40'>
<head><title>Frobnostication</title></head>
<body><p>Moved to
<a href='http://frob.com'>here</a>.</p></body>
</html>
---
And the result:
---
START None http://www.w3.org/TR/REC-html40
name= (u'http://www.w3.org/TR/REC-html40', u'html') qname= None
names: [] qnames: []
name= (u'http://www.w3.org/TR/REC-html40', u'head') qname= None
names: [] qnames: []
name= (u'http://www.w3.org/TR/REC-html40', u'title') qname= None
names: [] qnames: []
name= (u'http://www.w3.org/TR/REC-html40', u'body') qname= None
names: [] qnames: []
name= (u'http://www.w3.org/TR/REC-html40', u'p') qname= None
names: [] qnames: []
name= (u'http://www.w3.org/TR/REC-html40', u'a') qname= None
names: [(None, u'href')] qnames: []
END None
---
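The run above was made with Python 2.1.1. For anyone who wants to repeat the experiment on a current interpreter, here is a minimal Python 3 sketch of the same program. It is an adaptation, not the original script: the document is inlined instead of being read from W3CExample.xml, and `RecordingHandler` and `run` are hypothetical names introduced here so that the events can be inspected instead of printed.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

# Same content as W3CExample.xml, inlined so the sketch needs no file.
DOC = b"""<?xml version="1.0"?>
<html xmlns='http://www.w3.org/TR/REC-html40'>
<head><title>Frobnostication</title></head>
<body><p>Moved to
<a href='http://frob.com'>here</a>.</p></body>
</html>
"""


class RecordingHandler(ContentHandler):
    """Hypothetical helper: collects SAX events in a list."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startPrefixMapping(self, prefix, uri):
        self.events.append(("START", prefix, uri))

    def endPrefixMapping(self, prefix):
        self.events.append(("END", prefix))

    def startElementNS(self, name, qname, attrs):
        # name is a (namespace URI, local name) tuple; qname is whatever
        # the parser chooses to report as the raw name.
        self.events.append(
            ("element", name, qname, attrs.getNames(), attrs.getQNames())
        )


def run(doc=DOC):
    handler = RecordingHandler()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    # Namespace processing is requested explicitly, as in the original.
    parser.setFeature(feature_namespaces, True)
    parser.parse(io.BytesIO(doc))
    return handler.events


if __name__ == "__main__":
    for event in run():
        print(event)
```

Printing the fourth and fifth fields of each "element" event gives the same names/qnames comparison as in the listing above.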
I do not see any "html:title", "html:head", ... among the qnames, although
http://www.w3.org/TR/REC-xml-names defines a qualified name as:
Qualified Name
QName ::= (Prefix ':')? LocalPart
Prefix ::= NCName
LocalPart ::= NCName
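To make the productions above concrete, here is a small Python sketch that splits a string into (prefix, local part) according to that grammar. `split_qname` is a hypothetical helper written for this illustration, not part of xml.sax, and it only checks the colon structure; it does not validate the full NCName character rules.

```python
def split_qname(qname):
    """Split a qualified name into (prefix, local_part).

    The prefix is None when the name has no 'Prefix:' part.
    Only the colon structure is checked, not NCName characters.
    """
    if not qname:
        raise ValueError("empty name is not a QName")
    head, colon, tail = qname.partition(":")
    if not colon:
        # No colon at all: the whole string is the LocalPart.
        return None, head
    if not head or not tail or ":" in tail:
        # An NCName cannot be empty and cannot contain a colon.
        raise ValueError("not a QName: %r" % (qname,))
    return head, tail
```

With this reading, "html:title" has the prefix "html" and the local part "title", while a bare "title" has no prefix at all.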
Also, most of the features aren't supported by the default XML parser
(pyexpat), although the Python docs do not say so.
The same thing happens if I add "html:" to the tags explicitly.
What is the problem? How can these observations be explained?
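The remark about unsupported features can be checked mechanically. The sketch below, written for Python 3, tries to switch on each feature listed in xml.sax.handler.all_features on a fresh default parser and records whether the parser accepts it. `probe_features` is a hypothetical name used only for this illustration, and the result depends on which parser make_parser() returns on a given installation.

```python
import xml.sax
from xml.sax import SAXNotRecognizedException, SAXNotSupportedException
from xml.sax.handler import all_features


def probe_features():
    """Return {feature name: True if it could be enabled, else False}."""
    report = {}
    for feature in all_features:
        # A fresh parser per feature, so one setting cannot affect another.
        parser = xml.sax.make_parser()
        try:
            parser.setFeature(feature, True)
        except (SAXNotRecognizedException, SAXNotSupportedException):
            report[feature] = False
        else:
            report[feature] = True
    return report


if __name__ == "__main__":
    for feature, accepted in probe_features().items():
        print(feature, "accepted" if accepted else "rejected")
```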
Thanks!
Sincerely yours, Roman Suzi
--
_/ Russia _/ Karelia _/ Petrozavodsk _/ rnd@onego.ru _/
_/ Friday, September 07, 2001 _/ Powered by Linux RedHat 6.2 _/
_/ "Dreams are free, but you get soaked on the connect time." _/