xmllib has a fatal bug.

Mon Jan 10 18:06:55 EST 2000

Suppose I have the following Python code:

-------------------------------------------------------------------------
import xmllib

class SchemaParser( xmllib.XMLParser ):

   # xmllib calls start_<tag>/end_<tag> methods for matching tags.
   def start_schema(self, attributes):
      print "<schema> detected."
      # Dump every attribute the parser reports for <schema>.
      for a in attributes.keys():
         print "<schema>: %s='%s'" % (a, attributes[a])

   def end_schema(self):
      pass

   def start_element(self, attributes):
      print "<element> detected."

   def end_element(self):
      pass

xmldata = open("schema-file.xml","rb").read()
parser = SchemaParser()
parser.feed(xmldata)
-------------------------------------------------------------------------

Now, suppose I feed it the following XML data:

<schema targetNS='http://purl.org/metadata/dublin_core'
	version='1.0'
	xmlns='http://www.w3.org/1999/XMLSchema'>
	<element/>
</schema>

It _should_ print out the following:

<schema> detected.
<schema>: targetNS='http://purl.org/metadata/dublin_core'
<schema>: version='1.0'
<schema>: xmlns='http://www.w3.org/1999/XMLSchema'
<element> detected.

Instead I get:

<element> detected.

If I remove the xmlns attribute, or I munge the name (e.g.,
change it to _xmlns), everything works A.O.K. If I use something like
xmlns:armored='http://www.armored.net', it still works, although it strips
the xmlns: prefix from the armored attribute.

What is up with this?  Can xmllib not handle namespaces?  What does it do
with a namespace it does find, and why does it invalidate the mere existence
of the tag?  It doesn't even _call_ the start_schema method.  :( :( :(

I'm using Python 1.5.2 under Linux.  Any help would be greatly appreciated.
Thanks in advance.

-- 
KC5TJA/6, DM13, QRP-L #1447
Samuel A. Falvo II
Oceanside, CA