Fastest way to extract used namespaces from a piece of XML?

Thomas Weholt thomas at cintra.no
Sat Jun 16 13:54:25 EDT 2001


I need the fastest code available for extracting all namespaces used in a
piece of XML. I have the code below for doing this, but it's slow as hell.
Any clues?

from xml.sax import saxlib, saxexts
from cStringIO import StringIO
import string

class NamespaceParser(saxlib.HandlerBase):
    """NamespaceParser - extract namespaces used in a piece of xml"""
    def __init__(self):
        self.namespaces = {}
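    def startElement(self, name, atts):
        # Only qualified names (prefix:tagname) carry a namespace prefix;
        # test for the colon instead of hiding every error in a bare except.
        if ':' in name:
            namespace = string.split(name, ':', 1)[0]
            self.namespaces[string.upper(str(namespace))] = None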

def getNamespaces(xml_text):
    internal_parser = NamespaceParser()
    parser = saxexts.make_parser()
    parser.setDocumentHandler(internal_parser)
    xml = StringIO(xml_text)
    parser.parseFile(xml)
    return internal_parser.namespaces.keys()
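
For comparison, here is a minimal sketch of the same prefix collection driving
expat directly instead of going through the SAX driver layer. It assumes a
current Python 3 with the standard-library xml.parsers.expat module, and the
name get_namespace_prefixes is made up for illustration. Skipping the extra
layer may help, but no timings were taken, so treat any speed-up as unverified.

```python
from xml.parsers import expat


def get_namespace_prefixes(xml_text):
    """Return the sorted, upper-cased prefixes used on element names."""
    prefixes = set()

    def start_element(name, attrs):
        # Without namespace processing expat reports the raw qualified
        # name, so the prefix is whatever precedes the first colon.
        prefix, sep, _ = name.partition(':')
        if sep:
            prefixes.add(prefix.upper())

    # No namespace_separator argument: names are passed through unchanged.
    parser = expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.Parse(xml_text, True)
    return sorted(prefixes)
```

Like the handler above, this only looks at element names, so prefixes that
appear solely on attributes or in xmlns declarations are not reported.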

Thomas





More information about the Python-list mailing list