Help parsing XML

Dag Sunde dag at orion.no
Mon Jul 2 04:14:49 EDT 2001


Try changing the order of:
    <!DOCTYPE events SYSTEM "xml-concerts.dtd">
    <?xml version="1.0" encoding-"ISO-8859-1"?>

to:
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!DOCTYPE events SYSTEM "xml-concerts.dtd">

...and note the corrected "encoding=" in place of the
"encoding-" typo.

If the "<?xml ...?>" declaration is present, it must be the very
first thing in the file. Nothing may come before it, not even
whitespace or a DOCTYPE.
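
A quick way to check both points from Python is sketched below. This
is only an illustration, not Skip's parsexml.py: the helper name
is_well_formed and the tiny sample document are made up for the
example, and the DOCTYPE is written without its SYSTEM identifier so
the sketch does not need xml-concerts.dtd on disk.

```python
import xml.sax


def is_well_formed(document: bytes) -> bool:
    """Return True if xml.sax (expat) parses the document without error.

    Hypothetical helper for illustration only.
    """
    try:
        # A bare ContentHandler ignores every event; only errors matter here.
        xml.sax.parseString(document, xml.sax.ContentHandler())
    except xml.sax.SAXParseException:
        return False
    return True


XML_DECL = b'<?xml version="1.0" encoding="ISO-8859-1"?>\n'
BAD_DECL = b'<?xml version="1.0" encoding-"ISO-8859-1"?>\n'  # "-" typo
DOCTYPE = b'<!DOCTYPE events>\n'  # assumption: SYSTEM identifier left out
BODY = b'<events><event><city>Albany</city></event></events>\n'

print(is_well_formed(DOCTYPE + XML_DECL + BODY))  # DOCTYPE before declaration
print(is_well_formed(XML_DECL + DOCTYPE + BODY))  # declaration first
print(is_well_formed(BAD_DECL + DOCTYPE + BODY))  # "encoding-" typo
print(is_well_formed(b' ' + XML_DECL + BODY))     # leading space
```

Only the second document, with the declaration first and "encoding="
spelled correctly, should be accepted; the other three should be
rejected by the parser.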

Dag.


"Skip Montanaro" <skip at pobox.com> wrote in message
news:mailman.994060443.31499.python-list at python.org...
>
> Warning: I am a complete XML novice trying to parse some XML I dreamed up
> with xml.sax.parse.
>
> I wrote a simple subclass of xml.sax.ContentHandler to parse some XML.  It
> defines startElement, endElement and characters methods.  If I feed it
> something simple like so:
>
>     <events>
>       <event>
>         <performers>
>           <performer>James Taylor</performer>
>           <performer>Carly Simon</performer>
>         </performers>
>         <keywords>
>           <keyword>pop</keyword>
>           <keyword>rock</keyword>
>           <keyword>vocals</keyword>
>         </keywords>
>         <start-date>20010701T20:00</start-date>
>         <end-date>20010701T23:00</end-date>
>         <admission-price>$30</admission-price>
>         <venue-name>Pepsi Arena</venue-name>
>         <city>Albany</city>
>         <state>CA</state>
>         <country>US</country>
>         <submitter-name>Skip Montanaro</submitter-name>
>         <submitter-email>skip at mojam.com</submitter-email>
>       </event>
>     </events>
>
> it works fine.  However, note the absence of <!DOCTYPE ...> and <?xml ...>
> tags at the start.  If I add them at the start of the XML file like so:
>
>     <!DOCTYPE events SYSTEM "xml-concerts.dtd">
>     <?xml version="1.0" encoding-"ISO-8859-1"?>
>     <events>
>       <event>
>       ...
>
> Python squawks:
>
>     Traceback (most recent call last):
>       File "parsexml.py", line 72, in ?
>         xml.sax.parse("concert.xml", h)
>       File "/usr/local/lib/python2.1/xml/sax/__init__.py", line 33, in parse
>         parser.parse(source)
>       File "/usr/local/lib/python2.1/xml/sax/expatreader.py", line 43, in parse
>         xmlreader.IncrementalParser.parse(self, source)
>       File "/usr/local/lib/python2.1/xml/sax/xmlreader.py", line 123, in parse
>         self.feed(buffer)
>       File "/usr/local/lib/python2.1/xml/sax/expatreader.py", line 92, in feed
>         self._err_handler.fatalError(exc)
>       File "/usr/local/lib/python2.1/xml/sax/handler.py", line 38, in fatalError
>         raise exception
>     xml.sax._exceptions.SAXParseException: concert.xml:2:0: xml processing instruction not at start of external entity
>
> If I zap the <!DOCTYPE ...> tag I get this:
>
>     Traceback (most recent call last):
>       File "parsexml.py", line 72, in ?
>         xml.sax.parse("concert.xml", h)
>       File "/usr/local/lib/python2.1/xml/sax/__init__.py", line 33, in parse
>         parser.parse(source)
>       File "/usr/local/lib/python2.1/xml/sax/expatreader.py", line 43, in parse
>         xmlreader.IncrementalParser.parse(self, source)
>       File "/usr/local/lib/python2.1/xml/sax/xmlreader.py", line 123, in parse
>         self.feed(buffer)
>       File "/usr/local/lib/python2.1/xml/sax/expatreader.py", line 92, in feed
>         self._err_handler.fatalError(exc)
>       File "/usr/local/lib/python2.1/xml/sax/handler.py", line 38, in fatalError
>         raise exception
>     xml.sax._exceptions.SAXParseException: concert.xml:1:41: syntax error
>
> If you're wondering what sort of XML I'm trying to parse, my first cut
> at a DTD (I'm a complete novice at DTD writing as well) is at
>
>     http://musi-cal.mojam.com/~skip/concerts.dtd
>
> I suspect this is more a problem with my lack of XML expertise than with
> anything specific I'm doing wrong with my content handler.  Suggestions or
> pointers to appropriate XML tutorial material would be appreciated.
>
> Thanks,
>
> --
> Skip Montanaro (skip at pobox.com)
> (847)971-7098
>




