Telling Expat to ignore junk in XML feed

John Wilson tug at wilson.co.uk
Mon May 26 13:14:39 EDT 2003


----- Original Message ----- 
From: "Peter Clark" <pc451 at yahoo.com>
Newsgroups: comp.lang.python
To: <python-list at python.org>
Sent: Monday, May 26, 2003 3:11 PM
Subject: Telling Expat to ignore junk in XML feed


[snip]

> <?xml version="1.0"?>
> <br />
> <b>Warning</b>:  fopen(/home3/petersen/work/cache/weather/USMN0027)
> [<a href='http://www.php.net/function.fopen'>function.fopen</a>]: failed to create
> stream: Permission denied in <b>/home3/petersen/work/production/weather/weather.php</b>
> on line <b>114</b><br />
> <br />

Actually, it's not the PHP error "junk" as such that Expat is objecting to.
The first thing it sees after the XML declaration is <br />, which it takes
to be the root element. According to the XML spec a document can contain only
a single root element, so anything after that first <br /> element is junk,
no matter how nice it looks.
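
You can see this for yourself with a few lines of Python. This is only a
rough sketch: FEED is a shortened stand-in for the output quoted above (not
the real feed), and first_parse_error is a helper name I made up for the
illustration.

```python
import xml.parsers.expat as expat
from xml.parsers.expat import errors

# Shortened stand-in for the feed quoted above (an assumption, not the real data).
FEED = """<?xml version="1.0"?>
<br />
<b>Warning</b>: stand-in for the PHP error text
"""


def first_parse_error(data):
    """Return Expat's description of the first error in data, or None if it parses."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True: this is the final chunk of input
    except expat.ExpatError as err:
        return expat.ErrorString(err.code)
    return None


message = first_parse_error(FEED)
print(message)
```

The complaint is raised when Expat reaches <b>, i.e. the second top-level
element, not because of anything inside the PHP warning itself.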

If you can edit the stream on the fly you could insert a dummy start element
after the XML declaration and the matching dummy end element at the end of
the feed:

<?xml version="1.0"?>
<document>
<br />
<b>Warning</b>:  fopen(/home3/petersen/work/cache/weather/USMN0027)
[<a href='http://www.php.net/function.fopen'>function.fopen</a>]: failed to create
stream: Permission denied in <b>/home3/petersen/work/production/weather/weather.php</b>
......

</document>

That would probably keep Expat happy.
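
In Python the wrapping could be done along these lines. Treat it as a sketch
under assumptions rather than a tested solution for your feed:
wrap_in_dummy_root and element_names are names I invented, "document" is an
arbitrary tag, and SAMPLE is a shortened stand-in for the real data. It also
assumes the whole feed is available as one string, has no DOCTYPE, and that
the junk itself is well-formed markup.

```python
import xml.parsers.expat as expat


def wrap_in_dummy_root(feed, tag="document"):
    """Insert <tag> right after the XML declaration (if any) and append </tag>."""
    start = 0
    if feed.startswith("<?xml"):
        end = feed.find("?>")
        if end != -1:
            start = end + 2  # position just past the closing "?>"
    return f"{feed[:start]}<{tag}>{feed[start:]}</{tag}>"


def element_names(xml_text):
    """Parse xml_text with Expat and return the element names in document order."""
    names = []
    parser = expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_text, True)  # raises expat.ExpatError if not well-formed
    return names


# Shortened stand-in for the feed (an assumption, not the real data).
SAMPLE = '<?xml version="1.0"?>\n<br />\n<b>Warning</b>: stand-in text\n<br />\n'

wrapped = wrap_in_dummy_root(SAMPLE)
print(element_names(wrapped))
```

Your handlers will then see the dummy element first, with everything else in
the feed nested one level below it.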

John Wilson
The Wilson Partnership
http://www.wilson.co.uk





