[XML-SIG] SAX exceptions are odd

Lars Marius Garshol larsga@garshol.priv.no
05 Oct 2000 18:51:56 +0200


* Jeremy Hylton
| 
| If I call parse on an empty file, I get no exception.  Is this
| desirable?  I assume it means that "" is well-formed XML, but that
| doesn't seem like a very helpful definition.  Is this right?

No, it's not right.  You should get an error telling you that the
document element is required.
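
A quick way to check this is sketched below. It assumes the xml.sax
package from the Python standard library with its default (expat)
driver, which may not behave like the build under discussion;
parse_error is a hypothetical helper name, not part of any API.

```python
import xml.sax
from xml.sax.handler import ContentHandler


def parse_error(data):
    """Return the SAXParseException raised while parsing data, or None."""
    try:
        # parseString accepts bytes and feeds them to the default parser.
        xml.sax.parseString(data, ContentHandler())
    except xml.sax.SAXParseException as exc:
        return exc
    return None


# An empty document has no document element, so it is not well-formed.
empty = parse_error(b"")
# The smallest well-formed document is a single empty element.
minimal = parse_error(b"<doc/>")

print("empty input:", empty)
print("minimal document:", minimal)
```

If the first line printed shows no exception, the driver is swallowing
the error somewhere.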
 
| If I get almost any other exception I get an error message that says
| something like: "not well-formed at None:1:7"

Expat is not very good at providing informative error messages, so I
don't think you can expect much more.  If you want better error
messages, you should probably use xmlproc or xmllib.

As for the None, that should imply that you just gave the parser a
string to parse and didn't provide it with a system identifier (i.e.,
a URL or file name).
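
To illustrate the difference, here is a minimal sketch, again assuming
the standard-library xml.sax package. failing_parse and the BROKEN
sample are made up for the example; InputSource, setSystemId and
getSystemId are the SAX names.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


def failing_parse(data, system_id=None):
    """Parse data and return the SAXParseException it raises, or None."""
    source = InputSource()
    source.setByteStream(io.BytesIO(data))
    if system_id is not None:
        # Tell the parser where the document (nominally) came from.
        source.setSystemId(system_id)
    parser = xml.sax.make_parser()
    parser.setContentHandler(ContentHandler())
    try:
        parser.parse(source)
    except xml.sax.SAXParseException as exc:
        return exc
    return None


BROKEN = b"<a><b></a>"  # mismatched end tag, so not well-formed

anonymous = failing_parse(BROKEN)
named = failing_parse(BROKEN, "foo.xml")

print(anonymous.getSystemId())  # no system identifier was supplied
print(named.getSystemId())      # the identifier set on the InputSource
```

Because the byte stream is supplied directly, the parser never tries to
open foo.xml; the system identifier is used only for reporting.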
 
| Why is None being printed?  It gave me the initial impression that my
| error was not setting up the parse call correctly.  I assumed that the None
| was the cause of the exception and that under normal circumstances it
| would have said something like "not well-formed at foo.xml:1:7".

If you told it that you were parsing from foo.xml, it should definitely
report that information in the error message.  Can you show us the
exact call to parse?
 
| What is a system identifier and why should it be reported in an
| exception when it is None?

The system identifier is SGML-speak (and XML-speak) for the location
of the document being parsed. I guess we could leave it out in the
cases where it is None, if people prefer that. (I personally have no
opinion on that.)
 
| I also think the format is odd.  There are three different pieces of
| information separated by colons.  I am accustomed to the notation
| filename:line number, but not another colon for the cursor position.
| It would have been clearer, I think, if the message were more
| verbose and explained what each field was.

How about this:

  "Not well-formed in foo.xml at line %d, column %d."

If you prefer that, I'd be happy to change both the message format and
the lost system identifier (if that is indeed the problem).
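
As a rough sketch of what the more verbose format could look like,
including leaving out the system identifier when it is None: describe
and first_error are hypothetical helpers, not a proposed API, and the
code assumes the standard-library xml.sax package. The wording of the
message itself comes from the underlying parser.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


def describe(exc):
    """Render a SAXParseException with each field spelled out."""
    where = "line %d, column %d" % (exc.getLineNumber(),
                                    exc.getColumnNumber())
    system_id = exc.getSystemId()
    if system_id is None:
        # No location known: omit it rather than print "None".
        return "%s at %s." % (exc.getMessage(), where)
    return "%s in %s at %s." % (exc.getMessage(), system_id, where)


def first_error(data, system_id=None):
    """Parse data and return the SAXParseException it raises, or None."""
    source = InputSource(system_id)
    source.setByteStream(io.BytesIO(data))
    parser = xml.sax.make_parser()
    parser.setContentHandler(ContentHandler())
    try:
        parser.parse(source)
    except xml.sax.SAXParseException as exc:
        return exc
    return None


BROKEN = b"<a><b></a>"  # mismatched end tag

with_id = describe(first_error(BROKEN, "foo.xml"))
without_id = describe(first_error(BROKEN))

print(with_id)
print(without_id)
```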

--Lars M.