Re: [lxml-dev] Better error reporting

16 Mar 2006

Martijn Faassen wrote:
...
Another question: How does error logging work in combination with
threads? I noticed that the code in lxml that turned off the
talkativeness of libxml2 actually only worked for the main thread, and
that new threads that use lxml do become talkative again.
According to the libxml2 docs, that's intentional. Each thread has to
configure that for itself. Currently, there isn't that much in lxml anyway
that takes care of threads. Everything that's module level will interfere.

A way to get around this would be to set an error log in each sensible
function. Hmm, I actually think that would be the right way. I'll code this up
and see how it turns out.
...
...
libxml2 gives you this:

    int            domain   : What part of the library raised this error
    int            code     : The error code, e.g. an xmlParserError
    char *         message  : human-readable informative error message
    xmlErrorLevel  level    : how consequent is the error
    char *         file     : the filename
    int            line     : the line number if available
    char *         str1     : extra string information
    char *         str2     : extra string information
    char *         str3     : extra string information
    int            int1     : extra number information
    int            int2     : column number of the error or 0 if N/A
    void *         ctxt     : the parser context if available
    void *         node     : the node in the tree

The problem is: the more information you put into the log, the slower the
application becomes. Providing the element that triggered the error, for
example, would rather be out of scope. Note that you have to convert this
information to Python representations in order to store it in the log.
I'm not too concerned that slowing down exceptions somewhat is going to
impact things that badly - these exceptions are typically not occurring
very often. Since it's lxml's mission to make libxml2 usable by mortal
Python programmers with a nice API, I consider it part of our mission to
make the error API as nice as possible too, providing as much
information as we can, in an easy to understand way.
That's all future music though. I think this is already a great step
forward, I'm just pointing where I'd like to go.
I also thought a bit more about this. It would be better to store more
information and then allow filtering based on domain and error codes. RNG
classes should only return RNG errors, for example (although earlier failures
may have contributed to the current error...).

Maybe use a dedicated log entry class rather than plain strings?
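
A rough sketch of what a dedicated entry class with domain filtering might look like. All names here (`LogEntry`, `ErrorLog`, `filter_domains`, `filter_codes`) are illustrative assumptions, not an existing API, and the domain constants are placeholder values chosen for the example.

```python
from dataclasses import dataclass

# Placeholder domain identifiers, chosen for this example only.
DOMAIN_PARSER = 1
DOMAIN_RELAXNG = 2


@dataclass(frozen=True)
class LogEntry:
    """One error, already converted to Python values."""
    domain: int
    code: int
    level: int
    message: str
    filename: str
    line: int
    column: int


class ErrorLog:
    """Ordered collection of entries that can be narrowed by domain or code."""

    def __init__(self, entries=()):
        self._entries = list(entries)

    def append(self, entry):
        self._entries.append(entry)

    def filter_domains(self, *domains):
        # Return a new log holding only entries from the given domains.
        return ErrorLog(e for e in self._entries if e.domain in domains)

    def filter_codes(self, *codes):
        # Return a new log holding only entries with the given error codes.
        return ErrorLog(e for e in self._entries if e.code in codes)

    def __iter__(self):
        return iter(self._entries)

    def __len__(self):
        return len(self._entries)
```

An RNG class could then expose `log.filter_domains(DOMAIN_RELAXNG)` while the full log stays available for anyone who wants to see earlier failures as well.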
...
...
...
We also have the case for RelaxNG/Schema reporting where no exception is
raised if the XML is not valid according to the schema.
I added error_log properties to the RelaxNG and XMLSchema classes. That
should solve that problem.
Another way that might be more consistent is to add new methods that
either silently validate or, in case of validation errors, raise an
exception.
Hmmm, I don't know. If that's only for retrieving more precise error
information... Maybe a method like "assert" could be meaningful here.
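
For comparison, a toy sketch of the two styles side by side: a silent `validate()` that fills an `error_log`, and an assert-style method that raises. `ToyValidator`, `ValidationError` and `assert_valid` are hypothetical names, and the "schema" is just a list of required keys rather than anything RelaxNG does.

```python
class ValidationError(Exception):
    """Raised by the assert-style method; carries the collected log."""

    def __init__(self, error_log):
        super().__init__(error_log[0] if error_log else "document is invalid")
        self.error_log = list(error_log)


class ToyValidator:
    """Stand-in for a RelaxNG/XMLSchema class; checks required keys in a dict."""

    def __init__(self, required):
        self.required = tuple(required)
        self.error_log = []

    def validate(self, doc):
        # Silent variant: returns a bool, details go to error_log.
        self.error_log = [
            f"missing element: {name}"
            for name in self.required
            if name not in doc
        ]
        return not self.error_log

    def assert_valid(self, doc):
        # Assert-style variant: same check, but failure raises.
        if not self.validate(doc):
            raise ValidationError(self.error_log)
```

Both variants share the same log, so the exception is only a different way of delivering information that `error_log` already holds.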

Stefan

Stefan Behnel