[Expat-bugs] Problems with parsing attributes with XML:Parser

Brian Cameron Brian.Cameron at Sun.COM
Tue Jan 20 17:43:30 EST 2004


I have discovered the following problems while trying to parse XML with
XML::Parser.  I believe these problems are bugs:

It seems that XML::Parser is not really using an XML parser to get the
attribute values, but is instead doing things like:
 >
 >     # Grab attribute key/value pairs and push onto @origlist array.
 >     #
 >     while ($source =~ s|^[ \t]*([\w:]+)[ \t]*[=][ \t]*["]([^"]*)["]||s)
 >     {
 >        push @origlist, $1;
 >        push @origlist, $2;

This has some issues for me:

a) It drops any attribute that uses ' as the quote character.
b) It collapses runs of whitespace inside an attribute value.
c) (Really part of a.)  It doesn't handle " characters inside
    '-quoted attributes.
d) Attribute names that contain the - character, such as
    test-att3, are not recognized as attributes.
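
To make these failure modes concrete, here is a minimal sketch that
transliterates the quoted pattern into Python's re module.  The switch to
Python is an assumption made only so the example is easy to run; this
pattern contains nothing that differs between the two regex dialects.
The helper name grab_attributes is hypothetical.  Note that the pattern
by itself keeps whitespace inside a value intact, so item b) presumably
comes from some other step that is not shown in the quoted code.

```python
import re

# Python transliteration of the quoted Perl pattern:
#   optional blanks, a name made of word characters or ':',
#   '=', then a value delimited by double quotes only.
ATTR = re.compile(r'^[ \t]*([\w:]+)[ \t]*[=][ \t]*["]([^"]*)["]', re.S)


def grab_attributes(source):
    """Mimic the quoted loop: peel name="value" pairs off the front.

    Returns the list of (name, value) pairs that matched and whatever
    text was left over when the pattern stopped matching.
    """
    pairs = []
    while True:
        m = ATTR.match(source)
        if m is None:
            break
        pairs.append((m.group(1), m.group(2)))
        source = source[m.end():]
    return pairs, source


if __name__ == "__main__":
    samples = [
        'a="1" b="2"',          # plain double-quoted attributes
        "a='1'",                # a) single-quoted value
        "a='say \"hi\"'",       # c) double quotes inside single quotes
        'test-att3="v"',        # d) hyphen in the attribute name
        'a="1" b=\'2\' c="3"',  # one bad attribute hides the ones after it
    ]
    for s in samples:
        print(repr(s), "->", grab_attributes(s))
```

In this sketch the loop stops at the first attribute it cannot match, so
any well-formed attributes that follow are lost as well.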

It is my understanding that the XML spec allows all of these
features.
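
For comparison, a conforming parser accepts all four cases.  The sketch
below uses Python's standard xml.parsers.expat binding as a stand-in,
since it wraps the same Expat library; it is an illustration, not the
XML::Parser code path.  The names parse_attributes and DOC are
hypothetical.

```python
import xml.parsers.expat

# One element exercising each case from the list above.
DOC = """<e a='single' test-att3="hyphen" b='say "hi"' c="two  spaces"/>"""


def parse_attributes(xml_text):
    """Return the attributes of every start tag in xml_text as one dict."""
    found = {}

    def on_start(name, attrs):
        # By default Expat hands the attributes over as a dict.
        found.update(attrs)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)  # True: this is the final chunk of input
    return found


if __name__ == "__main__":
    for key, value in parse_attributes(DOC).items():
        print(key, "=", repr(value))
```

Here the single-quoted values, the hyphenated name, the embedded double
quotes, and the two consecutive spaces all reach the handler unchanged.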

We are currently using XML::Parser in the GNOME intltool project,
but unless the above issues are resolved, we may have to switch to
a different XML parser.  It would be ideal if this could be corrected
in the XML::Parser logic, so that we can continue using it.

Thanks!

-- 

Brian



