[XML-SIG] Normalized AttVals
Paul Prescod
paul@prescod.net
Mon, 14 Dec 1998 17:55:52 -0600
John Day wrote:
>
> Re: quoted attribute contents ("AttVal")
> When '>' is encountered e.g. <code op=">"> it is "normalized"
> to '&gt;', however, when '&' is encountered it is a fatal
> error e.g. <a href="www.zzz.com?a=1&b=3">
That's what the XML spec says.
AttValue ::= '"' ([^<&"] | Reference)* '"'
| "'" ([^<&'] | Reference)* "'"
That means that a literal "<" is never allowed in an attribute value, and
"&" is allowed only as the start of an entity or character reference.
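Here is a rough sketch of how you could check this against expat yourself,
using the xml.parsers.expat module that ships with current Python. The
helper names parses() and attributes() are made up for this example, and
the test documents are just the ones from this thread turned into empty
elements.

```python
import xml.parsers.expat

def parses(doc):
    """Return True if expat accepts doc as well-formed XML."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(doc, True)  # True marks this as the final chunk
        return True
    except xml.parsers.expat.ExpatError:
        return False

def attributes(doc):
    """Return the attributes of the elements in doc as one dict."""
    seen = {}
    parser = xml.parsers.expat.ParserCreate()
    # expat hands each start tag's attributes to this handler as a dict
    parser.StartElementHandler = lambda name, attrs: seen.update(attrs)
    parser.Parse(doc, True)
    return seen

bare = '<a href="www.zzz.com?a=1&b=3"/>'         # "&b" is not a Reference
escaped = '<a href="www.zzz.com?a=1&amp;b=3"/>'  # "&amp;" is a Reference
gt = '<code op=">"/>'                            # ">" matches [^<&"]
lt = '<code op="<"/>'                            # "<" is excluded

for doc in (bare, escaped, gt, lt):
    print(doc, parses(doc))
print(attributes(escaped))
```

The bare "&" and the literal "<" should be rejected, while the literal ">"
and the "&amp;" form should parse. The application sees the "&amp;" form as
a plain "&" in the reported attribute value.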
> Is this pyexpat behavior correct? Why can't the parser tell that
> '&b' above is _not_ a defined entity because it is not terminated
> by ';'?
That's what full SGML does, but that's not what XML does. XML is supposed
to be easier to implement.
> It seems to me that this usage could be normalized to
> '&amp;b', just like pyexpat did for '>'. Then it would be backward
> compatible with HTML (sort of).
There are several ways in which XML isn't backwards compatible with HTML.
> The impact of this seems to be enormous. All of the existing HTML
> parameter generators will have to change the way they post arguments,
> when HTML is replaced by XML, right?
This has been a known problem for a long time.
http://www.uni-ulm.de/uni/fak/natwis/strudo/ampersand.html
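For a generator that has to emit such links, one possible approach (a
sketch only, not the only way to do it) is to escape the URL while writing
the attribute. This uses quoteattr() from the standard xml.sax.saxutils
module. link_tag() and read_href() are hypothetical helper names, and the
URL is the one from the question.

```python
import xml.parsers.expat
from xml.sax.saxutils import quoteattr

def link_tag(url):
    """Build an empty <a> element; link_tag is a made-up helper name."""
    # quoteattr() escapes "&", "<" and ">" and adds the surrounding quotes.
    return "<a href=%s/>" % quoteattr(url)

def read_href(doc):
    """Parse doc with expat and return the href value the parser reports."""
    found = {}
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: found.update(attrs)
    parser.Parse(doc, True)
    return found.get("href")

url = "www.zzz.com?a=1&b=3"
tag = link_tag(url)
print(tag)                    # the "&" is written out as "&amp;"
print(read_href(tag) == url)  # the parser hands back the original URL
```

The query string itself does not change. Only its spelling inside the
markup does, so whatever receives the request still sees a=1&b=3.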
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco
"Sports utility vehicles are gated communities on wheels" - Anon