[Expat-discuss] (no subject)

rolf@pointsman.de rolf@pointsman.de
Tue, 15 May 2001 01:35:03 +0200 (MEST)


On 14 May, Paul Prescod wrote:
> rolf@pointsman.de wrote:
>> o Declarations of attributes with type ENUMERATION and NOTATION are
>>   "normalized" in some way - in a very straightforward way, I have
>>   to confess - before they reach handler level. As far as I know,
>>   there is no reason within the XML recommendation for doing
>>   this. Don't get me wrong, I'm far from criticizing this - it makes
>>   life really a little bit easier if you try to write a validator on
>>   top of expat - but this isn't documented and my question is: can I
>>   rely on this?
> 
> If the attribute type is not CDATA, then the XML processor must further
> process the normalized attribute value by discarding any leading and
> trailing space (#x20) characters, and by replacing sequences of space
> (#x20) characters by a single space (#x20) character.

What you are talking about is attribute _value_ normalization (XML
recommendation, section 3.3.3), and what you say is true.

But I was talking about attribute _declarations_. Something like

  <!ATTLIST someElement type (  numbered
                              | bullets
                              | somethingelse  ) #IMPLIED>

is reported by expat as att_type "(numbered|bullets|somethingelse)"
and something like

   <!ATTLIST someElement type NOTATION (  gif
                                        | jpg
                                        | png  ) #IMPLIED>

is reported by expat as att_type "NOTATION(gif|jpg|png)" 
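
A minimal sketch for reproducing this, assuming Python's standard
xml.parsers.expat binding (which wraps expat) instead of the C API.
The names DOC and collect_attlist_types, and the second attribute name
"img", are made up for the illustration:

```python
import xml.parsers.expat

# Internal DTD subset with the two declarations discussed above.
# The attribute name "img" is hypothetical; it only keeps the two
# declarations from colliding on the same attribute.
DOC = """<!DOCTYPE someElement [
  <!ATTLIST someElement type (  numbered
                              | bullets
                              | somethingelse  ) #IMPLIED>
  <!ATTLIST someElement img NOTATION (  gif
                                      | jpg
                                      | png  ) #IMPLIED>
]>
<someElement/>"""


def collect_attlist_types(document):
    """Return {attribute name: att_type string} as handed to the handler."""
    reported = {}

    def on_attlist(elname, attname, att_type, default, required):
        # att_type arrives already stripped of the whitespace in the DTD.
        reported[attname] = att_type

    parser = xml.parsers.expat.ParserCreate()
    parser.AttlistDeclHandler = on_attlist
    parser.Parse(document, True)
    return reported


if __name__ == "__main__":
    for name, att_type in collect_attlist_types(DOC).items():
        print(name, att_type)
```

The handler never sees the whitespace and line breaks from the
declaration, only the compact string.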

As far as I can see, the XML recommendation doesn't determine in any
way how a parser has to report the allowed values of an enumeration or
notation attribute to an application. It's possible to do it in one
'pre-cleaned' string, as expat does, or as an array of strings, or
in one 'raw' string, as the allowed values were found within the
document, etc. Therefore, different parser implementations in fact do
it in different ways. For example, Xerces C++ returns the allowed
values as _space_-separated values. If I'm wrong, please point me to
the relevant place within the XML recommendation.
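
Since the reporting format is parser-specific, an application that
wants to stay independent of it could turn whatever string it gets
into a plain list of names first. A rough sketch, assuming only the
two string forms mentioned above ("|"-separated in parentheses, or
space-separated); the function name allowed_values is hypothetical and
it is only meant for enumerated types, not for CDATA, ID and so on:

```python
def allowed_values(att_type):
    """Split an enumerated att_type string into (is_notation, [names])."""
    text = att_type.strip()
    # In the compact form a plain enumeration starts with "(", so a
    # leading "NOTATION" keyword marks a notation type.
    is_notation = text.startswith("NOTATION")
    if is_notation:
        text = text[len("NOTATION"):].strip()
    if text.startswith("(") and text.endswith(")"):
        text = text[1:-1]
    # Treat "|" like whitespace so both separator styles split the same way.
    names = text.replace("|", " ").split()
    return is_notation, names
```

With this, "(numbered|bullets|somethingelse)" and
"numbered bullets somethingelse" give the same list of three names, so
the validator code on top doesn't have to care which style the parser
uses.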

rolf