XML help needed

Martin von Loewis loewis at informatik.hu-berlin.de
Sun Nov 25 13:58:06 CET 2001


"Duncan Smith" <buzzard at urubu.freeserve.co.uk> writes:

> I have a lot of code written for a particular application I'm
> working on and I've got to the point where I need to access data
> from XML files.  I will also need to save to XML files.  There's
> almost too much information available and I'd appreciate any advice.
> SAX?  DOM?

You don't say what you mean by "accessing". Most likely, you will
need a tree representation of the document, i.e. the DOM. The only
reason *not* to use the DOM would be if the document is too large to
fit into memory.
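
As a rough illustration of the DOM approach, here is a minimal sketch
using xml.dom.minidom from the standard library. The element names
"table" and "row" and the sample content are made up for the example;
only the INDEXES attribute name comes from your message. It parses a
document into a tree, reads an attribute and some text, changes the
attribute, and serializes the tree again.

```python
from xml.dom import minidom

# Hypothetical sample document: "table" and "row" are invented names,
# only the INDEXES attribute appears in the original question.
SOURCE = '<table><row INDEXES=" 0 0 ">1.5</row></table>'

# Build the in-memory tree (the DOM) from the XML text.
doc = minidom.parseString(SOURCE)

# Navigate the tree and read data out of it.
row = doc.getElementsByTagName("row")[0]
indexes = row.getAttribute("INDEXES").split()  # split() drops extra spaces
value = float(row.firstChild.data)

# Change the tree and serialize it back to XML text for saving.
row.setAttribute("INDEXES", "0 1")
output = doc.toxml()
```

To save to a file, the string in "output" can be written out with an
ordinary file object. For documents too large to hold in memory, a SAX
handler would be the alternative.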

> I have the proposed DTD and examples of supposedly valid XML files.  The XML
> files are apparently well-formed, but (according to XML Spy) not valid.
> 'Invalid value for datatype NMTOKENS in attribute INDEXES'.  I think the
> following are the relevant lines from the DTD and XML respectively.  Can
> anyone tell me what's wrong?  Thanks in advance.

Looks like a bug in XML Spy. According to the XML recommendation, an
attribute of type NMTOKENS must follow the production

Nmtokens    ::=    Nmtoken (S Nmtoken)*
Nmtoken     ::=    (NameChar)+
NameChar    ::=    Letter | Digit | '.' | '-' | '_' | ':' 
                 | CombiningChar | Extender
S           ::=    (#x20 | #x9 | #xD | #xA)+

At first glance, the attribute value " 0 0 " does not match this
production, since it begins with a space, whereas Nmtokens must begin
with a NameChar.
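
The production can be approximated with a regular expression to see
why the raw value fails. This is only a sketch: NameChar is restricted
to ASCII letters and digits plus the four punctuation characters, so
the non-ASCII Letter, CombiningChar and Extender classes are left out,
and the name is_nmtokens is invented for the example.

```python
import re

# ASCII-only approximation of NameChar (an assumption of this sketch).
NAME_CHAR = r"[A-Za-z0-9._:\-]"
# Nmtoken ::= (NameChar)+
NMTOKEN = rf"{NAME_CHAR}+"
# S ::= (#x20 | #x9 | #xD | #xA)+
S = r"[\x20\x09\x0D\x0A]+"
# Nmtokens ::= Nmtoken (S Nmtoken)*
NMTOKENS = re.compile(rf"{NMTOKEN}(?:{S}{NMTOKEN})*")


def is_nmtokens(value: str) -> bool:
    """Return True if the whole string matches the Nmtokens production."""
    return NMTOKENS.fullmatch(value) is not None


raw_ok = is_nmtokens(" 0 0 ")   # leading space, so no match
trimmed_ok = is_nmtokens("0 0")  # matches
```

The raw value is rejected only because of the surrounding spaces; the
same tokens without them are accepted.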

However, this ignores attribute normalization, which turns the value
into "0 0" before validity is checked:

# Before the value of an attribute is passed to the application or
# checked for validity, the XML processor must normalize the attribute
# value by applying the algorithm below
...
# If the attribute type is not CDATA, then the XML processor must
# further process the normalized attribute value by discarding any
# leading and trailing space (#x20) characters, and by replacing
# sequences of space (#x20) characters by a single space (#x20)
# character.
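
The extra step for non-CDATA attributes can be sketched in a few lines.
The function name normalize_non_cdata is invented for the example, and
the sketch assumes the value contains no character or entity
references, which the full algorithm in the recommendation also
handles. Tabs and line breaks are first turned into spaces, as the
general part of the algorithm requires, and then the quoted rule is
applied.

```python
import re


def normalize_non_cdata(value: str) -> str:
    """Simplified attribute-value normalization for non-CDATA types.

    Assumes the value holds no character or entity references.
    """
    # General step: each literal tab, newline or carriage return
    # becomes a space (#x20).
    value = re.sub(r"[\t\n\r]", " ", value)
    # Non-CDATA step: discard leading and trailing spaces ...
    value = value.strip(" ")
    # ... and replace runs of spaces by a single space.
    return re.sub(r" {2,}", " ", value)


normalized = normalize_non_cdata(" 0 0 ")
```

For the value in question this yields "0 0", which consists of two
Nmtokens separated by one space, so a validating processor should
accept it.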

HTH,
Martin


