XML help needed
Martin von Loewis
loewis at informatik.hu-berlin.de
Sun Nov 25 07:58:06 EST 2001
"Duncan Smith" <buzzard at urubu.freeserve.co.uk> writes:
> I have a lot of code written for a particular application I'm
> working on and I've got to the point where I need to access data
> from XML files. I will also need to save to XML files. There's
> almost too much information available and I'd appreciate any advice.
> SAX? DOM?
You don't say what you mean by "accessing". Most likely you will
need a tree representation of the document, i.e. the DOM. The only
reason *not* to use the DOM would be if the document is too large to
fit into memory.
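As a rough illustration of the DOM approach, here is a minimal sketch using xml.dom.minidom from the Python standard library. The document, the element names and the attribute values are made up for the example, since I haven't seen your real files; only the INDEXES attribute name is taken from your error message.

```python
from xml.dom import minidom

# Hypothetical document: the element names and values are invented
# for illustration and are not taken from the poster's DTD.
SOURCE = '<table><row INDEXES="0 0">1.5</row></table>'

# Reading: parse the whole document into an in-memory tree.
doc = minidom.parseString(SOURCE)
rows = doc.getElementsByTagName("row")
values = [float(row.firstChild.data) for row in rows]
indexes = [row.getAttribute("INDEXES") for row in rows]

# Writing: change the tree in memory, then serialize it again.
new_row = doc.createElement("row")
new_row.setAttribute("INDEXES", "0 1")
new_row.appendChild(doc.createTextNode("2.5"))
doc.documentElement.appendChild(new_row)

output = doc.toxml()
```

To read from and save to actual files, minidom.parse(filename) and doc.writexml(file_object) play the same roles as parseString and toxml here.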
> I have the proposed DTD and examples of supposedly valid XML files. The XML
> files are apparently well-formed, but (according to XML Spy) not valid.
> 'Invalid value for datatype NMTOKENS in attribute INDEXES'. I think the
> following are the relevant lines from the DTD and XML respectively. Can
> anyone tell me what's wrong? Thanks in advance.
Looks like a bug in XML Spy. According to the XML recommendation, an
attribute of type NMTOKENS must match the productions
    Nmtokens ::= Nmtoken (S Nmtoken)*
    Nmtoken  ::= (NameChar)+
    NameChar ::= Letter | Digit | '.' | '-' | '_' | ':'
               | CombiningChar | Extender
    S        ::= (#x20 | #x9 | #xD | #xA)+
At first glance, the attribute value " 0 0 " does not match this
production, since it begins with a space, whereas an Nmtokens value
must begin with a NameChar.
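The mismatch on the raw value can be checked with a small sketch like the one below. It is only an approximation: the character class covers the ASCII subset of NameChar and leaves out the non-ASCII Letter, Digit, CombiningChar and Extender ranges, and the helper name is_nmtokens is made up for this example.

```python
import re

# ASCII-only approximation of NameChar. The real production also admits
# non-ASCII letters and digits, CombiningChar and Extender.
NAME_CHAR = r"[A-Za-z0-9._:-]"
S = r"[\x20\x09\x0D\x0A]+"
NMTOKEN = NAME_CHAR + "+"
NMTOKENS = re.compile(NMTOKEN + "(?:" + S + NMTOKEN + ")*")


def is_nmtokens(value):
    """Return True if the whole string matches the Nmtokens production."""
    return NMTOKENS.fullmatch(value) is not None


print(is_nmtokens(" 0 0 "))  # False: the raw value starts with a space
print(is_nmtokens("0 0"))    # True
```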
However, this ignores attribute normalization, which happens before
validity is checked:
# Before the value of an attribute is passed to the application or
# checked for validity, the XML processor must normalize the attribute
# value by applying the algorithm below
...
# If the attribute type is not CDATA, then the XML processor must
# further process the normalized attribute value by discarding any
# leading and trailing space (#x20) characters, and by replacing
# sequences of space (#x20) characters by a single space (#x20)
# character.
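Applied to the value in question, that last step can be sketched as follows. The function name normalize_tokenized is made up, and the sketch assumes the general normalization algorithm has already run, so any literal tab or line break in the value has already been turned into a space.

```python
import re


def normalize_tokenized(value):
    """Apply the extra normalization step for non-CDATA attribute types.

    Assumes the general algorithm has already run, so whitespace in
    the value consists of space (#x20) characters only.
    """
    # Replace each sequence of spaces by a single space ...
    collapsed = re.sub(" +", " ", value)
    # ... and discard leading and trailing spaces.
    return collapsed.strip(" ")


print(repr(normalize_tokenized(" 0 0 ")))  # '0 0'
```

The normalized value "0 0" starts with a NameChar and does match the Nmtokens production, so as far as I can tell a validating processor should accept it.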
HTH,
Martin