[XML-SIG] Using PyExpat.py

Lars Marius Garshol larsga@garshol.priv.no
19 Feb 2001 23:49:03 +0100


* Guido van Rossum
| 
| I'd like to drop support for URLs; I don't think the typical
| computer is sufficiently networked to make this work well.

Dropping support for URLs not really an option when dealing with XML.
The XML recommendation states clearly that all system identifiers[1]
are URIs in XML.

What this really means is that we have two cases to deal with:

  - XML software is provided a reference to an XML document
  - XML document references as used internally by XML software and
    also as passed back out to client software

In the second case the references must be URIs, since it is a
deep-seated assumption in the entire XML family of specifications that
all such references will be URIs. This is especially clear in the case
of entity references (as Uche illustrated), but most other XML
specifications are equally clear on this point, such as the infoset,
XSLT, XBase and so on.

Of course, in the first case there is no reason why it shouldn't be
allowed to pass file names into the APIs to have them converted into
URIs there. In fact, I think there is very good reason to do so, since
my experience with the Java tools that require URIs have been fairly
painful. (Who remembers the precise syntax for file URIs on all kinds
of platforms anyway?)

Outlawing URIs, however is not really an option.

[1] What most people would call 'references to external resources',
    usually files.

| I would suggest to have separate APIs depending on the argument
| type, e.g. p.parseFile(filename), p.parseURL(url),
| p.parseStream(InputSource), p.parseString(text).

That may be a better option than to have a single function/method, but
that is really separate from the issue of whether to allow URIs or
not.

--Lars M.