[XML-SIG] Using PyExpat.py

Guido van Rossum guido@digicool.com
Sat, 10 Feb 2001 17:13:23 -0500


> xml_dom_object = reader.fromUri(filename)  #should work for either URL or file

Let's talk about this comment.  Is it really a good idea to build URL
access right into the API here?  For apps that need this, it's trivial
to write as long as the reader takes an open file object ("stream") as
an alternative to a filename: just call urllib.urlopen(uri) and pass
it as the argument.
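
A minimal sketch of that division of labor, assuming the reader accepts
any open file object. The standard library's xml.dom.minidom.parse
stands in for the reader here, and parse_stream is a hypothetical name
of my own, not part of any existing API. An in-memory stream replaces
the network so the example runs offline.

```python
import io
from xml.dom import minidom


def parse_stream(stream):
    """Parse XML from any open binary file-like object (hypothetical helper)."""
    return minidom.parse(stream)


# The application, not the library, decides where the bytes come from.
# For a URL it would open the stream itself, e.g. in current Python:
#     parse_stream(urllib.request.urlopen(uri))
# (urllib.urlopen in the Python of this message).
doc = parse_stream(io.BytesIO(b"<greeting>hello</greeting>"))
root_name = doc.documentElement.tagName
```

The reader never needs to know whether the stream came from a disk, a
socket, or a string.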

Case in point: I found this bit in saxutilx.py:

        if os.path.isfile(sysid):
            basehead = os.path.split(os.path.normpath(base))[0]
            source.setSystemId(os.path.join(basehead, sysid))
            f = open(sysid, "rb")
        else:
            source.setSystemId(urlparse.urljoin(base, sysid))
            f = urllib.urlopen(source.getSystemId())

Now I don't know under which circumstances this gets triggered (the
context is obscure), but I'd say it's a bad idea to just try to open a
URL when a string isn't a local file.  Maybe *you* live in a world
where the network is "always on" (and I do too!), but for plenty of
folks, it's rather annoying to find that their modem starts dialing
out each time they make a typo in a filename.
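
As a rough sketch of the alternative, assuming a local-only opener
(open_local is a hypothetical name): a mistyped filename produces an
immediate local error and nothing ever touches the network.

```python
import os
import tempfile


def open_local(path):
    """Open a local file only; a missing file is an error, never a cue to try a URL."""
    return open(path, "rb")


with tempfile.TemporaryDirectory() as tmp:
    # Simulate a typo: this file does not exist in the fresh directory.
    missing = os.path.join(tmp, "typo.xml")
    try:
        open_local(missing).close()
        outcome = "opened"
    except FileNotFoundError:
        outcome = "local error"
```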

Besides, the syntax for local filenames and URLs is not the same; the
quoting conventions are different and it's quite possible to find that
the same name could be either a URL or a filename, with vastly
different interpretations.  (See nturl2path.)  Without more context,
it's unclear which syntax should be tried first.  The application
knows this, but the library doesn't.  It's also fine to have an
alternative API that takes a URL instead of a local filename -- but
it's not okay to attempt to overlap the two namespaces.
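
One way to keep the two namespaces apart, sketched under assumptions:
parse_file and parse_url are hypothetical names, xml.dom.minidom stands
in for the reader, and a file: URL is used so the example runs without
a network. The first few lines show how one string reads differently
under the two syntaxes.

```python
import os
import tempfile
import urllib.parse
import urllib.request
from pathlib import Path
from xml.dom import minidom

# The same string means different things under the two syntaxes.
name = "a%20b.xml"
as_filename = name                        # literal characters, percent sign included
as_url_path = urllib.parse.unquote(name)  # URL quoting decodes %20 to a space


def parse_file(path):
    """Treat the argument strictly as a local filename."""
    with open(path, "rb") as f:
        return minidom.parse(f)


def parse_url(url):
    """Treat the argument strictly as a URL."""
    with urllib.request.urlopen(url) as f:
        return minidom.parse(f)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "doc.xml")
    with open(path, "wb") as f:
        f.write(b"<doc/>")
    # The caller states which namespace it means by choosing the function.
    from_file = parse_file(path).documentElement.tagName
    from_url = parse_url(Path(path).as_uri()).documentElement.tagName
```

Neither function guesses: the application picks the entry point that
matches what it knows about the string.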

--Guido van Rossum (home page: http://www.python.org/~guido/)