[XML-SIG] file urls in urllib

Thomas B. Passin tpassin@home.com
Wed, 7 Mar 2001 19:48:53 -0500


Mark D. Anderson writes about file: urls.

Mark, here is a copy of a message I posted last month on this tricky subject.
I've been hoping to get some agreement on the usage so we can start building
it in.  I'm glad you brought it up.

Cheers,

Tom P


==================================================
This file: business is trickier than it seems, because the RFC is ambiguous
about file: URLs.  The pipe character isn't in the RFC at all, even though
some browsers use it in place of the colon after the drive letter.

I strongly suggest that when a local file is intended, one should use the
file: scheme explicitly.  That way the application doesn't have to guess, and
it won't try a spurious URL if the file isn't found.  The way it's done in this
example is just asking for continual trouble, as I guess we're seeing now.

I think we should come to an agreement with the maintainer of urllib about
the allowed forms for file: URLs.  It's mainly on Windows (and, perhaps,
Macs) that there would be a problem.  My preferred forms are these, for a file
at d:\temp\python\thefile.xml:

1) file:///d:/temp/python/thefile.xml

2) file:///d:\temp\python\thefile.xml

Both of these comply fully with the RFC.  2) is an "opaque" form: the URL
processor would do no further parsing and would just pass it to the OS.
1) is what you get according to the RFC when you want the URL processor to be
able to parse out the path parts.  The processor is supposed to know to
replace slashes by backslashes if appropriate for the OS.
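
To make the slash handling for form 1) concrete, here is a minimal Python
sketch.  The function names windows_path_to_form1 and form1_to_os_path are
made up for illustration and are not part of urllib, and the sketch assumes
the path contains no characters that need percent-escaping.

```python
def windows_path_to_form1(path):
    """Build form 1) from a drive-letter path like d:\\temp\\thefile.xml."""
    # Empty host between "//" and "/", then the path with forward slashes.
    return "file:///" + path.replace("\\", "/")


def form1_to_os_path(url, sep="\\"):
    """Recover an OS path from form 1), using the given path separator."""
    prefix = "file:///"
    if not url.startswith(prefix):
        raise ValueError("expected a file: URL with an empty host: %r" % (url,))
    # The URL processor, not the caller, swaps slashes for the OS separator.
    return url[len(prefix):].replace("/", sep)


print(windows_path_to_form1(r"d:\temp\python\thefile.xml"))
print(form1_to_os_path("file:///d:/temp/python/thefile.xml"))
```

Passing the separator as a parameter is only a convenience for trying the
sketch on a non-Windows machine; a real processor would presumably take it
from the platform.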

Either 1) or 2) would also work for files on a network file system, if you put
the host name in there:

file://host/temp/python/thefile.xml
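
As a rough illustration of how a processor could tell the host form from the
local form, the standard urllib.parse.urlsplit function already separates the
host from the path.  The wrapper name split_file_url is hypothetical, and what
to do with a non-empty host (for example, mapping it to a network share) is
left open here, since that is exactly what would need agreeing on.

```python
from urllib.parse import urlsplit


def split_file_url(url):
    """Return (host, path) for a file: URL; host is '' for a local file."""
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError("not a file: URL: %r" % (url,))
    # netloc is the text between "//" and the next "/", i.e. the host name.
    return parts.netloc, parts.path


print(split_file_url("file://host/temp/python/thefile.xml"))
print(split_file_url("file:///d:/temp/python/thefile.xml"))
```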

1) would be more portable, and is my preference.  The processor should be able
to handle both, however.  For backwards compatibility, form 3) should also be
accepted, I suppose:

3) file:d:\temp\python\thefile.xml

This could be negotiated, though.
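
For discussion, here is one way a processor might accept all three forms for
a local Windows file.  This is only a sketch under the assumptions in this
message: local_path_from_file_url is a made-up name, not an existing urllib
function, percent-escapes are not decoded, and URLs that name a host are
rejected rather than mapped to anything.

```python
def local_path_from_file_url(url):
    """Map forms 1), 2) and 3) of a local file: URL to a Windows path."""
    if not url.lower().startswith("file:"):
        raise ValueError("not a file: URL: %r" % (url,))
    rest = url[len("file:"):]
    if rest.startswith("///"):
        # Forms 1) and 2): empty host, so drop the three slashes.
        rest = rest[3:]
    elif rest.startswith("//"):
        # A host name is present; this sketch only handles local files.
        raise ValueError("URL names a host: %r" % (url,))
    # Form 3) has nothing to strip.  Forms 2) and 3) already use
    # backslashes, so the replacement only changes form 1).
    return rest.replace("/", "\\")


for candidate in (
    "file:///d:/temp/python/thefile.xml",
    r"file:///d:\temp\python\thefile.xml",
    r"file:d:\temp\python\thefile.xml",
):
    print(local_path_from_file_url(candidate))
```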

Let's agree on this and get it working right!