[XML-SIG] file urls in urllib
Martin v. Loewis
martin@loewis.home.cs.tu-berlin.de
Wed, 7 Mar 2001 23:15:17 +0100
> rfc 1738 states:
>
> A file URL takes the form:
> file://<host>/<path>
> where <host> is the fully qualified domain name of the system on
> which the <path> is accessible, and <path> is a hierarchical
> directory path of the form <directory>/<directory>/.../<name>.
> [...]
> As a special case, <host> can be the string "localhost" or the empty
> string; this is interpreted as `the machine from which the URL is
> being interpreted'.
>
>
> So this would mean that if localhost is implied, all file urls should have (at least) three slashes.
> Assuming that the rfc means that the "/" is purely syntactic, what you should expect to work is:
> file:////etc/passwd (4 slashes, because of the leading "/")
> file:///c:\autoexec.bat
> file:///\\drv\autoexec.bat
> file://///drv/autoexec.bat (5 slashes, since forward slashes work on win32 too)
That clearly is not the intention of the RFC. It essentially says
that <path> is a slash-separated list of directories forming a
hierarchy; i.e., the intention is that it does not start with a
slash. So /etc/passwd clearly is
file:///etc/passwd
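Under that reading, the split into <host> and <path> can be sketched in
Python as below. The helper name split_file_url is hypothetical, and the
sketch assumes the slash after <host> is a pure separator that belongs to
neither part; it does no unescaping or validation.

```python
def split_file_url(url):
    """Split a file URL of the RFC 1738 form file://<host>/<path>.

    Hypothetical helper: the slash after <host> is treated as a
    separator, so it is part of neither <host> nor <path>.
    """
    prefix = "file://"
    if not url.startswith(prefix):
        raise ValueError("not a file URL: %r" % url)
    # Everything up to the first slash is <host> (possibly empty).
    host, sep, path = url[len(prefix):].partition("/")
    if not sep:
        raise ValueError("missing '/' after <host>: %r" % url)
    # <path> is a slash-separated list of directories ending in a name.
    return host, path.split("/")


print(split_file_url("file:///etc/passwd"))
print(split_file_url("file://vms.host.edu/disk$user/my/notes/note12345.txt"))
```

With an empty <host>, file:///etc/passwd yields the components etc and
passwd, whereas the four-slash form would yield an extra empty first
component.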
It then gives the example of a VMS file name
DISK$USER:[MY.NOTES]NOTE123456.TXT, saying that it might become (*)
file://vms.host.edu/disk$user/my/notes/note12345.txt. So the intention
clearly is that the hierarchy is represented using /. Apparently,
translation between a file name and a <path> is meant to be performed
in a system-dependent manner, but many systems failed to define a
procedure for doing so. Considering that one needs to distinguish the
drv case, the logical form would be
file://C:/autoexec.bat
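For comparison, the forms discussed here can be run through a generic URL
splitter. The sketch below assumes the urlsplit function from urllib.parse
in current Python 3. It follows the generic URL syntax, not the
file-specific wording of RFC 1738, so it keeps the slash after the host as
the first character of the path, and it reports whatever stands between
"//" and the next slash as the network location.

```python
from urllib.parse import urlsplit

# The three forms under discussion: three slashes, four slashes,
# and the drive letter placed directly after "//".
for url in ("file:///etc/passwd",
            "file:////etc/passwd",
            "file://C:/autoexec.bat"):
    parts = urlsplit(url)
    # netloc is the text between "//" and the next "/"; path is the rest.
    print(url, "->", repr(parts.netloc), repr(parts.path))
```

In the last form, "C:" lands in the network-location field and the path is
/autoexec.bat, so code consuming such URLs would have to recognize a drive
letter in that position.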
Regards,
Martin
(*) The 'might' probably refers to the fact that the URL introduces
vms.host.edu, which was not mentioned before.