Chris Angelico writes:
1) Path("./http://www.example.com")
2) Path("http:/www.example.com")
3) Path("file://http://www.example.com")
For scripts that need 100% dependable parsing, the third option will be guaranteed to work.
No, the third should crap out with a syntax error on the colon. See [1], which does not allow a port spec at all, and RFC 3986, which doesn't allow a colon in the host name ([1] references RFC 3986 for the syntax of the host name). Specifying the host in a "file:" URI gives locally-defined behavior (e.g., a Windows share), but in the most recent attempt to deal with exactly these issues [1], it is legal. The correct syntaxes per [1] and RFC 3986 are

4) Path("file:///http://www.example.com")
5) Path("file://localhost/http://www.example.com")
6) Path("file://[127.0.0.1]/http://www.example.com")
7) Path("file://[::1]/http://www.example.com")

As far as I can tell the colon in "http:" is RFC 3986-legal, since it has no URI syntactic meaning in the path component.

This isn't as easy as it looks (which is why people are trying to delegate it to something they think of as "simple").

There's an additional problem with trying to cram URIs and Path together, which is that in a file system, "/a/b/symlink/../c" may refer to any file system object, depending on symlink's target, which is unknown, while as a URI path it refers to whatever "/a/b/c" refers to, and nothing else. (This is the semantic glitch I was thinking of earlier.) This means that URIs can be canonicalized syntactically, while doing so with file system paths is risky.

Footnotes:
[1] https://tools.ietf.org/html/draft-ietf-appsawg-file-scheme-06
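
To make the two points above concrete, here is a rough sketch using only the standard library. The helper names (split_file_uri, symlink_glitch) and the directory layout are made up for illustration. Note that urllib.parse.urlsplit is a lenient splitter, not an RFC 3986 validator: it will not raise on option 3, it just shows where the authority ends up. The second helper assumes a platform where os.symlink is available to the current user.

```python
import os
import tempfile
from urllib.parse import urlsplit


def split_file_uri(uri):
    """Return (authority, path) as urlsplit sees them. No validation is done."""
    parts = urlsplit(uri)
    return parts.netloc, parts.path


def symlink_glitch():
    """Build a/b/symlink -> elsewhere/target in a scratch directory and
    compare syntactic normalization with real resolution of
    a/b/symlink/../c. Both results are returned relative to the scratch root.
    """
    with tempfile.TemporaryDirectory() as root:
        # Resolve the scratch root itself so only our own symlink matters.
        root = os.path.realpath(root)
        os.makedirs(os.path.join(root, "a", "b"))
        target = os.path.join(root, "elsewhere", "target")
        os.makedirs(target)
        os.symlink(target, os.path.join(root, "a", "b", "symlink"))

        path = os.path.join(root, "a", "b", "symlink", "..", "c")
        syntactic = os.path.normpath(path)  # collapses ".." textually
        resolved = os.path.realpath(path)   # follows the symlink first
        return (os.path.relpath(syntactic, root),
                os.path.relpath(resolved, root))


if __name__ == "__main__":
    for uri in ("file://http://www.example.com",       # option 3
                "file:///http://www.example.com",      # option 4
                "file://localhost/http://www.example.com"):  # option 5
        print(uri, "->", split_file_uri(uri))
    print(symlink_glitch())
```

In option 3 the splitter takes "http:" as the authority, which is exactly the colon problem; in options 4 and 5 the whole "/http://www.example.com" lands in the path. In the second helper, textual normalization yields a/b/c while resolving the symlink yields elsewhere/c, so the two disagree as soon as the symlink points outside its own directory.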