On Wed, Mar 30, 2016 at 3:06 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
> Chris Angelico writes:
>> 3) Path("file://http://www.example.com")
>> For scripts that need 100% dependable parsing, the third option will be guaranteed to work.
> No, the third should crap out with a syntax error on the colon...
> The correct syntaxes per [1] and RFC 3986 are
> 4) Path("file:///http://www.example.com")
Oops, my bad - I forgot about the third slash. It comes to the same thing, though; for most paths, you can deduce that a prefix "http://" implies that it's not a file path, and for the rare case when you do mean that, you can explicitly adorn it. (This is a good reminder that code and specs should not be created by one person firing off an email. This needs someone - preferably multiple people - checking the appropriate specs. Get it right, folks, don't trust me!)
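To illustrate why the third slash matters, here is a rough sketch using the stdlib's urllib.parse.urlsplit. This is only my illustration, not a proposal for how Path should parse anything; the describe() helper is a made-up name, and urlsplit is a lenient splitter rather than a strict RFC 3986 validator, so it does not reject the two-slash form.

```python
from urllib.parse import urlsplit


def describe(uri):
    """Return (scheme, authority, path) as urlsplit divides the URI."""
    parts = urlsplit(uri)
    return parts.scheme, parts.netloc, parts.path


# Two slashes: the text after "file://" lands in the authority slot,
# so "http:" is treated as a host[:port], not as part of the path.
two_slashes = describe("file://http://www.example.com")

# Three slashes: empty authority, and the whole remainder is the path.
three_slashes = describe("file:///http://www.example.com")

print(two_slashes)
print(three_slashes)
```

With three slashes the authority is empty and the path is "/http://www.example.com", which is the "explicitly adorned" file path meant above.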
> There's an additional problem with trying to cram URIs and Path together, which is that in a file system, "/a/b/symlink/../c" may refer to any file system object, depending on the symlink's target, which cannot be known without consulting the file system, while as a URI path it refers to whatever "/a/b/c" refers to, and nothing else. (This is the semantic glitch I was thinking of earlier.)
> This means that URIs can be canonicalized syntactically, while doing so with file system paths is risky.
Or there are two operations: canonicalizing by components, and rendering a "true path", which requires file system access (stat every component).

ChrisA
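A minimal sketch of those two operations, assuming a POSIX system where creating symlinks is permitted. The directory layout and the compare() helper are invented for the demonstration; os.path.normpath stands in for "canonicalizing by components" and os.path.realpath for the "true path".

```python
import os
import tempfile


def compare(root):
    """Build root/a/b/symlink -> root/elsewhere/target, then resolve
    'a/b/symlink/../c' both lexically and via the file system."""
    os.makedirs(os.path.join(root, "a", "b", "c"))
    os.makedirs(os.path.join(root, "elsewhere", "target"))
    os.makedirs(os.path.join(root, "elsewhere", "c"))
    os.symlink(
        os.path.join(root, "elsewhere", "target"),
        os.path.join(root, "a", "b", "symlink"),
    )

    path = os.path.join(root, "a", "b", "symlink", "..", "c")
    lexical = os.path.normpath(path)  # pure string work, no file system access
    real = os.path.realpath(path)     # follows the symlink before applying ".."
    return lexical, real


with tempfile.TemporaryDirectory() as root:
    lexical, real = compare(root)
    print(lexical)  # ends in a/b/c
    print(real)     # ends in elsewhere/c
```

The lexical result is what a URI-style reading of the path gives; the real path lands in a different directory because ".." is applied to the symlink's target rather than to the symlink's parent.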