[Python-ideas] URLs/URIs + pathlib.Path + literal syntax = ?

Chris Angelico rosuav at gmail.com
Wed Mar 30 00:23:32 EDT 2016


On Wed, Mar 30, 2016 at 3:06 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Chris Angelico writes:
>
>  > 3) Path("file://http://www.example.com")
>  >
>  > For scripts that need 100% dependable parsing, the third option will
>  > be guaranteed to work.
>
> No, the third should crap out with a syntax error on the colon...
>
> The correct syntaxes per [1] and RFC 3986 are
>
> 4)  Path("file:///http://www.example.com")

Oops, my bad - I forgot about the third slash. It comes to the same
thing, though: for most inputs, you can deduce that a leading "http://"
means it's not a file path, and for the rare case when you really do
mean a file whose name starts that way, you can explicitly adorn it
with "file:///".
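
To make the deduction concrete, here is a minimal sketch of that rule.
The function name interpret() and the REMOTE_SCHEMES set are
hypothetical, invented for illustration; nothing like them is proposed
for pathlib in this thread. It only uses urllib.parse.urlsplit from the
standard library and does no percent-decoding.

```python
from urllib.parse import urlsplit

# Hypothetical set of schemes treated as "clearly not a local file".
REMOTE_SCHEMES = {"http", "https", "ftp"}


def interpret(text):
    """Classify a string as ("url", text) or ("file", path).

    Illustration only: a real implementation would also need to
    percent-decode file: URIs and decide what to do with a host part.
    """
    parts = urlsplit(text)
    if parts.scheme == "file":
        # Explicit adornment: the path component is a file system path,
        # even if it happens to look like a URL.
        return ("file", parts.path)
    if parts.scheme in REMOTE_SCHEMES and parts.netloc:
        # A recognised scheme followed by //host is taken as a URL.
        return ("url", text)
    # Everything else is assumed to be a plain file system path.
    return ("file", text)
```

With this sketch, interpret("http://www.example.com") is classified as
a URL, while interpret("file:///http://www.example.com") yields a file
path beginning with "/http:".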

(This is a good reminder that code and specs should not be created by
one person firing off an email. This needs someone - preferably
multiple people - checking the appropriate specs. Get it right, folks,
don't trust me!)

> There's an additional problem with trying to cram URIs and Path
> together, which is that in a file system, "/a/b/symlink/../c" may
> refer to any file system object depending on symlink's target which is
> unknown, while as an URI path it refers to whatever "/a/b/c" refers
> to, and nothing else.  (This is the semantic glitch I was thinking of
> earlier.)
>
> This means that URIs can be canonicalized syntactically, while doing
> so with file system paths is risky.

Or there are two operations: canonicalizing by components, which is
purely textual, and rendering a "true path", which requires file system
access (stat every component).
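
A rough sketch of the two operations, using only existing standard
library calls (posixpath.normpath for the textual one,
os.path.realpath for the one that consults the disk). The helper names
and the directory layout in demo() are made up for illustration, and
the demo assumes a POSIX system where creating symlinks is permitted.

```python
import os
import posixpath
import tempfile


def canonicalize_by_components(path):
    """Collapse '.' and '..' textually, without touching the disk."""
    return posixpath.normpath(path)


def true_path(path):
    """Resolve symlinks by asking the file system about each component."""
    return os.path.realpath(path)


def demo():
    """Build a/b/symlink -> elsewhere/target and compare both results."""
    with tempfile.TemporaryDirectory() as root:
        # Resolve the temp dir itself so results are comparable.
        root = os.path.realpath(root)
        os.makedirs(os.path.join(root, "a", "b"))
        os.makedirs(os.path.join(root, "elsewhere", "target"))
        os.symlink(
            os.path.join(root, "elsewhere", "target"),
            os.path.join(root, "a", "b", "symlink"),
        )
        tricky = os.path.join(root, "a", "b", "symlink", "..", "c")
        lexical = canonicalize_by_components(tricky)
        real = true_path(tricky)
        # Report both relative to root for readability.
        return os.path.relpath(lexical, root), os.path.relpath(real, root)
```

Under those assumptions, demo() returns "a/b/c" for the textual form
and "elsewhere/c" for the true path, which is the divergence Stephen
describes for "/a/b/symlink/../c".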

ChrisA

