An option to force the path separator for the "os.path.join()" method.
I'd like to have an option to force the path separator for the "os.path.join()" method. E.g. if I run the script on Windows, but I generate, say, an URL, I'd find it convenient to use the same method, but with an explicit flag to "join" with the forward slash (because URLs use it). Currently I simply use an f-string to combine a path, like e.g.:
path = f"{root}/{dir}"
but it will insert an extra "/" if "root" already has "/" on the end. The "join" method will not, which is a pro for the method vs "manual" string construction.
I know there is the "pathlib" module with all conversion methods, but it's overkill for many tasks. So I'd rather like to have an option to write for example:
path = os.path.join (root, dir, sep = "posix")
So that it joins with the forward slash even if run on Windows. Also it seems that e.g. "os.path.split()" already supports both separators, so I think this addition won't bring inconsistency into the module and its usage.
On 2021-01-06 at 07:07:30 +0300, Mikhail V <mikhailwas@gmail.com> wrote:
I'd like to have an option to force the path separator for the "os.path.join()" method. E.g. if I run the script on Windows, but I generate, say, an URL, I'd find it convenient to use the same method, but with an explicit flag to "join" with the forward slash (because URLs use it). Currently I simply use an f-string to combine a path, like e.g.:
path = f"{root}/{dir}"
but it will insert an extra "/" if "root" already has "/" on the end. The "join" method will not, which is a pro for the method vs "manual" string construction.
I know there is the "pathlib" module with all conversion methods, but it's overkill for many tasks. So I'd rather like to have an option to write for example:
path = os.path.join (root, dir, sep = "posix")
So that it joins with the forward slash even if run on Windows. Also it seems that e.g. "os.path.split()" already supports both separators, so I think this addition won't bring inconsistency into the module and its usage.
I'm not sure whether a method in the os module should understand URLs (are URLs part of the OS, or part of something else?), but in any case, in your example, it would make more sense to have a "URL" separator and not use "posix" as a simple synonym for forward slash. IMO, using "/" itself is clearer.
That said, AIUI, there's nothing stopping a web server from using whatever separators it wants. Everything after the domain name is up to the web server to interpret; it just happens that most early web servers ran on Unix/Posix/Linux boxes and mapped URLs fairly directly to parts of the file system, and "/" was more natural than anything else.
On 05Jan2021 22:41, Dan Sommers <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
That said, AIUI, there's nothing stopping a web server from using whatever separators it wants. Everything after the domain name is up to the web server to interpret; it just happens that most early web servers ran on Unix/Posix/Linux boxes and mapped URLs fairly directly to parts of the file system, and "/" was more natural than anything else.
Well, not really. On the server side something has to map the URL local part to a file path or some handler, and that could use whatever separators it likes. However, HTML relative URLs rely on '/' as a separator to resolve correctly. You can't change that without breaking the text web because relative URLs are resolved by browsers, not servers.
Cheers,
Cameron Simpson <cs@cskk.id.au>
On 2021-01-05 20:07, Mikhail V wrote:
I'd like to have an option to force the path separator for the "os.path.join()" method.
The urljoin method was made for the URL use case:
>>> from urllib.parse import urljoin
>>> urljoin('http://www.cwi.nl/%7Eguido/Python.html', 'FAQ.html')
'http://www.cwi.nl/%7Eguido/FAQ.html'
From: https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urljoin
-Mike
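One detail worth knowing before reaching for urljoin: it resolves the second argument the way a browser resolves a relative link, so the last path segment of the base is replaced unless the base ends in "/". A small sketch, with hypothetical example.com URLs chosen only for illustration:

```python
from urllib.parse import urljoin

# Hypothetical base URLs, used only for illustration.
base_file = "http://example.com/docs/index.html"
base_dir = "http://example.com/docs/"
base_no_slash = "http://example.com/docs"

# The last segment of the base path is dropped unless the base ends in "/".
print(urljoin(base_file, "faq.html"))      # http://example.com/docs/faq.html
print(urljoin(base_dir, "faq.html"))       # http://example.com/docs/faq.html
print(urljoin(base_no_slash, "faq.html"))  # http://example.com/faq.html
```

So urljoin is not a drop-in replacement for os.path.join semantics: a base without a trailing slash is treated as a "file", not a "directory".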
On 06Jan2021 07:07, Mikhail V <mikhailwas@gmail.com> wrote:
I'd like to have an option to force the path separator for the "os.path.join()" method. E.g. if I run the script on Windows, but I generate, say, an URL, I'd find it convenient to use the same method, but with an explicit flag to "join" with the forward slash (because URLs use it). Currently I simply use an f-string to combine a path, like e.g.:
path = f"{root}/{dir}"
but it will insert an extra "/" if "root" already has "/" on the end. The "join" method will not, which is a pro for the method vs "manual" string construction.
There's also: '/'.join(url-path-things-here...)
I know there is the "pathlib" module with all conversion methods, but it's overkill for many tasks.
I suspect given pathlib's existence, that raises the bar for wanting to extend os.path.join since there are so many other ways to do that.
Also, os.path.join is supposed to be for the local OS, like most other os.* things. Making it _not_ act like the local OS feels antithetical to me.
I was going to suggest:
https://docs.python.org/3/library/urllib.parse.html#module-urllib.parse
since your use case is URLs, but it doesn't really parse the "path" part.
Cheers,
Cameron Simpson <cs@cskk.id.au>
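To illustrate the last point: urlsplit hands back the path as one opaque string, so any segment-level work is left to the caller. One possible way to combine it with pathlib is sketched below. The URL is hypothetical, and treating a URL path as a PurePosixPath is an assumption that ignores percent-encoding and drops trailing slashes.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit, urlunsplit

url = "http://example.com/a/b/?q=1"  # hypothetical URL, for illustration only

parts = urlsplit(url)
print(parts.path)  # /a/b/   (a plain string, not split into segments)

# PurePosixPath always joins with "/", whatever OS the script runs on.
new_path = PurePosixPath(parts.path) / "c.html"
print(str(new_path))  # /a/b/c.html

# Put the modified path back into the URL, keeping the query string.
new_url = urlunsplit(parts._replace(path=str(new_path)))
print(new_url)  # http://example.com/a/b/c.html?q=1
```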
On Wed, Jan 06, 2021 at 07:07:30AM +0300, Mikhail V wrote:
I'd like to have an option to force the path separator for the "os.path.join()" method. E.g. if I run the script on Windows, but I generate, say, an URL, I'd find it convenient to use the same method, but with an explicit flag to "join" with the forward slash (because URLs use it).
"I don't care about correctness, I want to add arbitrary functionality to unrelated functions because it's convenient." *wink* The whole point of using os.path.join is that you don't care what the separator is, so long as it is correct *for the OS at runtime*. But for URLs, the separator never depends on the runtime OS. It is either fixed, or at worst will depend on the protocol. URLs are also a lot more complicated than file paths, so you should be using urllib to assemble the parts: https://docs.python.org/3/library/urllib.parse.html If you can't be bothered (and let's be honest, we've all been there) there's nothing wrong with assembling URL path components with pure string operations. All you need is a one-liner: def join(*parts, sep='/'): return sep.join([part.strip(sep) for part in parts]) -- Steve
Steven D'Aprano writes:
URLs are also a lot more complicated than file paths,
It may be just me, but I would say the opposite: URLs are simpler because they follow unambiguous rules. There is no "realpath" for URLs, they're WYSIWYG. "." and ".." have unambiguous semantics in URLs[1], which are implementable as string transformations.
so you should be using urllib to assemble the parts:
Following the rules is not entirely trivial, though, so +1000 here.
Steve
Footnotes:
[1] Up to server implementation, but that's explicitly out of scope when you're talking about the URL itself.
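The "." and ".." rules mentioned above can be seen in action with urljoin, which resolves dot segments purely on the URL text without touching any file system. A minimal sketch, using a hypothetical base URL:

```python
from urllib.parse import urljoin

base = "http://example.com/a/b/c"  # hypothetical base URL, for illustration

# "." refers to the "directory" of the base (everything up to the last "/").
same_dir = urljoin(base, "./d")
# Each ".." removes one more segment from that directory.
one_up = urljoin(base, "../d")
two_up = urljoin(base, "../../d")

print(same_dir)  # http://example.com/a/b/d
print(one_up)    # http://example.com/a/d
print(two_up)    # http://example.com/d
```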
On 06.01.21 06:07, Mikhail V wrote:
I know there is the "pathlib" module with all conversion methods, but it's overkill for many tasks. So I'd rather like to have an option to write for example:
path = os.path.join (root, dir, sep = "posix")
So that it joins with the forward slash even if run on Windows. Also it seems that e.g. "os.path.split()" already supports both separators, so I think this addition won't bring inconsistency into the module and its usage.
Use posixpath.
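For readers who have not used it: posixpath is the standard-library module that backs os.path on POSIX systems, and it can be imported directly on any platform, Windows included. A short sketch with hypothetical path pieces:

```python
import ntpath
import posixpath

root, name = "static/", "img"  # hypothetical path pieces, for illustration

# Joins with "/" on every platform and does not double a trailing "/".
print(posixpath.join(root, name))      # static/img
print(posixpath.join("static", name))  # static/img

# As with os.path.join, an absolute later component discards what precedes it.
print(posixpath.join("static", "/img"))  # /img

# ntpath is the Windows counterpart and is likewise importable everywhere.
print(ntpath.join("static", "img"))  # static\img
```

This gives the requested behaviour (forward slashes regardless of the runtime OS) without adding a "sep" argument to os.path.join.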
participants (7)
- 2QdxY4RzWzUUiLuE@potatochowder.com
- Cameron Simpson
- Mike Miller
- Mikhail V
- Serhiy Storchaka
- Stephen J. Turnbull
- Steven D'Aprano