Fwd: I was just thinking that os.path could use some love...
Thoughts on os.path? What happened to the idea of a new path object?
--Guido
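---------- Forwarded message ----------
From: Talin
Date: Wed, Jan 30, 2013 at 12:34 PM
Subject: Re: I was just thinking that os.path could use some love...
To: Guido van Rossum
On Wed, Jan 30, 2013 at 11:33 AM, Guido van Rossum wrote:
Hmm... Mind if I just forward to python-dev? IIRC there's been some discussion about a newfangled path object, but I lost track of where the discussion went. (Or you can post there yourself. :-)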
Sure - forward away. Is this the same as the path object idea that was floated 5 years ago? I've yet to see any set of convenience methods for paths that are so compelling as to be worth all of the time and energy needed to update all of the various APIs which now expect paths to be passed in as strings.
On Wed, Jan 30, 2013 at 11:13 AM, Talin wrote:
I just realized that os.path hasn't changed in a long time. Here are a couple of ideas for additions:
os.path.splitall(path) - splits all path components into a tuple - so for example, 'this/is/a/path' turns into ('this', 'is', 'a', 'path'). If there's a trailing slash, the last item in the tuple will be a zero-length string. The main reason for having this in os.path is so that we can remain separator-character-agnostic.
Would it also return a leading empty string if the path starts with /? What about Windows C: or //host/ prefixes???
I would say that it should only split on directory separators, not any other kind of delimiter, so that each component is a valid filesystem identifier. Further, the operation should be reversible by calling os.path.join(*dirnames). So it's a little more complex than just string.split('/').
Part of the reason for wanting the splitall function is to implement the common prefix function - you take all the paths, bust them into tuples, look for the longest tuple prefix, and then join the result back into a path. This means that os.path.join(*os.path.splitall(path)) must reproduce the original path exactly.
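A minimal sketch of the round-trip requirement described above, assuming POSIX-style paths only. The function name `splitall` comes from the proposal, but this body is a hypothetical illustration and not an os.path API. It keeps a leading "/" as its own component so that posixpath.join can rebuild the path, and it does not try to handle repeated separators, Windows drives, or //host/ prefixes.

```python
import posixpath


def splitall(path):
    """Hypothetical splitall for POSIX paths (not part of os.path).

    A leading "/" is kept as a separate root component; a trailing
    slash yields a final zero-length string, as in the proposal.
    """
    root = "/" if path.startswith("/") else ""
    rest = path[len(root):]
    parts = tuple(rest.split("/")) if rest else ()
    return ((root,) if root else ()) + parts


print(splitall("this/is/a/path"))  # ('this', 'is', 'a', 'path')
print(splitall("this/is/"))        # ('this', 'is', '')
print(splitall("/usr/lib"))        # ('/', 'usr', 'lib')

# Round trip: joining the components reproduces the original path.
for p in ["this/is/a/path", "this/is/", "/usr/lib", "/usr/", "/"]:
    assert posixpath.join(*splitall(p)) == p
```

Under these assumptions the round trip fails for inputs such as 'a//b', which is one reason the real design is more involved than a plain string split.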
An alternative would be to add an optional 'maxsplits' parameter to os.path.split. Default would be 1; 0 = unlimited. Main disadvantage would be that no one would ever use values 2, 3, etc., but it would be more compatible with other split APIs.
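One way the 'maxsplits' idea could look, sketched on top of the existing posixpath.split. The name `split_path` and the exact return shape (head first, then the split-off tails) are assumptions made for illustration; os.path.split itself takes no such parameter.

```python
import posixpath


def split_path(path, maxsplit=1):
    """Hypothetical os.path.split with a maxsplit argument.

    Splits components off the right-hand end. maxsplit=1 mirrors
    posixpath.split for ordinary paths; maxsplit=0 means unlimited.
    """
    head, tails = path, []
    while maxsplit == 0 or len(tails) < maxsplit:
        new_head, tail = posixpath.split(head)
        if new_head == head:  # "/" or "" cannot be split any further
            break
        tails.append(tail)
        head = new_head
    return (head,) + tuple(reversed(tails))


print(split_path("this/is/a/path"))     # ('this/is/a', 'path')
print(split_path("this/is/a/path", 2))  # ('this/is', 'a', 'path')
print(split_path("this/is/a/path", 0))  # ('', 'this', 'is', 'a', 'path')
print(split_path("/usr/lib", 0))        # ('/', 'usr', 'lib')
```

In this sketch a fully split relative path keeps an empty head, mirroring how posixpath.split('path') returns ('', 'path').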
A version of os.path.commonprefix that always produces valid paths (and works by path component, not character-by-character).
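To illustrate the difference, here is a small sketch comparing the existing character-wise commonprefix with a component-wise version. The helper `common_path_prefix` is a hypothetical name and handles POSIX-style paths only; it follows the steps described in the thread (split into components, take the longest shared run, join back).

```python
import posixpath


def common_path_prefix(paths):
    """Hypothetical component-wise common prefix for POSIX paths."""
    split_paths = [p.split("/") for p in paths]
    common = []
    for components in zip(*split_paths):
        if len(set(components)) != 1:  # first position where paths differ
            break
        common.append(components[0])
    prefix = "/".join(common)
    # A lone empty component means only the root "/" is shared.
    return prefix or ("/" if common else "")


paths = ["/usr/lib", "/usr/local"]
print(posixpath.commonprefix(paths))  # '/usr/l', not a real directory
print(common_path_prefix(paths))      # '/usr'
```

Later Python releases (3.5 and newer) include os.path.commonpath, which also compares paths component by component.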
I can probably think of others if I look at some other path manipulation APIs. llvm::sys::path has a bunch if I recall...
-- Talin
--Guido van Rossum (python.org/~guido)
Thoughts on os.path? What happened to the idea of a new path object?
I don't know if it's related, but there are two new interesting projects: pathlib and walkdir.
http://pypi.python.org/pypi/pathlib
https://pathlib.readthedocs.org/en/latest/
http://pypi.python.org/pypi/walkdir
http://walkdir.readthedocs.org/en/latest/
http://bugs.python.org/issue13229
Victor
http://bugs.python.org/issue11344 evolved into a patch for 'splitpath',
similar to splitall. Antoine's pathlib (PEP 428) is also mentioned
at the end, which is probably what Guido is thinking of.
--David
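For comparison with the splitall idea, a short sketch of how pathlib exposes path components. It assumes the pathlib module as it later shipped in the standard library; the PyPI release discussed in this thread may have behaved differently. The example paths are made up.

```python
from pathlib import PurePosixPath, PureWindowsPath

posix = PurePosixPath("/usr/lib/python")
print(posix.parts)  # ('/', 'usr', 'lib', 'python')

# The Windows drive and root are kept together as the first component.
win = PureWindowsPath("C:/Users/talin/notes.txt")
print(win.parts)    # ('C:\\', 'Users', 'talin', 'notes.txt')

# For a UNC path the //host/share prefix is treated as the drive.
unc = PureWindowsPath("//host/share/dir")
print(unc.drive, unc.parts[0] == unc.anchor)

# Rebuilding a path from its parts gives an equal path object.
assert PurePosixPath(*posix.parts) == posix
```

Unlike the proposal above, pathlib drops a trailing slash, so PurePosixPath('a/b/').parts has no final empty string.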
On Wed, 30 Jan 2013 13:26:08 -0800, Guido van Rossum wrote:
Thoughts on os.path? What happened to the idea of a new path object?
--Guido
---------- Forwarded message ---------- From: Talin
Date: Wed, Jan 30, 2013 at 12:34 PM
Subject: Re: I was just thinking that os.path could use some love...
To: Guido van Rossum
Thoughts on os.path? What happened to the idea of a new path object?
You old pot stirrer! <wink>
I wonder about this from time-to-time as well, so was just interested enough to wander over to PyPI and see what I could dig up. There are a couple path module/package implementations on PyPI. Nothing jumped out at me as the One True Way. I think you are probably thinking of Jason Orendorff's path.py module:
http://pypi.python.org/pypi/path.py
There are also two PEPs:
http://www.python.org/dev/peps/pep-0355/
http://www.python.org/dev/peps/pep-0428/
The former was rejected. The latter still seems to have legs, and an implementation:
http://pypi.python.org/pypi/pathlib/0.7
Skip
On 30Jan2013 13:26, Guido van Rossum
On Jan 30, 2013, at 2:01 PM, Cameron Simpson wrote:
Speaking for myself, I've been having some usefulness with making "URL" objects that are subclasses of str. That lets me pass them to all the things that already expect strs, while still having convenience methods.
str subclasses are problematic. One issue is that it will still allow for invalid manipulations. If you prohibit them, then manipulations that take multiple steps will be super inconvenient. If you allow them, then you end up with half-formed values that will error out sometimes, or generate corrupt data that shouldn't be allowed to exist (trivial example: a NUL character in the middle of a file path). Also, automatic coercion will sometimes surprise you and give you a value which is of the wrong type if you forget a method or two. Also, URLs and file paths have a common interface, but are not totally the same.
Basically, everybody wants to say "composition is better than inheritance, except for *this* case, where inheritance seems super convenient". That's how it gets you! Inheritance _is_ super convenient, but it's also super confusing. Resist the temptation :-).
Once again (I see my previous reply went straight to the sender, not the whole list) I recommend https://launchpad.net/filepath as an abstraction that has worked very well in a wide variety of situations.
-glyph
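A small sketch of the coercion problem described above, contrasted with a composition-based wrapper. Both classes (`URL` and `WrappedPath`) are hypothetical and written only for this illustration; neither reflects the API of the filepath project linked above.

```python
class URL(str):
    """Hypothetical str subclass with one convenience method."""

    def child(self, name):
        return URL(self.rstrip("/") + "/" + name)


url = URL("http://example.com/a")
print(type(url.child("b")).__name__)  # URL
print(type(url + "/b").__name__)      # str: inherited operators lose the type
print(type(url.upper()).__name__)     # str: so do inherited methods


class WrappedPath:
    """Hypothetical wrapper that holds a str instead of inheriting from it."""

    def __init__(self, path):
        if "\0" in path:
            raise ValueError("NUL character not allowed in a path")
        self._path = path

    def child(self, name):
        return WrappedPath(self._path.rstrip("/") + "/" + name)

    def __str__(self):
        return self._path


print(str(WrappedPath("/tmp").child("x")))  # /tmp/x
```

With the wrapper, an expression like WrappedPath('/tmp') + 'x' raises TypeError rather than silently producing a plain str, and invalid values are rejected at construction time.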
On Wed, 30 Jan 2013 13:26:08 -0800, Guido van Rossum wrote:
Thoughts on os.path? What happened to the idea of a new path object?
I plan to launch another round of discussions following the changes in PEP 428. Regards Antoine.
participants (7)
- Antoine Pitrou
- Cameron Simpson
- Glyph
- Guido van Rossum
- R. David Murray
- Skip Montanaro
- Victor Stinner