When should pathlib stop being provisional?

After a rather extensive discussion on python-ideas about pathlib.PurePath not inheriting from str, another point that came up was that the use of pathlib has been rather light. Unfortunately even the stdlib doesn't really use pathlib because it's currently marked as provisional (or at least that's why I haven't tried to use it where possible in importlib). Do we have a plan of what is required to remove the provisional label from pathlib?

I think the provisional status can be safely lifted now. Even though pathlib hasn't seen that much use, there have been enough reports and discussion since its acceptance that I think the API has proven it's sane for general use. (As for importlib, pathlib might have too many dependencies for sane bootstrapping.) Regards, Antoine. On 06/04/2016 00:41, Brett Cannon wrote:
After a rather extensive discussion on python-ideas about pathlib.PurePath not inheriting from str, another point that came up was that the use of pathlib has been rather light. Unfortunately even the stdlib doesn't really use pathlib because it's currently marked as provisional (or at least that's why I haven't tried to use it where possible in importlib).
Do we have a plan of what is required to remove the provisional label from pathlib?

It's been provisional since 3.4. I think if it is still there in 3.6.0 it should be considered no longer provisional. But this may indeed be a test case for the ultimate fate of provisional modules -- should we remove it? I have to admit I got tired of the discussions and muted them all. Personally I am not worried about the light use (I always expected it would take a long time to get adoption) but I am worried about the hostility towards the module. My last/only comment in the discussion was about there possibly being a dichotomy between people who use Python for scripting and those who use it to write more substantial programs (I'm trying not to judge one group more important than another -- I'm just observing there seem to be these two groups). But I didn't stick around long enough to watch for responses to this idea. Would making it inherit from str cause most hostility to disappear? I'm sure there was a discussion about this when PEP 428 was originally proposed, and I recall I was strongly in the camp of "it should not inherit from str", but unfortunately the PEP has no mention of this discussion or even the stated reason. --Guido On Tue, Apr 5, 2016 at 3:41 PM, Brett Cannon <brett@python.org> wrote:
After a rather extensive discussion on python-ideas about pathlib.PurePath not inheriting from str, another point that came up was that the use of pathlib has been rather light. Unfortunately even the stdlib doesn't really use pathlib because it's currently marked as provisional (or at least that's why I haven't tried to use it where possible in importlib).
Do we have a plan of what is required to remove the provisional label from pathlib?
-- --Guido van Rossum (python.org/~guido)

On 4/5/2016 18:55, Guido van Rossum wrote:
My last/only comment in the discussion was about there possibly being a dichotomy between people who use Python for scripting and those who use it to write more substantial programs (I'm trying not to judge one group more important than another -- I'm just observing there seem to be these two groups). But I didn't stick around long enough to watch for responses to this idea.
This was all but ignored.
The opinions mentioned in the thread, without throwing my opinion behind any of them, were:
* pathlib should be improved (specifically by making it inherit from str)
* the stdlib should be made to deal with pathlib without changing pathlib
* pathlib is redundant to third party modules which work better
* the continued existence of pathlib was briefly discussed
You can insert the never-ending arguments for and against each of those points in your head - none of them were particularly convincing (in that I don't think anyone changed their position). The split between utility scripting and application development was not really discussed.

On Wed, Apr 6, 2016 at 9:08 AM, Alexander Walters <tritium-list@sdamon.com> wrote:
* pathlib should be improved (specifically by making it inherit from str)
I'd like to see this specific change settled on in the PEP, actually. There are some arguments on both sides, and some hybrid solutions being proposed, and it looks to be an important enough issue to people for there to be an answer somewhere. It seems to come down to a sloppiness vs strictness concern, I think, but I'm not sure. ChrisA

On Tue, Apr 5, 2016 at 4:13 PM, Chris Angelico <rosuav@gmail.com> wrote:
On Wed, Apr 6, 2016 at 9:08 AM, Alexander Walters <tritium-list@sdamon.com> wrote:
* pathlib should be improved (specifically by making it inherit from str)
I'd like to see this specific change settled on in the PEP, actually. There are some arguments on both sides, and some hybrid solutions being proposed, and it looks to be an important enough issue to people for there to be an answer somewhere. It seems to come down to a sloppiness vs strictness concern, I think, but I'm not sure.
This does sound like it's the crucial issue, and it is worth writing up clearly the pros and cons. Let's draft those lists in a thread (this one's fine) and then add them to the PEP. We can then decide to:
- keep the status quo
- change PurePath to inherit from str
- decide it's never going to be settled and kill pathlib.py
(And yes, I'm dead serious about the latter, rather Solomonic option.)
-- --Guido van Rossum (python.org/~guido)

On Wed, Apr 6, 2016 at 9:45 AM, Guido van Rossum <guido@python.org> wrote:
On Tue, Apr 5, 2016 at 4:13 PM, Chris Angelico <rosuav@gmail.com> wrote:
On Wed, Apr 6, 2016 at 9:08 AM, Alexander Walters <tritium-list@sdamon.com> wrote:
* pathlib should be improved (specifically by making it inherit from str)
I'd like to see this specific change settled on in the PEP, actually. There are some arguments on both sides, and some hybrid solutions being proposed, and it looks to be an important enough issue to people for there to be an answer somewhere. It seems to come down to a sloppiness vs strictness concern, I think, but I'm not sure.
This does sound like it's the crucial issue, and it is worth writing up clearly the pros and cons. Let's draft those lists in a thread (this one's fine) and then add them to the PEP. We can then decide to:
- keep the status quo
- change PurePath to inherit from str
- decide it's never going to be settled and kill pathlib.py
(And yes, I'm dead serious about the latter, rather Solomonic option.)
Summarizing from memory to get things started.
Inheriting from str makes it easier for code to support pathlib without really caring about the details.
NOT inheriting from str forces code to be aware that it's working with a path, in the same way that text and bytes are fundamentally different things, and the Unicode string doesn't inherit from the byte string, nor vice versa.
If a few crucial built-in functions support Path objects (notably open() and a handful of os.* functions), the bulk of stdlib support will be easy (sometimes trivial) to implement.
Paths are [or are not] fundamentally different from strings. <-- argued point
Paths might be backed by Unicode text, and might be backed by bytes. Should a Path be able to be implicitly constructed from either?
Should there be some sort of "Path literal"? <-- possibly a completely separate question, to be resolved after this one
How should .. be handled? Can you canonicalize a Path?
Can Path handle URIs as well as file system paths?
-----
My personal view on the text/bytes debate is that a path is fundamentally a human concept, and consists therefore of text. The fact that some file systems store (at the low level) bytes and some store (I think) UTF-16 code units should be immaterial; path components exist for people. We can smuggle unrecognized bytes around, but ultimately, those bytes came from characters at some point - we just don't know the encoding. So a Path object has no relationship with bytes, only with str.
Whether a Path is fundamentally "a text string that uses slashes to separate components" or "a tuple of path components" is up for debate. Both make a lot of sense, and I'm somewhat inclined to the latter view; it allows for other forms of path component, such as an open directory (for statat/openat etc), or a special thing representing "current directory" or "root directory".
ChrisA
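The "tuple of path components" view mentioned above can already be observed in pathlib. The following is only a minimal sketch: the path is made up for illustration, and PurePosixPath is used so the result does not depend on the host platform.

```python
from pathlib import PurePosixPath

# Illustrative path only; PurePosixPath never touches the file system.
p = PurePosixPath("/home/user/project/setup.py")

# The same path seen as a tuple of components rather than one string.
components = p.parts

# Rebuilding from the components gives back an equal path object.
rebuilt = PurePosixPath(*components)
```

Here the root directory shows up as its own leading component, which is roughly the role a "special thing representing root directory" would play.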

On Wed, Apr 06, 2016 at 10:02:30AM +1000, Chris Angelico wrote:
My personal view on the text/bytes debate is that a path is fundamentally a human concept, and consists therefore of text. The fact that some file systems store (at the low level) bytes and some store (I think) UTF-16 code units should be immaterial; path components exist for people. We can smuggle unrecognized bytes around, but ultimately, those bytes came from characters at some point - we just don't know the encoding. So a Path object has no relationship with bytes, only with str.
That might be usually true in practice, but it is incorrect in principle. Paths in POSIX systems like Linux are fundamentally byte-strings with only two restrictions: \0 and \x2f are forbidden. The fact that paths in Linux mostly happen to look like English words (often heavily abbreviated) is a historical accident. The file system itself supported paths containing (say) \xff even back in the days when text was pure US-ASCII and bytes over \x7f had no textual meaning, and these days paths still support sequences of bytes that have no human meaning in any encoding. I don't know if this makes the tiniest lick of difference for Pathlib. I would be perfectly content if we stuck with the design decision that Pathlib can only represent paths representable as Unicode strings, and left weird POSIX filenames to the legacy byte-string interface. -- Steve

On Wed, Apr 6, 2016 at 12:51 PM, Steven D'Aprano <steve@pearwood.info> wrote:
On Wed, Apr 06, 2016 at 10:02:30AM +1000, Chris Angelico wrote:
My personal view on the text/bytes debate is that a path is fundamentally a human concept, and consists therefore of text. The fact that some file systems store (at the low level) bytes and some store (I think) UTF-16 code units should be immaterial; path components exist for people. We can smuggle unrecognized bytes around, but ultimately, those bytes came from characters at some point - we just don't know the encoding. So a Path object has no relationship with bytes, only with str.
That might be usually true in practice, but it is incorrect in principle. Paths in POSIX systems like Linux are fundamentally byte-strings with only two restrictions: \0 and \x2f are forbidden.
That's the file system level. But more fundamentally than that, a path exists so that humans can refer to files. That's why they have *names*, not just dirent numbers. We could assign dirent number -1 to mean "parent directory", and then represent everything with tuples of directory entries. Follow the chain and you get an inode. Absolute paths would start with an inode (the root directory being inode 2) and proceed with dirents thereafter. Maybe we'd need a pseudo-inode to mean "current directory". Should we do paths like this? No way! Much better to have either "/home/rosuav/cpython/python" or (P.ROOT, "home", "rosuav", "cpython", "python") to represent them, because they exist for the human. The POSIX file system rules aren't insignificant, but my point is that every byte value seen in a file name was once representing a character. Outside of deliberate tests, we don't create files on our disks whose names are strings of random bytes; the normal use of a file system is to store files that a human has named. Hence my recommendation that a Path object be tied to str, but *not* to bytes.
The fact that paths in Linux mostly happen to look like English words (often heavily abbreviated) is a historical accident. The file system itself supported paths containing (say) \xff even back in the days when text was pure US-ASCII and bytes over \x7f had no textual meaning, and these days paths still support sequences of bytes that have no human meaning in any encoding.
I don't know if this makes the tiniest lick of difference for Pathlib. I would be perfectly content if we stuck with the design decision that Pathlib can only represent paths representable as Unicode strings, and left weird POSIX filenames to the legacy byte-string interface.
I'd prefer to keep the surrogateescape compatibility hack with U+DC00 to U+DCFF being used to smuggle bytes around. That means that every path can be represented as a Unicode string, with only minor loss of functionality (imagine a path with only a single character that can't be decoded - chances are a human can figure out what the file is), but it still strongly pushes to a Unicode interpretation of the path. An *actual* byte-string interface (such as os.listdir and friends support) would be completely outside of anything involving Pathlib. If you give bytes, you'll get bytes. And I'd deprecate that once Path objects are more broadly accepted. ChrisA
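As a small illustration of the surrogateescape mechanism referred to here, the sketch below decodes a byte string that is not valid UTF-8 and encodes it back. The file name is hypothetical; only the built-in "surrogateescape" error handler is used.

```python
# Hypothetical file name containing a byte (0xFF) that is never valid in UTF-8.
raw_name = b"report_\xff.txt"

# With the surrogateescape handler, the undecodable byte is smuggled
# through as a lone surrogate code point (0xFF becomes U+DCFF).
text_name = raw_name.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_tripped = text_name.encode("utf-8", "surrogateescape")
```

The decoded name stays mostly readable as text, which is the "minor loss of functionality" trade-off described above.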

Chris Angelico writes:
Outside of deliberate tests, we don't create files on our disks whose names are strings of random bytes;
Wishful thinking. First, names made of control characters have often been deliberately used by miscreants to conceal their warez. Second, in some systems it's all too easy to create paths with components in different locales (the place I've seen it most frequently is in NFS mounts). I think that's much less true today, but perhaps that's only because my employer figured out that it was much less pain if system paths were pure ASCII so that it mostly didn't matter what encoding users chose for their subtrees. It remains important to be able to handle nearly arbitrary bytestrings in file names as far as I can see. Please note that 100 million Japanese and 1 billion Chinese by and large still prefer their homegrown encodings (plural!!) to Unicode, while many systems are now defaulting filenames to UTF-8. There's plenty of room remaining for copying bytestrings to arguments of open and friends.

On Wed, Apr 6, 2016 at 3:37 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
Chris Angelico writes:
Outside of deliberate tests, we don't create files on our disks whose names are strings of random bytes;
Wishful thinking. First, names made of control characters have often been deliberately used by miscreants to conceal their warez. Second, in some systems it's all too easy to create paths with components in different locales (the place I've seen it most frequently is in NFS mounts). I think that's much less true today, but perhaps that's only because my employer figured out that it was much less pain if system paths were pure ASCII so that it mostly didn't matter what encoding users chose for their subtrees.
Control characters are still characters, though. You can take a bytestring consisting of byte values less than 32, decode it as UTF-8, and have a series of codepoints to work with. If your employer has "solved" the problem by restricting system paths to ASCII, that's a fine solution for a single system with a single ASCII-compatible encoding; a better solution is to mandate UTF-8 as the file system encoding, as that's what most people are expecting anyway.
It remains important to be able to handle nearly arbitrary bytestrings in file names as far as I can see. Please note that 100 million Japanese and 1 billion Chinese by and large still prefer their homegrown encodings (plural!!) to Unicode, while many systems are now defaulting filenames to UTF-8. There's plenty of room remaining for copying bytestrings to arguments of open and friends.
Why exactly do they prefer these other encodings? Are they representing characters that Unicode doesn't contain? If so, we have a fundamental problem (no Python program is going to be able to cope with these, without a third party library or some stupid mess of local code); if not, you can always represent it as Unicode and encode it as UTF-8 when it reaches the file system. Re-encoding is something that's easy when you treat something as text, and impossible when you treat it as bytes. So far, you're still actually agreeing with me: paths are *text*, but sometimes we don't know the encoding (and that's a problem to be solved). ChrisA

On 4/5/2016 7:45 PM, Guido van Rossum wrote:
This does sound like it's the crucial issue, and it is worth writing up clearly the pros and cons. Let's draft those lists in a thread (this one's fine) and then add them to the PEP. We can then decide to:
- keep the status quo
- change PurePath to inherit from str
- decide it's never going to be settled and kill pathlib.py
(And yes, I'm dead serious about the latter, rather Solomonic option.)
My sense of the discussion was that some people think that the new-in-upcoming 3.5.2 PurePath.path should serve as a substitute for inheriting from str. In particular, it should make it easy for stringpath functions to also accept path objects. -- Terry Jan Reedy

On 6 April 2016 at 09:45, Guido van Rossum <guido@python.org> wrote:
On Tue, Apr 5, 2016 at 4:13 PM, Chris Angelico <rosuav@gmail.com> wrote:
On Wed, Apr 6, 2016 at 9:08 AM, Alexander Walters <tritium-list@sdamon.com> wrote:
* pathlib should be improved (specifically by making it inherit from str)
I'd like to see this specific change settled on in the PEP, actually. There are some arguments on both sides, and some hybrid solutions being proposed, and it looks to be an important enough issue to people for there to be an answer somewhere. It seems to come down to a sloppiness vs strictness concern, I think, but I'm not sure.
This does sound like it's the crucial issue, and it is worth writing up clearly the pros and cons. Let's draft those lists in a thread (this one's fine) and then add them to the PEP. We can then decide to:
- keep the status quo
- change PurePath to inherit from str
- decide it's never going to be settled and kill pathlib.py
Option 4: define a rich-object-to-text path serialisation convention, as paths are not conceptually the same as arbitrary strings, and we can define a new protocol accepted by builtins and standard library modules, while third parties can't.
The most promising option for that is probably "getattr(path, 'path', path)", since the "path" attribute is being added to pathlib, and the given idiom can be readily adopted in Python 2/3 compatible code (since normal strings and any other object without a "path" attribute are passed through unchanged).
Alternatively, since it's a protocol, double-underscores on the property name may be appropriate (i.e. "getattr(path, '__path__', path)").
The next challenge would then be to make a list of APIs to be updated for 3.6 to implicitly accept "rich path" objects via the agreed convention, with pathlib.PurePath used as a test class:
* open()
* codecs.open() (et al)
* io.*
* os.path.*
* other os functions
* shutil.*
* tempfile.*
* shelve.*
* csv.*
The list wouldn't necessarily need to be 100% comprehensive (similar to the rollout of context management, "support rich path objects in API <X>" may appear as future RFEs), but it should be comprehensive enough for rich path objects to mostly "just work" with other APIs that aren't specifically limiting their inputs to str objects (although using lower level APIs may force a conversion to the lower level plain text representation as a side-effect).
Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 4/5/2016 22:44, Nick Coghlan wrote:
Option 4: define a rich-object-to-text path serialisation convention, as paths are not conceptually the same as arbitrary strings
Just as a nit to pick, it is perfectly acceptable for hypothetical path objects to raise when someone tries to shoehorn them into acting like arbitrary strings - open() will gladly halt and set fire if you try and pass the text of War and Peace as an argument. I think the naysayers would be satisfied with an object that... while not str or bytes or a derived class of either... acted like str when it had to. Is that possible without deriving from str or bytes?

On 6 April 2016 at 13:06, Alexander Walters <tritium-list@sdamon.com> wrote:
I think the naysayers would be satisfied with an object that... while not str or bytes or a derived class of either... acted like str when it had to. Is that possible without deriving from str or bytes?
Only if the consuming code explicitly casts with "str()", and that's *too* permissive for most use cases (since __str__ and the __repr__ fallback are completely inappropriate as a "convert to a text representation of a filesystem path" command). A "__text__" protocol for non-lossy conversions to str would arguably be feasible, but its scope goes way beyond what's needed for a "rich path object" conversion protocol. Implementing that model in the general case would require something more akin to https://www.python.org/dev/peps/pep-0357/, which added __index__ as a guaranteed-non-lossy conversion from other types to a builtin integer, allowing non-builtin integers to be accepted for things like slicing and sequence repetition, without inadvertently also accepting non-integral types like builtin floats. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
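For readers unfamiliar with the __index__ precedent, here is a minimal sketch of how it behaves. The PageNumber class is hypothetical and exists only for this illustration; the slicing and repetition behaviour is standard Python.

```python
class PageNumber:
    """Hypothetical integer-like type, for illustration only."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        # Lossless conversion to a builtin int, as defined by PEP 357.
        return self._value


items = ["a", "b", "c", "d"]

# Objects implementing __index__ are accepted for slicing and repetition.
first_two = items[:PageNumber(2)]
repeated = [0] * PageNumber(3)

# A builtin float is still rejected, even when its value is integral.
try:
    items[:2.0]
    float_accepted = True
except TypeError:
    float_accepted = False
```

A path conversion protocol would aim for the same property: opt-in types are accepted, everything else still fails with the usual error.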

On Tue, Apr 5, 2016 at 7:44 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Option 4: define a rich-object-to-text path serialisation convention,
Unfortunately that sounds like a classic "serious programming" solution (objects, abstractions, serialization, all big important words :-). -- --Guido van Rossum (python.org/~guido)

On 6 April 2016 at 15:03, Guido van Rossum <guido@python.org> wrote:
On Tue, Apr 5, 2016 at 7:44 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Option 4: define a rich-object-to-text path serialisation convention,
Unfortunately that sounds like a classic "serious programming" solution (objects, abstractions, serialization, all big important words :-).
Yeah, my choice of phrasing made the idea sound more complicated than it is. The actual change would be to add the following to some Python standard library APIs that accept a filesystem path as an argument:
    arg = getattr(arg, "path", arg)
and the C API based equivalent to some C modules. (With the main bike-sheddable part being whether to use the generic "path" or something more explicit like "__fspath__" for the property name, since pathlib can readily support either/both of them, and "__fspath__" would be in line with the "os.fsencode" and "os.fsdecode" abbreviations)
The key goal of this approach would be to make it so that most third party libraries would "just work" with path objects if they were already using os.path and other standard library APIs for path manipulation (rather than using string methods directly), while still avoiding the type confusion that comes from inheriting directly from str.
From a testing perspective, it would arguably make sense to tackle it as a separate "test_path_protocol" test case that checked pathlib compatibility with the APIs of interest, simply to avoid adding a pathlib dependency to all those module tests.
Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
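The one-line conversion described in this message can be sketched as follows. RichPath and split_any are hypothetical names invented for the illustration; the sketch assumes only that a rich path object exposes its text form through a "path" attribute, as proposed in the thread.

```python
import posixpath


class RichPath:
    """Hypothetical rich path object exposing its text form as .path."""

    def __init__(self, text):
        self.path = text


def split_any(arg):
    # The conversion proposed in the thread: objects with a "path"
    # attribute are unwrapped, everything else passes through unchanged.
    arg = getattr(arg, "path", arg)
    return posixpath.split(arg)


from_str = split_any("/tmp/data/file.txt")
from_rich = split_any(RichPath("/tmp/data/file.txt"))
```

Plain strings take exactly the same code path as before, so existing callers would not need to change.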

On 06.04.16 05:44, Nick Coghlan wrote:
The next challenge would then be to make a list of APIs to be updated for 3.6 to implicitly accept "rich path" objects via the agreed convention, with pathlib.PurePath used as a test class:
* open()
* codecs.open() (et al)
* io.*
* os.path.*
* other os functions
* shutil.*
* tempfile.*
* shelve.*
* csv.*
Not sure about os.path.*. The purpose of the os.path module is manipulating string paths. From the perspective of pathlib it can look lower level. Supporting pathlib.Path will complicate and slow down os.path functions (they are already more complex and slower than they were in Python 2). Since os.path functions are often called several times in a loop, their performance is important. On the other hand, some Path methods are more efficient than os.path functions, and Path-specialized code at a higher level can be preferable.

On 04/05/2016 10:50 PM, Serhiy Storchaka wrote:
On 06.04.16 05:44, Nick Coghlan wrote:
The next challenge would then be to make a list of APIs to be updated for 3.6 to implicitly accept "rich path" objects via the agreed convention, with pathlib.PurePath used as a test class:
* open()
* codecs.open() (et al)
* io.*
* os.path.*
* other os functions
* shutil.*
* tempfile.*
* shelve.*
* csv.*
Not sure about os.path.*. The purpose of os.path module is manipulating string paths. From the perspective of pathlib it can look lower level.
The point is that a function that receives a "path" object (whether str or Path) shouldn't have to care: it should be able to call os.path.split on the thing it received and get back a usable answer. -- ~Ethan~

On 6 April 2016 at 16:25, Ethan Furman <ethan@stoneleaf.us> wrote:
On 04/05/2016 10:50 PM, Serhiy Storchaka wrote:
On 06.04.16 05:44, Nick Coghlan wrote:
The next challenge would then be to make a list of APIs to be updated for 3.6 to implicitly accept "rich path" objects via the agreed convention, with pathlib.PurePath used as a test class:
* open()
* codecs.open() (et al)
* io.*
* os.path.*
* other os functions
* shutil.*
* tempfile.*
* shelve.*
* csv.*
Not sure about os.path.*. The purpose of os.path module is manipulating string paths. From the perspective of pathlib it can look lower level.
The point is that a function that receives a "path" object (whether str or Path) shouldn't have to care: it should be able to call os.path.split on the thing it received and get back a usable answer.
I actually think it makes sense to pursue this question in a test driven manner: create "test_pathlib_support" as a new test case, start passing pathlib.PurePath instances to a relatively high level API like shutil, and see what low level interfaces need to be updated to accept filesystem path objects (in addition to strings) in order to make that work. If shutil can be updated to support pathlib with changes solely at the io and os module layer, then that bodes well for transparently enabling support in 3rd party APIs as well. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
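A rough sketch of what such a test case might look like is given below. The class and method names are assumptions made for illustration, and only one shutil function is exercised; whether the test passes depends on whether the interpreter's standard library already accepts path objects at the io and os layers.

```python
import shutil
import tempfile
import unittest
from pathlib import Path, PurePath


class TestPathlibSupport(unittest.TestCase):
    """Hypothetical test case passing PurePath objects to a high-level API."""

    def test_copyfile_accepts_purepath(self):
        with tempfile.TemporaryDirectory() as tmp:
            src = Path(tmp) / "src.txt"
            src.write_text("hello")
            dst = PurePath(tmp) / "dst.txt"
            # Both arguments are pure paths, not strings.
            shutil.copyfile(PurePath(src), dst)
            self.assertEqual(Path(dst).read_text(), "hello")


def run_suite():
    # Helper for running the sketch directly; returns the TestResult.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestPathlibSupport)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

A failure here would point at the lower-level function that rejected the path object, which is the information the test-driven approach is after.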

Ethan Furman <ethan <at> stoneleaf.us> writes:
Not sure about os.path.*. The purpose of os.path module is manipulating string paths. From the perspective of pathlib it can look lower level.
The point is that a function that receives a "path" object (whether str or Path) shouldn't have to care: it should be able to call os.path.split on the thing it received and get back a usable answer.
pathlib should already replicate the useful parts of os.path. That was the design goal after all. So this is like saying you want a Python file or socket object to be accepted by os.read(). In the rare case where you want that, you call the .fileno() method explicitly. The equivalent for Path objects is to look up the .path attribute explicitly. Regards Antoine.
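The file-object analogy can be made concrete with a short sketch. The helper name read_via_descriptor is hypothetical; the point is only that os.read() wants a descriptor, so the caller converts explicitly with .fileno().

```python
import os
import tempfile


def read_via_descriptor(file_obj, size):
    # os.read() works on integer file descriptors, so the high-level
    # object is converted explicitly, as described in the message.
    return os.read(file_obj.fileno(), size)


with tempfile.TemporaryFile() as f:
    f.write(b"hello world")
    f.flush()
    f.seek(0)
    data = read_via_descriptor(f, 5)

    # Passing the file object itself is rejected rather than converted.
    try:
        os.read(f, 5)
        rejected = False
    except TypeError:
        rejected = True
```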

On 04/06/2016 02:50 AM, Antoine Pitrou wrote:
Ethan Furman <ethan <at> stoneleaf.us> writes:
Not sure about os.path.*. The purpose of os.path module is manipulating string paths. From the perspective of pathlib it can look lower level.
The point is that a function that receives a "path" object (whether str or Path) shouldn't have to care: it should be able to call os.path.split on the thing it received and get back a usable answer.
pathlib should already replicate the useful parts of os.path. That was the design goal after all.
Yes it does, and very well.
So this is like saying you want a Python file or socket object to be accepted by os.read(). In the rare case where you want that, you call the .fileno() method explicitly. The equivalent for Path objects is to lookup the .path attribute explicitly.
Unfortunately for Path objects there is already a well-established ecosystem for dealing with paths as strings, and it currently breaks when passed a Path object. This is a high barrier to entry. Having the stdlib support Path objects would lower that barrier significantly. -- ~Ethan~

On 6 April 2016 at 15:59, Serhiy Storchaka <storchaka@gmail.com> wrote:
On 06.04.16 08:52, Greg Ewing wrote:
Nick Coghlan wrote:
The most promising option for that is probably "getattr(path, 'path', path)",
Is there something seriously wrong with str(path)?
What if path is None or bytes?
Or an int, float, list, dict, or arbitrary other object. To be more explicit, the problem isn't what happens when the API doing "str(path)" internally is used correctly, it's what happens when it's used incorrectly: you end up proceeding with a nonsense string as your path name, rather than failing early with TypeError or AttributeError. Doing "getattr(path, 'path', path)" instead means that in the error case (i.e. no "path" attribute), any existing argument checking is still triggered normally. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
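The difference between the two conversions in the error case can be sketched like this. Both helper functions are hypothetical; posixpath is used so the joined result is the same on every platform.

```python
import posixpath


def join_with_str(base, name):
    # Converts anything at all, including objects that are not paths.
    return posixpath.join(str(base), name)


def join_with_getattr(base, name):
    # Only unwraps objects with a "path" attribute; others pass through
    # and hit the normal argument checking in posixpath.join().
    base = getattr(base, "path", base)
    return posixpath.join(base, name)


# Incorrect use: None silently becomes the directory name "None".
bad = join_with_str(None, "data.txt")

# The same mistake fails early with the getattr-based conversion.
try:
    join_with_getattr(None, "data.txt")
    failed_early = False
except TypeError:
    failed_early = True
```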

On 06.04.16 05:44, Nick Coghlan wrote:
The most promising option for that is probably "getattr(path, 'path', path)", since the "path" attribute is being added to pathlib, and the given idiom can be readily adopted in Python 2/3 compatible code (since normal strings and any other object without a "path" attribute are passed through unchanged). Alternatively, since it's a protocol, double-underscores on the property name may be appropriate (i.e. "getattr(path, '__path__', path)")
This was already discussed. Current conclusion is using the "path" attribute. See http://bugs.python.org/issue22570 .

On 6 April 2016 at 15:57, Serhiy Storchaka <storchaka@gmail.com> wrote:
On 06.04.16 05:44, Nick Coghlan wrote:
The most promising option for that is probably "getattr(path, 'path', path)", since the "path" attribute is being added to pathlib, and the given idiom can be readily adopted in Python 2/3 compatible code (since normal strings and any other object without a "path" attribute are passed through unchanged). Alternatively, since it's a protocol, double-underscores on the property name may be appropriate (i.e. "getattr(path, '__path__', path)")
This was already discussed. Current conclusion is using the "path" attribute. See http://bugs.python.org/issue22570 .
I'd missed the existing precedent in DirEntry.path, so simply taking that and running with it sounds good to me. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Tue, Apr 5, 2016 at 11:29 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 6 April 2016 at 15:57, Serhiy Storchaka <storchaka@gmail.com> wrote:
On 06.04.16 05:44, Nick Coghlan wrote:
The most promising option for that is probably "getattr(path, 'path', path)", since the "path" attribute is being added to pathlib, and the given idiom can be readily adopted in Python 2/3 compatible code (since normal strings and any other object without a "path" attribute are passed through unchanged). Alternatively, since it's a protocol, double-underscores on the property name may be appropriate (i.e. "getattr(path, '__path__', path)")
This was already discussed. Current conclusion is using the "path" attribute. See http://bugs.python.org/issue22570 .
I'd missed the existing precedent in DirEntry.path, so simply taking that and running with it sounds good to me.
This makes me twitch slightly, because NumPy has had a whole set of problems due to the ancient and minimally-considered decision to assume a bunch of ad hoc non-namespaced method names fulfilled some protocol -- like all .sum methods will have a signature that's compatible with numpy's, and if an object has a .log method then surely that computes the logarithm (what else in computing could "log" possibly refer to?), etc. This experience may or may not be relevant, I'm not sure -- sometimes these kinds of twitches are good guides to intuition, and sometimes they are just knee-jerk responses to an old and irrelevant problem :-). But you might want to at least think about how common it might be to have existing objects with unrelated attributes that happen to be called "path", and the bizarro problems that might be caused if someone accidentally passes one of them to a function that expects all .path attributes to be instances of this new protocol. -n -- Nathaniel J. Smith -- https://vorpus.org

On 6 April 2016 at 16:53, Nathaniel Smith <njs@pobox.com> wrote:
On Tue, Apr 5, 2016 at 11:29 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I'd missed the existing precedent in DirEntry.path, so simply taking that and running with it sounds good to me.
This makes me twitch slightly, because NumPy has had a whole set of problems due to the ancient and minimally-considered decision to assume a bunch of ad hoc non-namespaced method names fulfilled some protocol -- like all .sum methods will have a signature that's compatible with numpy's, and if an object has a .log method then surely that computes the logarithm (what else in computing could "log" possibly refer to?), etc. This experience may or may not be relevant, I'm not sure -- sometimes these kinds of twitches are good guides to intuition, and sometimes they are just knee-jerk responses to an old and irrelevant problem :-)
But you might want to at least think about how common it might be to have existing objects with unrelated attributes that happen to be called "path", and the bizarro problems that might be caused if someone accidentally passes one of them to a function that expects all .path attributes to be instances of this new protocol.
sys.path, for example. That's why I'd actually prefer the implicit conversion protocol to be the more explicitly named "__fspath__", with suitable "__fspath__ = path" assignments added to DirEntry and pathlib. However, I'm also not offering to actually *do* the work here, and the casting vote goes to the folks pursuing the implementation effort. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
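To make the name-collision concern concrete, here is a minimal sketch of how a lookup on a plain "path" attribute can pick up an unrelated object, while a dunder name is much less likely to. The helper names `naive_path` and `namespaced_path` are hypothetical and exist only for this illustration; they are not proposed APIs.

```python
import sys


def naive_path(obj):
    # The idiom under discussion: use obj.path if present, else obj itself.
    return getattr(obj, "path", obj)


def namespaced_path(obj):
    # Same idiom, but with the dunder name suggested in the thread.
    return getattr(obj, "__fspath__", obj)


# The sys module has an unrelated "path" attribute (the import search list),
# so the naive idiom silently returns a list instead of a path string.
print(type(naive_path(sys)))

# The dunder lookup finds nothing on sys and passes the object through,
# leaving it to the caller to reject the unexpected type.
print(namespaced_path(sys) is sys)

# Plain strings are passed through unchanged by both helpers.
print(naive_path("/tmp/data.txt"), namespaced_path("/tmp/data.txt"))
```

Either way the caller still has to cope with a non-string result; the difference is only in how likely an accidental match is.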

Nick Coghlan <ncoghlan <at> gmail.com> writes:
sys.path, for example.
That's why I'd actually prefer the implicit conversion protocol to be the more explicitly named "__fspath__", with suitable "__fspath__ = path" assignments added to DirEntry and pathlib.
That was my preference as well.
However, I'm also not offering to actually *do* the work here, and the casting vote goes to the folks pursuing the implementation effort.
Indeed. Regards Antoine.

On 04/05/2016 11:57 PM, Nick Coghlan wrote:
On 6 April 2016 at 16:53, Nathaniel Smith <njs@pobox.com> wrote:
On Tue, Apr 5, 2016 at 11:29 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I'd missed the existing precedent in DirEntry.path, so simply taking that and running with it sounds good to me.
This makes me twitch slightly, because NumPy has had a whole set of problems due to the ancient and minimally-considered decision to assume a bunch of ad hoc non-namespaced method names fulfilled some protocol -- like all .sum methods will have a signature that's compatible with numpy's, and if an object has a .log method then surely that computes the logarithm (what else in computing could "log" possibly refer to?), etc. This experience may or may not be relevant, I'm not sure -- sometimes these kinds of twitches are good guides to intuition, and sometimes they are just knee-jerk responses to an old and irrelevant problem :-)
But you might want to at least think about how common it might be to have existing objects with unrelated attributes that happen to be called "path", and the bizarro problems that might be caused if someone accidentally passes one of them to a function that expects all .path attributes to be instances of this new protocol.
sys.path, for example.
That's why I'd actually prefer the implicit conversion protocol to be the more explicitly named "__fspath__", with suitable "__fspath__ = path" assignments added to DirEntry and pathlib. However, I'm also not offering to actually *do* the work here, and the casting vote goes to the folks pursuing the implementation effort.
If we decide upon __fspath__ (or __path__) I will do the work on pathlib and scandir to add those attributes. -- ~Ethan~

With Ethan volunteering to do the work to help make a path protocol a thing -- and I'm willing to help along with propagating this through the stdlib where I think Serhiy might be interested in helping as well -- and a seeming consensus this is a good idea, it seems like this proposal has a chance of actually coming to fruition.
Now we need clear details. :) Some open questions are:
1. Name: __path__, __fspath__, or something else?
2. Method or attribute? (changes what kind of one-liner you might use in libraries, but I think historically all protocols have been methods and the serialized string representation might be costly to build)
3. Built-in? (name is dependent on #1 if we add one)
4. Add the method/attribute to str? (I assume so, much like __index__() is on int, but I have not seen it explicitly stated so I would rather clarify it)
5. Expand the C API to have something like PyObject_Path()?
Some people have asked for the pathlib PEP to have a more fleshed out reasoning as to why pathlib doesn't inherit from str. If Antoine doesn't want to do it I can try to distill my blog post into a more succinct paragraph or two and update the PEP myself.
Is this going to require a PEP or if we can agree on the points here are we just going to do it? If we think it requires a PEP I'm willing to write it, but I obviously have no issue if we skip that step either. :)
Oh, and we should resolve this before the next release of Python 3.4, 3.5, or 3.6 so that pathlib can be updated in those releases.
-Brett
On Wed, 6 Apr 2016 at 08:09 Ethan Furman <ethan@stoneleaf.us> wrote:
On 04/05/2016 11:57 PM, Nick Coghlan wrote:
On 6 April 2016 at 16:53, Nathaniel Smith <njs@pobox.com> wrote:
On Tue, Apr 5, 2016 at 11:29 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I'd missed the existing precedent in DirEntry.path, so simply taking that and running with it sounds good to me.
This makes me twitch slightly, because NumPy has had a whole set of problems due to the ancient and minimally-considered decision to assume a bunch of ad hoc non-namespaced method names fulfilled some protocol -- like all .sum methods will have a signature that's compatible with numpy's, and if an object has a .log method then surely that computes the logarithm (what else in computing could "log" possibly refer to?), etc. This experience may or may not be relevant, I'm not sure -- sometimes these kinds of twitches are good guides to intuition, and sometimes they are just knee-jerk responses to an old and irrelevant problem :-)
But you might want to at least think about how common it might be to have existing objects with unrelated attributes that happen to be called "path", and the bizarro problems that might be caused if someone accidentally passes one of them to a function that expects all .path attributes to be instances of this new protocol.
sys.path, for example.
That's why I'd actually prefer the implicit conversion protocol to be the more explicitly named "__fspath__", with suitable "__fspath__ = path" assignments added to DirEntry and pathlib. However, I'm also not offering to actually *do* the work here, and the casting vote goes to the folks pursuing the implementation effort.
If we decide upon __fspath__ (or __path__) I will do the work on pathlib and scandir to add those attributes.

Wouldn't it be better to generalize that to a "__location__" protocol, which would allow returning any kind of location, including path, url or coordinate, ip_address, etc? On 06/04/2016 19:26, Brett Cannon wrote:
With Ethan volunteering to do the work to help make a path protocol a thing -- and I'm willing to help along with propagating this through the stdlib where I think Serhiy might be interested in helping as well -- and a seeming consensus this is a good idea, it seems like this proposal has a chance of actually coming to fruition.
Now we need clear details. :) Some open questions are:
1. Name: __path__, __fspath__, or something else?
2. Method or attribute? (changes what kind of one-liner you might use in libraries, but I think historically all protocols have been methods and the serialized string representation might be costly to build)
3. Built-in? (name is dependent on #1 if we add one)
4. Add the method/attribute to str? (I assume so, much like __index__() is on int, but I have not seen it explicitly stated so I would rather clarify it)
5. Expand the C API to have something like PyObject_Path()?
Some people have asked for the pathlib PEP to have a more fleshed out reasoning as to why pathlib doesn't inherit from str. If Antoine doesn't want to do it I can try to distill my blog post into a more succinct paragraph or two and update the PEP myself.
Is this going to require a PEP or if we can agree on the points here are we just going to do it? If we think it requires a PEP I'm willing to write it, but I obviously have no issue if we skip that step either. :)
Oh, and we should resolve this before the next release of Python 3.4, 3.5, or 3.6 so that pathlib can be updated in those releases.
-Brett
_______________________________________________ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/desmoulinmichel%40gmail.c...

On Wed, 6 Apr 2016 at 10:36 Michel Desmoulin <desmoulinmichel@gmail.com> wrote:
Wouldn't it be better to generalize that to a "__location__" protocol, which would allow returning any kind of location, including path, url or coordinate, ip_address, etc?
No, because all of those things have different semantic meanings. See the __index__ PEP for reasons why you would tightly bind protocols instead of overloading ones like __int__ for multiple meanings. -Brett
With Ethan volunteering to do the work to help make a path protocol a thing -- and I'm willing to help along with propagating this through the stdlib where I think Serhiy might be interested in helping as well -- and a seeming consensus this is a good idea, it seems like this proposal has a chance of actually coming to fruition.
Now we need clear details. :) Some open questions are:
1. Name: __path__, __fspath__, or something else?
2. Method or attribute? (changes what kind of one-liner you might use in libraries, but I think historically all protocols have been methods and the serialized string representation might be costly to build)
3. Built-in? (name is dependent on #1 if we add one)
4. Add the method/attribute to str? (I assume so, much like __index__() is on int, but I have not seen it explicitly stated so I would rather clarify it)
5. Expand the C API to have something like PyObject_Path()?
Some people have asked for the pathlib PEP to have a more fleshed out reasoning as to why pathlib doesn't inherit from str. If Antoine doesn't want to do it I can try to distill my blog post into a more succinct paragraph or two and update the PEP myself.
Is this going to require a PEP or if we can agree on the points here are we just going to do it? If we think it requires a PEP I'm willing to write it, but I obviously have no issue if we skip that step either. :)
Oh, and we should resolve this before the next release of Python 3.4, 3.5, or 3.6 so that pathlib can be updated in those releases.
-Brett

On Thu, Apr 7, 2016 at 2:41 AM, Brett Cannon <brett@python.org> wrote:
On Wed, 6 Apr 2016 at 10:36 Michel Desmoulin <desmoulinmichel@gmail.com> wrote:
Wouldn't it be better to generalize that to a "__location__" protocol, which would allow returning any kind of location, including path, url or coordinate, ip_address, etc?
No, because all of those things have different semantic meanings. See the __index__ PEP for reasons why you would tightly bind protocols instead of overloading ones like __int__ for multiple meanings.
-Brett
https://www.python.org/dev/peps/pep-0357/
It is not possible to use the nb_int (and __int__ special method) for this purpose because that method is used to *coerce* objects to integers.
I feel adding a protocol only for paths is a bit of over-engineering. So I'm -0.5 on adding __fspath__. I'm +1 on adding a general protocol for *coerce to string*, like __index__. +0.5 on inheriting from str (and dropping byte path support). -- INADA Naoki <songofacandy@gmail.com>

FYI, Ruby's Pathname class doesn't inherit from String. http://ruby-doc.org/stdlib-2.1.0/libdoc/pathname/rdoc/Pathname.html Ruby has two "convert to string" methods. `.to_s` is like `__str__`. `.to_str` is like `__index__` but for str; it is used for implicit conversion. File.open accepts any object that implements `.to_str`.

On Thu, Apr 7, 2016 at 12:00 AM, INADA Naoki <songofacandy@gmail.com> wrote:
I feel adding a protocol only for paths is a bit of over-engineering. So I'm -0.5 on adding __fspath__.
I'm +1 on adding a general protocol for *coerce to string*, like __index__.
Isn't __str__ the protocol for "coerce to string"? __index__ is a protocol for "coerce to an integer that can be used as an index", just as __fspath__ would be "coerce to a string that can be used as a path". The whole point is that __str__ will "work" with virtually anything -- whether it can reasonably be used as a path or not. I'm not sure that's a problem, but if it is, then that's what this new protocol is trying to solve, just like __index__ enforces that only things that are intended to be used as indexes will work. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Apr 7, 2016 10:00 AM, "Chris Barker" <chris.barker@noaa.gov> wrote:
On Thu, Apr 7, 2016 at 12:00 AM, INADA Naoki <songofacandy@gmail.com> wrote:
I feel adding a protocol only for paths is a bit of over-engineering. So I'm -0.5 on adding __fspath__.
I'm +1 on adding a general protocol for *coerce to string*, like __index__.
Isn't __str__ the protocol for "coerce to string"?
__index__ is a protocol for "coerce to an integer that can be used as an index", just as __fspath__ would be "coerce to a string that can be used as a path"
No, __index__ is the protocol for "do a safe coerce to integer". The name is misleading, but its use in non-indexing contexts is well established. E.g. " ab" * obj will return a string with obj.__index__() repetitions. -n
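A minimal sketch of the behaviour described above, assuming a made-up class `Three` that defines only `__index__`: sequence repetition, indexing, and `range()` all accept it as an integer, even though only one of those is "indexing" in the literal sense.

```python
class Three:
    """Hypothetical integer-like object that only defines __index__."""

    def __index__(self):
        return 3


n = Three()

# Sequence repetition calls n.__index__() to get the repeat count.
repeated = "ab" * n

# Indexing a list uses the same protocol.
item = [10, 20, 30, 40][n]

# So does range(), another non-indexing context.
numbers = list(range(n))

print(repeated, item, numbers)
```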

On Thu, Apr 7, 2016 at 11:44 AM, Nathaniel Smith <njs@pobox.com> wrote:
No, __index__ is the protocol for "do a safe coerce to integer". The name is misleading, but its use in non-indexing contexts is well established. E.g.
" ab" * obj
will return a string with obj.__index__() repetitions.
A good argument for Chris A's proposal over on python-ideas to have a dunder method for "coerce to a lossless string", that could be used for Path, but also for who knows what else? As I see it, it's exactly the same as the __fspath__ idea, except that we'd use a name that made it clear you might want to use it for other things (and str would grow that method...) -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On 04/06/2016 10:26 AM, Brett Cannon wrote:
With Ethan volunteering to do the work to help make a path protocol a thing -- and I'm willing to help along with propagating this through the stdlib where I think Serhiy might be interested in helping as well -- and a seeming consensus this is a good idea, it seems like this proposal has a chance of actually coming to fruition.
Excellent! Let's proceed along this path ;) until somebody objects.
Now we need clear details. :) Some open questions are:
1. Name: __path__, __fspath__, or something else?
__fspath__
2. Method or attribute? (changes what kind of one-liner you might use in libraries, but I think historically all protocols have been methods and the serialized string representation might be costly to build)
I would prefer an attribute, but yeah I think dunders are typically methods, and I don't see this being special enough to not follow that trend.
3. Built-in? (name is dependent on #1 if we add one)
fspath() -- and it would be handy to have a function that returns either the __fspath__ results, or the string (if it was one), or raises an exception if neither of the above works out.
4. Add the method/attribute to str? (I assume so, much like __index__() is on int, but I have not seen it explicitly stated so I would rather clarify it)
I don't think that's needed. With Path() and fspath() it's trivial to make sure one has what one wants.
5. Expand the C API to have something like PyObject_Path()?
No opinion.
Some people have asked for the pathlib PEP to have a more fleshed out reasoning as to why pathlib doesn't inherit from str. If Antoine doesn't want to do it I can try to distill my blog post into a more succinct paragraph or two and update the PEP myself.
Nice.
Is this going to require a PEP or if we can agree on the points here are we just going to do it? If we think it requires a PEP I'm willing to write it, but I obviously have no issue if we skip that step either. :)
If there are no (serious?) objections I don't think a PEP is needed.
Oh, and we should resolve this before the next release of Python 3.4, 3.5, or 3.6 so that pathlib can be updated in those releases.
Agreed. -- ~Ethan~

On Wed, 6 Apr 2016 at 11:06 Ethan Furman <ethan@stoneleaf.us> wrote:
On 04/06/2016 10:26 AM, Brett Cannon wrote:
With Ethan volunteering to do the work to help make a path protocol a thing -- and I'm willing to help along with propagating this through the stdlib where I think Serhiy might be interested in helping as well -- and a seeming consensus this is a good idea, it seems like this proposal has a chance of actually coming to fruition.
Excellent! Let's proceed along this path ;) until somebody objects.
Now we need clear details. :) Some open questions are:
1. Name: __path__, __fspath__, or something else?
__fspath__
+1 for __path__, +0 for __fspath__ (I don't know how widespread the notion that "fs" means "file system" is).
2. Method or attribute? (changes what kind of one-liner you might use in libraries, but I think historically all protocols have been methods and the serialized string representation might be costly to build)
I would prefer an attribute, but yeah I think dunders are typically methods, and I don't see this being special enough to not follow that trend.
Depends on what we want to tell 3rd-party libraries to do to support pathlib if they are on 3.3 or if they are worried about people using Python 3.4.2 or 3.5.1. An attribute still works with `getattr(path, '__path__', path)`. But with a method you probably want either `path.__path__() if hasattr(path, '__path__') else path` or `getattr(path, '__path__', lambda: path)()`.
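The three one-liners above can be compared side by side in a small sketch. The classes `AttrStyle` and `MethodStyle` and the wrapper function names are hypothetical, and `__path__` is just one of the candidate names under discussion, not a settled API.

```python
class AttrStyle:
    """Hypothetical path object exposing the protocol as an attribute."""
    __path__ = "/tmp/attr.txt"


class MethodStyle:
    """Hypothetical path object exposing the protocol as a method."""

    def __path__(self):
        return "/tmp/method.txt"


def via_attribute(path):
    # Attribute protocol: one getattr() with the object itself as fallback.
    return getattr(path, '__path__', path)


def via_method(path):
    # Method protocol, spelled with hasattr().
    return path.__path__() if hasattr(path, '__path__') else path


def via_method_lambda(path):
    # Method protocol, spelled with a fallback callable returning the object.
    return getattr(path, '__path__', lambda: path)()


print(via_attribute(AttrStyle()), via_attribute("plain.txt"))
print(via_method(MethodStyle()), via_method("plain.txt"))
print(via_method_lambda(MethodStyle()), via_method_lambda("plain.txt"))
```

In all three spellings an object without the special name is passed through unchanged, which is what lets the same line work on Python versions that predate the protocol.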
3. Built-in? (name is dependent on #1 if we add one)
fspath() -- and it would be handy to have a function that returns either the __fspath__ results, or the string (if it was one), or raises an exception if neither of the above works out.
So:
    # Attribute
    def fspath(path):
        if hasattr(path, '__path__'):
            return path.__path__
        if isinstance(path, str):
            return path
        raise NotImplementedError  # Or TypeError?
    # Method
    def fspath(path):
        try:
            return path.__path__()
        except AttributeError:
            if isinstance(path, str):
                return path
            raise TypeError  # Or NotImplementedError?
Or you can drop the isinstance() check and simply check for the attribute/method and use it and otherwise return `path` and let the code's duck-typing of str handle catching an unexpected type for a path. At which point the built-in becomes whatever idiom we promote for pathlib usage that pre-dates this protocol.
4. Add the method/attribute to str? (I assume so, much like __index__() is on int, but I have not seen it explicitly stated so I would rather clarify it)
I don't think that's needed. With Path() and fspath() it's trivial to make sure one has what one wants.
If we add str.__fspath__ then the function becomes:
    def fspath(path):
        return path.__fspath__()
Which might be too simplistic for a built-in, but that also means adding it on str would potentially negate the need for a built-in.
5. Expand the C API to have something like PyObject_Path()?
No opinion.
If we add a built-in then I say we add an equivalent function in the C API. -Brett
Some people have asked for the pathlib PEP to have a more fleshed out reasoning as to why pathlib doesn't inherit from str. If Antoine doesn't want to do it I can try to distill my blog post into a more succinct paragraph or two and update the PEP myself.
Nice.
Is this going to require a PEP or if we can agree on the points here are we just going to do it? If we think it requires a PEP I'm willing to write it, but I obviously have no issue if we skip that step either. :)
If there are no (serious?) objections I don't think a PEP is needed.
Oh, and we should resolve this before the next release of Python 3.4, 3.5, or 3.6 so that pathlib can be updated in those releases.
Agreed.
-- ~Ethan~

On 04/06/2016 11:32 AM, Brett Cannon wrote:
On Wed, 6 Apr 2016 at 11:06 Ethan Furman wrote:
On 04/06/2016 10:26 AM, Brett Cannon wrote:
Now we need clear details. :) Some open questions are:
1. Name: __path__, __fspath__, or something else?
__fspath__
+1 for __path__, +0 for __fspath__ (I don't know how widespread the notion that "fs" means "file system" is).
Maybe __os_path__ then? I would rather be explicit about the type of path we are dealing with -- who knows if we won't have __url_path__ in the future (besides Guido, of course ;)
    def fspath(path):
        try:
            return path.__path__()
        except AttributeError:
            if isinstance(path, str):
                return path
            raise TypeError  # Or NotImplementedError?
Or you can drop the isinstance() check and [...]
If the purpose of fspath() is to return a usable path-as-string then we should raise if unable to do it.
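The difference between the two positions can be sketched as follows. This is illustrative only: the function names `fspath_strict` and `fspath_permissive`, the class `FakePath`, and the use of `__fspath__` as a method are assumptions made for the example, since the name and form of the protocol were still open at this point in the thread.

```python
class FakePath:
    """Hypothetical path-like object implementing the proposed method."""

    def __init__(self, text):
        self._text = text

    def __fspath__(self):
        return self._text


def fspath_strict(path):
    # Raise unless we can hand back a usable path-as-string.
    try:
        return path.__fspath__()
    except AttributeError:
        if isinstance(path, str):
            return path
        raise TypeError(
            "expected str or an object with __fspath__(), got %s"
            % type(path).__name__
        )


def fspath_permissive(path):
    # Drop the isinstance() check: anything without the method is passed
    # through, and later code is trusted to reject unexpected types.
    method = getattr(path, "__fspath__", None)
    return method() if method is not None else path


print(fspath_strict(FakePath("/tmp/a.txt")), fspath_strict("/tmp/b.txt"))
print(fspath_permissive(42))  # passes the int through unchanged

try:
    fspath_strict(42)
except TypeError as exc:
    print("rejected:", exc)
```

With the strict version the error surfaces at the conversion call; with the permissive one it surfaces later, wherever the value is first used as a string.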
If we add str.__fspath__ then the function becomes:
    def fspath(path):
        return path.__fspath__()
Which might be too simplistic for a built-in, but that also means adding it on str would potentially negate the need for a built-in.
That is an attractive option. -- ~Ethan~

On Thu, Apr 7, 2016 at 4:54 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
Maybe __os_path__ then? I would rather be explicit about the type of path we are dealing with -- who knows if we won't have __url_path__ in the future (besides Guido, of course ;)
Bikeshedding furiously... I don't like os_path here as it's too similar to os.path; unless that's deliberate? ChrisA

On 04/06/2016 12:18 PM, Chris Angelico wrote:
On Thu, Apr 7, 2016 at 4:54 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
Maybe __os_path__ then? I would rather be explicit about the type of path we are dealing with -- who knows if we won't have __url_path__ in the future (besides Guido, of course ;)
Bikeshedding furiously... I don't like os_path here as it's too similar to os.path; unless that's deliberate?
Well, it is an Operating System Path. ;) -- ~Ethan~

On Wed, Apr 06, 2016 at 11:54:08AM -0700, Ethan Furman <ethan@stoneleaf.us> wrote:
On 04/06/2016 11:32 AM, Brett Cannon wrote:
On Wed, 6 Apr 2016 at 11:06 Ethan Furman wrote:
On 04/06/2016 10:26 AM, Brett Cannon wrote:
Now we need clear details. :) Some open questions are:
1. Name: __path__, __fspath__, or something else?
__fspath__
+1 for __path__, +0 for __fspath__ (I don't know how widespread the notion that "fs" means "file system" is).
Maybe __os_path__ then? I would rather be explicit about the type of path we are dealing with -- who knows if we won't have __url_path__ in the future (besides Guido, of course ;)
__pathstr__? __urlstr__? Oleg. -- Oleg Broytman http://phdru.name/ phd@phdru.name Programmers don't die, they just GOSUB without RETURN.

On Wed, 6 Apr 2016 at 12:38 Oleg Broytman <phd@phdru.name> wrote:
On Wed, Apr 06, 2016 at 11:54:08AM -0700, Ethan Furman <ethan@stoneleaf.us> wrote:
On 04/06/2016 11:32 AM, Brett Cannon wrote:
On Wed, 6 Apr 2016 at 11:06 Ethan Furman wrote:
On 04/06/2016 10:26 AM, Brett Cannon wrote:
Now we need clear details. :) Some open questions are:
1. Name: __path__, __fspath__, or something else?
__fspath__
+1 for __path__, +0 for __fspath__ (I don't know how widespread the notion that "fs" means "file system" is).
Maybe __os_path__ then? I would rather be explicit about the type of path we are dealing with -- who knows if we won't have __url_path__ in the future (besides Guido, of course ;)
__pathstr__? __urlstr__?
But we didn't call it __indexint__ either. No need to embed the type in the name. -Brett
Oleg. -- Oleg Broytman http://phdru.name/ phd@phdru.name Programmers don't die, they just GOSUB without RETURN.

On Wed, Apr 6, 2016 at 2:32 PM, Brett Cannon <brett@python.org> wrote:
+1 for __path__, +0 for __fspath__ (I don't know how widespread the notion that "fs" means "file system" is).
Same here. In the good old days, "fs" stood for a "Font Server." And in even older (and better?) days, FS was a "Field Separator."

On 06.04.2016 21:02, Alexander Belopolsky wrote:
On Wed, Apr 6, 2016 at 2:32 PM, Brett Cannon <brett@python.org> wrote:
+1 for __path__, +0 for __fspath__ (I don't know how widespread the notion that "fs" means "file system" is).
Same here. In the good old days, "fs" stood for a "Font Server." And in even older (and better?) days, FS was a "Field Separator."
The future is not the past. ;) What about __file_path__ ? Best, Sven

On Wed, 6 Apr 2016 at 13:20 Sven R. Kunze <srkunze@mail.de> wrote:
On 06.04.2016 21:02, Alexander Belopolsky wrote:
On Wed, Apr 6, 2016 at 2:32 PM, Brett Cannon <brett@python.org> wrote:
+1 for __path__, +0 for __fspath__ (I don't know how widespread the notion that "fs" means "file system" is).
Same here. In the good old days, "fs" stood for a "Font Server." And in even older (and better?) days, FS was a "Field Separator."
The future is not the past. ;)
What about
__file_path__
Can be a directory as well (and you could argue semantics of file system inodes, beginners won't know the subtlety and/or wonder where __dir_path__ is).

On 06.04.2016 22:28, Brett Cannon wrote:
On Wed, 6 Apr 2016 at 13:20 Sven R. Kunze <srkunze@mail.de> wrote:
What about
__file_path__
Can be a directory as well (and you could argue semantics of file system inodes, beginners won't know the subtlety and/or wonder where __dir_path__ is).
Good point. Well, then __fspath__ for me. I knew instantly what it means especially considering btrfs, ntfs, xfs, zfs, etc. Furthermore, we MIGHT later want some URI support, so I don't know off the top of my head if there's a difference between __fspath__ and __urlpath__ but better separate it now. Later we can re-merge then if necessary. Best, Sven

On Wed, 6 Apr 2016 at 13:54 Sven R. Kunze <srkunze@mail.de> wrote:
On 06.04.2016 22:28, Brett Cannon wrote:
On Wed, 6 Apr 2016 at 13:20 Sven R. Kunze <srkunze@mail.de> wrote:
What about
__file_path__
Can be a directory as well (and you could argue semantics of file system inodes, beginners won't know the subtlety and/or wonder where __dir_path__ is).
Good point.
Well, then __fspath__ for me.
I knew instantly what it means especially considering btrfs, ntfs, xfs, zfs, etc.
Furthermore, we MIGHT later want some URI support, so I don't know off the top of my head if there's a difference between __fspath__ and __urlpath__ but better separate it now. Later we can re-merge then if necessary.
There's a difference as a URL represents something different than a file system path (URI doesn't necessarily). Plus the serialized format would be different, etc.

On 06.04.2016 22:55, Brett Cannon wrote:
On Wed, 6 Apr 2016 at 13:54 Sven R. Kunze <srkunze@mail.de> wrote:
Furthermore, we MIGHT later want some URI support, so I don't know off the top of my head if there's a difference between __fspath__ and __urlpath__ but better separate it now. Later we can re-merge them if necessary.
There's a difference as a URL represents something different than a file system path (URI doesn't necessarily). Plus the serialized format would be different, etc.
Sure. URLs and URIs are more than just paths. I would expect __urlpath__ to be different than __url__ itself, but that's a different discussion.
So, __fspath__ for me. :)
Best, Sven

On 6 April 2016 at 19:32, Brett Cannon <brett@python.org> wrote:
Now we need clear details. :) Some open questions are:
1. Name: __path__, __fspath__, or something else?
__fspath__
+1 for __path__, +0 for __fspath__ (I don't know how widespread the notion that "fs" means "file system" is).
Agreed. But if we have a builtin, it should follow the name of the special attribute/method. And I'm not that keen on having a builtin with a generic name like 'path'.
2. Method or attribute? (changes what kind of one-liner you might use in libraries, but I think historically all protocols have been methods and the serialized string representation might be costly to build)
I would prefer an attribute, but yeah I think dunders are typically methods, and I don't see this being special enough to not follow that trend.
Depends on what we want to tell 3rd-party libraries to do to support pathlib if they are on 3.3 or if they are worried about people using Python 3.4.2 or 3.5.1. An attribute still works with `getattr(path, '__path__', path)`. But with a method you probably want either `path.__path__() if hasattr(path, '__path__') else path` or `getattr(path, '__path__', lambda: path)()`.
I'm a little confused by this. To support the older pathlib, they have to do patharg = str(patharg), because *none* of the proposed attributes (path or __path__) will exist. The getattr trick is needed to support the *new* pathlib, when you need a real string. Currently you need a string if you call stdlib functions or builtins. If we fix the stdlib/builtins, the need goes away for those cases, but remains if you need to call libraries that *don't* support pathlib (os.path will likely be one of those) or do direct string manipulation. In practice, I see the getattr trick as an "easy fix" for libraries that want to add support but in a minimally-intrusive way. On that basis, making the trick easy to use is important, which argues for an attribute.
3. Built-in? (name is dependent on #1 if we add one)
fspath() -- and it would be handy to have a function that returns either the __fspath__ results, or the string (if it was one), or raises an exception if neither of the above works out.
fspath regardless of the name chosen in #1 - a new builtin called path just has too much likelihood of clashing with user code. But I'm not sure we need a builtin. I'm not at all clear how frequently we expect user code to need to use this protocol. Users can't use the builtin if they want to be backward compatible, but code that doesn't need backward compatibility can probably just work with pathlib (and the stdlib support for it) directly. For display, the implicit conversion to str is fine. For "get me a string representing the path", is the "path" attribute being abandoned in favour of this special method? I'm inclined to think that if you are writing "pure pathlib" code, pathobj.path looks more readable than fspath(pathobj) - certainly no *less* readable. But I'm not one of the people who disliked using .path, so I'm probably not best placed to judge. It would be good if someone who *does* feel strongly could explain why fspath(pathobj) is better than pathobj.path.
So:
    # Attribute
    def fspath(path):
        if hasattr(path, '__path__'):
            return path.__path__
        if isinstance(path, str):
            return path
        raise NotImplementedError  # Or TypeError?
    # Method
    def fspath(path):
        try:
            return path.__path__()
        except AttributeError:
            if isinstance(path, str):
                return path
        raise TypeError  # Or NotImplementedError?
You could of course use try/except for the attribute case. Or hasattr for the method case (where it would avoid masking AttributeError exceptions raised within the dunder method call, a possibility if user classes implement their own version of the protocol).
Paul
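As a rough illustration of the one-liners discussed above (not a proposed API), the sketch below compares the attribute form and the method form of the protocol. The classes AttrPath and MethodPath and the three helper functions are hypothetical names invented for this example only.

```python
class AttrPath:
    """Hypothetical path type exposing __path__ as a plain attribute."""
    def __init__(self, raw):
        self.__path__ = raw


class MethodPath:
    """Hypothetical path type exposing __path__ as a method."""
    def __init__(self, raw):
        self._raw = raw

    def __path__(self):
        return self._raw


def from_attribute(path):
    # Attribute protocol: getattr with a default also covers plain strings.
    return getattr(path, '__path__', path)


def from_method(path):
    # Method protocol: the default must be a callable returning the argument.
    return getattr(path, '__path__', lambda: path)()


def from_method_hasattr(path):
    # Method protocol, spelled with hasattr and a conditional expression.
    return path.__path__() if hasattr(path, '__path__') else path


print(from_attribute(AttrPath('/tmp/a')), from_attribute('/tmp/b'))
print(from_method(MethodPath('/tmp/c')), from_method('/tmp/d'))
print(from_method_hasattr(MethodPath('/tmp/e')), from_method_hasattr('/tmp/f'))
```

All three one-liners pass a plain string through unchanged; the only difference is how much boilerplate the method form needs.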

On Wed, 6 Apr 2016 at 12:32 Paul Moore <p.f.moore@gmail.com> wrote:
On 6 April 2016 at 19:32, Brett Cannon <brett@python.org> wrote:
Now we need clear details. :) Some open questions are:
1. Name: __path__, __fspath__, or something else?
__fspath__
+1 for __path__, +0 for __fspath__ (I don't know how widespread the notion that "fs" means "file system" is).
Agreed. But if we have a builtin, it should follow the name of the special attribute/method. And I'm not that keen on having a builtin with a generic name like 'path'.
2. Method or attribute? (changes what kind of one-liner you might use in libraries, but I think historically all protocols have been methods and the serialized string representation might be costly to build)
I would prefer an attribute, but yeah I think dunders are typically methods, and I don't see this being special enough to not follow that trend.
Depends on what we want to tell 3rd-party libraries to do to support pathlib if they are on 3.3 or if they are worried about people using Python 3.4.2 or 3.5.1. An attribute still works with `getattr(path, '__path__', path)`. But with a method you probably want either `path.__path__() if hasattr(path, '__path__') else path` or `getattr(path, '__path__', lambda: path)()`.
I'm a little confused by this. To support the older pathlib, they have to do patharg = str(patharg), because *none* of the proposed attributes (path or __path__) will exist.
The getattr trick is needed to support the *new* pathlib, when you need a real string. Currently you need a string if you call stdlib functions or builtins. If we fix the stdlib/builtins, the need goes away for those cases, but remains if you need to call libraries that *don't* support pathlib (os.path will likely be one of those) or do direct string manipulation.
In practice, I see the getattr trick as an "easy fix" for libraries that want to add support but in a minimally-intrusive way. On that basis, making the trick easy to use is important, which argues for an attribute.
So then where's the confusion? :) You seem to get the points. I personally find `path.__path__() if hasattr(path, '__path__') else path` also readable (if obviously a bit longer). -Brett
3. Built-in? (name is dependent on #1 if we add one)
fspath() -- and it would be handy to have a function that returns either the __fspath__ results, or the string (if it was one), or raises an exception if neither of the above works out.
fspath regardless of the name chosen in #1 - a new builtin called path just has too much likelihood of clashing with user code.
But I'm not sure we need a builtin. I'm not at all clear how frequently we expect user code to need to use this protocol. Users can't use the builtin if they want to be backward compatible, but code that doesn't need backward compatibility can probably just work with pathlib (and the stdlib support for it) directly. For display, the implicit conversion to str is fine. For "get me a string representing the path", is the "path" attribute being abandoned in favour of this special method?
Yes.
I'm inclined to think that if you are writing "pure pathlib" code, pathobj.path looks more readable than fspath(pathobj) - certainly no *less* readable.
I don't know what you mean by "pure pathlib". You mean code that only works with pathlib objects? Or do you mean code that accepts pathlib objects but uses strings internally? -Brett
But I'm not one of the people who disliked using .path, so I'm probably not best placed to judge. It would be good if someone who *does* feel strongly could explain why fspath(pathobj) is better than pathobj.path.
So:
    # Attribute
    def fspath(path):
        if hasattr(path, '__path__'):
            return path.__path__
        if isinstance(path, str):
            return path
        raise NotImplementedError  # Or TypeError?
    # Method
    def fspath(path):
        try:
            return path.__path__()
        except AttributeError:
            if isinstance(path, str):
                return path
        raise TypeError  # Or NotImplementedError?
You could of course use try/except for the attribute case. Or hasattr for the method case (where it would avoid masking AttributeError exceptions raised within the dunder method call, a possibility if user classes implement their own version of the protocol).
Paul
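The point about masking AttributeError can be made concrete with a small sketch. BrokenPath is a hypothetical user class whose protocol method contains a bug; fspath_try and fspath_hasattr are illustrative names for the two variants of the method-based helper, and the error messages are assumptions.

```python
class BrokenPath:
    """Hypothetical user class whose protocol method has a bug."""
    def __path__(self):
        # This lookup fails, raising AttributeError from *inside* the method.
        return self.missing_attribute


def fspath_try(path):
    # try/except variant: it cannot tell "no __path__ at all" apart from
    # "__path__ exists but raised AttributeError while running".
    try:
        return path.__path__()
    except AttributeError:
        if isinstance(path, str):
            return path
    raise TypeError('not a path-like object')


def fspath_hasattr(path):
    # hasattr variant: only the lookup is guarded, so an error raised by
    # the method itself propagates unchanged.
    if hasattr(path, '__path__'):
        return path.__path__()
    if isinstance(path, str):
        return path
    raise TypeError('not a path-like object')


for helper in (fspath_try, fspath_hasattr):
    try:
        helper(BrokenPath())
    except Exception as exc:
        print(helper.__name__, 'raised', type(exc).__name__)
```

With the try/except variant the real bug is reported as a misleading TypeError, while the hasattr variant surfaces the original AttributeError.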

On 6 April 2016 at 20:39, Brett Cannon <brett@python.org> wrote:
I'm a little confused by this. To support the older pathlib, they have to do patharg = str(patharg), because *none* of the proposed attributes (path or __path__) will exist.
The getattr trick is needed to support the *new* pathlib, when you need a real string. Currently you need a string if you call stdlib functions or builtins. If we fix the stdlib/builtins, the need goes away for those cases, but remains if you need to call libraries that *don't* support pathlib (os.path will likely be one of those) or do direct string manipulation.
In practice, I see the getattr trick as an "easy fix" for libraries that want to add support but in a minimally-intrusive way. On that basis, making the trick easy to use is important, which argues for an attribute.
So then where's the confusion? :) You seem to get the points. I personally find `path.__path__() if hasattr(path, '__path__') else path` also readable (if obviously a bit longer).
The confusion is that you seem to be saying that people can use getattr(path, '__path__', path) to support older versions of Python. But the older versions are precisely the ones that don't have __path__ so you won't be supporting them.
3. Built-in? (name is dependent on #1 if we add one)
fspath() -- and it would be handy to have a function that returns either the __fspath__ results, or the string (if it was one), or raises an exception if neither of the above works out.
fspath regardless of the name chosen in #1 - a new builtin called path just has too much likelihood of clashing with user code.
But I'm not sure we need a builtin. I'm not at all clear how frequently we expect user code to need to use this protocol. Users can't use the builtin if they want to be backward compatible, but code that doesn't need backward compatibility can probably just work with pathlib (and the stdlib support for it) directly. For display, the implicit conversion to str is fine. For "get me a string representing the path", is the "path" attribute being abandoned in favour of this special method?
Yes.
OK. So the idiom to get a string from a known Path object would be any of:
1. str(path)
2. fspath(path)
3. path.__path__()
(1) is safe if you know you have a Path object, but could incorrectly convert non-Path objects. (2) is safe in all cases. (3) is ugly. Did I miss any options?
So I think we need a builtin.
Code that needs to be backward compatible will still have to use str(path), because neither the builtin nor the __path__ protocol will exist in older versions of Python.
Maybe a compatibility library could add
    try:
        fspath
    except NameError:
        try:
            import pathlib
            def fspath(p):
                if isinstance(p, pathlib.Path):
                    return str(p)
                return p
        except ImportError:
            def fspath(p):
                return p
It's messy, like all compatibility code, but it allows code to use fspath(p) in older versions.
I'm inclined to think that if you are writing "pure pathlib" code, pathobj.path looks more readable than fspath(pathobj) - certainly no *less* readable.
I don't know what you mean by "pure pathlib". You mean code that only works with pathlib objects? Or do you mean code that accepts pathlib objects but uses strings internally?
I mean code that knows it has a Path object to work with (and not a string or anything else). But the point is moot if the path attribute is going away.
Other than to say that I do prefer the name "path", I just don't think it's a reasonable name for a builtin. Even if it's OK for user variables to have the same name as builtins, IDEs tend to colour builtins differently, which is distracting. (Temporary variables named "file" or "dir" are the ones I hit frequently...)
If all we're debating is the name, though, I think we're pretty much there :-)
Paul
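To illustrate why str(path) is only safe when you already know you have a path, here is a minimal sketch. The fspath function below is a stand-in written for this example (its TypeError branch is an assumption, since the exact behaviour was still being debated), not an existing builtin.

```python
import pathlib


def fspath(p):
    """Stand-in helper for illustration; the TypeError branch is assumed."""
    if isinstance(p, pathlib.PurePath):
        return str(p)
    if isinstance(p, str):
        return p
    raise TypeError('expected a path or str, got ' + type(p).__name__)


path = pathlib.PurePosixPath('/tmp/example.txt')

# Idiom 1: str() works for a real path object...
print(str(path))
# ...but it silently "converts" objects that are not paths at all.
print(repr(str(3.5)), repr(str(None)))

# Idiom 2: the helper accepts paths and strings and rejects everything else.
print(fspath(path), fspath('/tmp/other.txt'))
try:
    fspath(3.5)
except TypeError as exc:
    print('rejected:', exc)
```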

On Wed, 6 Apr 2016 at 15:22 Paul Moore <p.f.moore@gmail.com> wrote:
On 6 April 2016 at 20:39, Brett Cannon <brett@python.org> wrote:
I'm a little confused by this. To support the older pathlib, they have to do patharg = str(patharg), because *none* of the proposed attributes (path or __path__) will exist.
The getattr trick is needed to support the *new* pathlib, when you need a real string. Currently you need a string if you call stdlib functions or builtins. If we fix the stdlib/builtins, the need goes away for those cases, but remains if you need to call libraries that *don't* support pathlib (os.path will likely be one of those) or do direct string manipulation.
In practice, I see the getattr trick as an "easy fix" for libraries that want to add support but in a minimally-intrusive way. On that basis, making the trick easy to use is important, which argues for an attribute.
So then where's the confusion? :) You seem to get the points. I personally find `path.__path__() if hasattr(path, '__path__') else path` also readable (if obviously a bit longer).
The confusion is that you seem to be saying that people can use getattr(path, '__path__', path) to support older versions of Python. But the older versions are precisely the ones that don't have __path__ so you won't be supporting them.
Because pathlib is provisional the change will go into the next releases of Python 3.4, 3.5, and in 3.6 so new-old will have whatever we do. :) I think the key point is that this sort of thing will occur before you have access to some new built-in or something.
3. Built-in? (name is dependent on #1 if we add one)
fspath() -- and it would be handy to have a function that returns either the __fspath__ results, or the string (if it was one), or raises an exception if neither of the above works out.
fspath regardless of the name chosen in #1 - a new builtin called path just has too much likelihood of clashing with user code.
But I'm not sure we need a builtin. I'm not at all clear how frequently we expect user code to need to use this protocol. Users can't use the builtin if they want to be backward compatible, but code that doesn't need backward compatibility can probably just work with pathlib (and the stdlib support for it) directly. For display, the implicit conversion to str is fine. For "get me a string representing the path", is the "path" attribute being abandoned in favour of this special method?
Yes.
OK. So the idiom to get a string from a known Path object would be any of:
1. str(path)
2. fspath(path)
3. path.__path__()
(1) is safe if you know you have a Path object, but could incorrectly convert non-Path objects. (2) is safe in all cases. (3) is ugly. Did I miss any options?
Other than path.__path__ being an attribute, nope.
So I think we need a builtin.
Well, the ugliness shouldn't survive forever if the community shifts over to using pathlib while the built-in will. We also don't have a built-in for __index__() so it depends on whether we expect this sort of thing to be the purview of library authors or if normal people will be interacting with it (it's probably both during the transition, but I don't know afterwards).
Code that needs to be backward compatible will still have to use str(path), because neither the builtin nor the __path__ protocol will exist in older versions of Python.
str(path) will definitely work, path.__path__ will work if you're running the next set of bugfix releases. fspath(path) will only work in Python 3.6 and newer.
Maybe a compatibility library could add
    try:
        fspath
    except NameError:
        try:
            import pathlib
            def fspath(p):
                if isinstance(p, pathlib.Path):
                    return str(p)
                return p
        except ImportError:
            def fspath(p):
                return p
It's messy, like all compatibility code, but it allows code to use fspath(p) in older versions.
I would tweak it to check for __fspath__ before it resorted to calling str(), but yes, that could be something people use.
I'm inclined to think that if you are writing "pure pathlib" code, pathobj.path looks more readable than fspath(pathobj) - certainly no *less* readable.
I don't know what you mean by "pure pathlib". You mean code that only works with pathlib objects? Or do you mean code that accepts pathlib objects but uses strings internally?
I mean code that knows it has a Path object to work with (and not a string or anything else). But the point is moot if the path attribute is going away.
Other than to say that I do prefer the name "path", I just don't think it's a reasonable name for a builtin. Even if it's OK for user variables to have the same name as builtins, IDEs tend to colour builtins differently, which is distracting. (Temporary variables named "file" or "dir" are the ones I hit frequently...)
If all we're debating is the name, though, I think we're pretty much there :-)
It seems like __fspath__ may be leading as a name, but not that many people have spoken up. But that is not the only thing still up for debate. :)
We have not settled on whether a built-in is necessary. Maybe whatever function we come up with should live in pathlib itself and not be a built-in?
We have also not settled on whether __fspath__ should be a method or attribute as that changes the boilerplate one-liner people may use if a built-in isn't available. This is the first half of the protocol.
What exactly should this helper function do? E.g. does it simply return its argument if __fspath__ isn't defined, or does it check for __fspath__, then if it's an instance of str, then TypeError? This is the second half of the protocol and will end up defining what a "path-like object" represents.
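The two candidate behaviours for the helper can be sketched side by side. This assumes the method form of __fspath__; DemoPath, fspath_permissive and fspath_strict are hypothetical names used only to show the difference, and neither function is presented as the agreed design.

```python
class DemoPath:
    """Hypothetical path-like class implementing __fspath__ as a method."""
    def __init__(self, raw):
        self._raw = raw

    def __fspath__(self):
        return self._raw


def fspath_permissive(obj):
    # Option A: return the argument unchanged when __fspath__ is missing.
    if hasattr(obj, '__fspath__'):
        return obj.__fspath__()
    return obj


def fspath_strict(obj):
    # Option B: check __fspath__, then str, otherwise raise TypeError.
    if hasattr(obj, '__fspath__'):
        return obj.__fspath__()
    if isinstance(obj, str):
        return obj
    raise TypeError('not a path-like object: ' + type(obj).__name__)


print(fspath_permissive(DemoPath('/tmp/a')), fspath_strict(DemoPath('/tmp/a')))
print(repr(fspath_permissive(42)))  # the non-path object leaks through
try:
    fspath_strict(42)
except TypeError as exc:
    print('strict helper rejected it:', exc)
```

Under option A the definition of "path-like" is left to whatever consumes the result; under option B the helper itself defines it as "has __fspath__ or is a str".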

Note: While I do not object to the bike shed colors being proposed, if you call the attribute .__path__ that is somewhat confusing when thinking about the import system which declares that *"any module that contains a __path__ attribute is considered a package"*.
So would module.__path__ become a Path instance in a potential future making module.__path__.__path__ meaningfully confusing? ;)
I'm not worried about people who shove pathlib.Path instances in as values into sys.modules and expect anything but pain. :P
__gps__
On Wed, Apr 6, 2016 at 3:46 PM Brett Cannon <brett@python.org> wrote:
_______________________________________________
Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/greg%40krypto.org

On Wed, Apr 6, 2016 at 3:54 PM, Gregory P. Smith <greg@krypto.org> wrote:
Note: While I do not object to the bike shed colors being proposed, if you call the attribute .__path__ that is somewhat confusing when thinking about the import system which declares that "any module that contains a __path__ attribute is considered a package".
To me this observation seems to rule out __path__ as an option: even if they wouldn't clash in practice, then right now googling __path__ sends you straight to the import system documentation. If we overload the meaning of the string then it'll make a mess of the trying-to-figure-out-what-this-__thing__-is experience. -n -- Nathaniel J. Smith -- https://vorpus.org
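The existing meaning of __path__ is easy to check. In the sketch below, json and string are simply convenient standard-library examples of a package and a plain module, and is_package is a hypothetical helper name.

```python
import json    # a regular package (a directory containing __init__.py)
import string  # a plain single-file module


def is_package(module):
    # The import system treats any module with a __path__ attribute
    # as a package.
    return hasattr(module, '__path__')


print('json is a package:', is_package(json))
print('string is a package:', is_package(string))
```

Reusing the same dunder name for a path-to-string protocol would give it a second, unrelated meaning on module objects.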

On Wed, 6 Apr 2016 at 15:54 Gregory P. Smith <greg@krypto.org> wrote:
Note: While I do not object to the bike shed colors being proposed, if you call the attribute .__path__ that is somewhat confusing when thinking about the import system which declares that *"any module that contains a __path__ attribute is considered a package"*.
So would module.__path__ become a Path instance in a potential future making module.__path__.__path__ meaningfully confusing? ;)
I'm not worried about people who shove pathlib.Path instances in as values into sys.modules and expect anything but pain. :P
Ah, good point. I think that kills __path__ then as an option. -Brett
__gps__

On 04/06/2016 04:27 PM, Brett Cannon wrote:
On Wed, 6 Apr 2016 at 15:54 Gregory P. Smith wrote:
So would module.__path__ become a Path instance in a potential future making module.__path__.__path__ meaningfully confusing? ;)
I'm not worried about people who shove pathlib.Path instances in as values into sys.modules and expect anything but pain. :P
Ah, good point. I think that kills __path__ then as an option.
Excellent! Narrowing the field then to:
__fspath__
__os_path__
Step right up! Cast yer votes!
-- ~Ethan~

On 4/6/2016 4:44 PM, Ethan Furman wrote:
On 04/06/2016 04:27 PM, Brett Cannon wrote:
On Wed, 6 Apr 2016 at 15:54 Gregory P. Smith wrote:
So would module.__path__ become a Path instance in a potential future making module.__path__.__path__ meaningfully confusing? ;)
I'm not worried about people who shove pathlib.Path instances in as values into sys.modules and expect anything but pain. :P
Ah, good point. I think that kills __path__ then as an option.
Excellent! Narrowing the field then to:
__fspath__
-1: not all os names that look like files actually refer to the file system: pipes, devices, etc.
__os_path__
+1: the special names are os dependent, so os seems like an appropriate prefix.
Step right up! Cast yer votes!
-- ~Ethan~

Ah, good point. I think that kills __path__ then as an option.
Darn. I really preferred that. Oh well.
__fspath__
+0.1 But not a big deal. I think this is pretty much for occasional use by library authors, so not a big deal what it is named.
Which also means that I don't think we need a built-in function that calls it, either. How often do people need a stringified-path version of an arbitrary object?
Which makes me think: str() calls __str__ on an arbitrary object, and creates a new string object. But fspath(), if it exists, would call __fspath__ on an arbitrary object, and create a new string -- not a new Path. That may be confusing...
If we were starting from scratch, I suppose __path__ would return a Path object -- it would be a protocol one could use to duck-type a path. But since we have history, we are creating a protocol that conforms to the existing string-as-path protocol.
So are we imagining that future libs will be written that only take objects with a __fspath__ method? In which case, do we need to add it to str? In which case, this is all kind of pointless.
Or maybe all future libs will continue to accept either an str or an object with __fspath__. In which case, this is pretty pointless, too.
I guess what I'm wondering is if we are stuck with str-paths as the lingua-Franca for paths forever. In which case, we should embrace that and just call str() on anything passed in as a path argument.
Sure, then open(3.5) will give you a file not found error, or maybe create a file with a weird name, but really? Who's going to make that mistake and not figure it out really quickly?
-CHB

On 04/06/2016 05:43 PM, Chris Barker - NOAA Federal wrote:
__fspath__
+0.1
But not a big deal. I think this is pretty much for occasional use by library authors, so not a big deal what it is named.
It's mostly for the stdlib itself. I imagine that most libraries would just take what they are given and pass it along to open or os.stat or whatever.
Which also means that I don't think we need a built-in function that calls it, either. How often do people need a stringified-path version of an arbitrary object?
Not often.
Which makes me think: str() calls __str__ on an arbitrary object, and creates a new string object.
But fspath(), if it exists, would call __fspath__ on an arbitrary object, and create a new string -- not a new Path. That may be confusing...
It would be more along the lines of pickle -- give me the standard serialized form of this Path, please.
If we were starting from scratch, I suppose __path__ would return a Path object -- it would be a protocol one could use to duck-type a path.
Sure.
But since we have history, we are creating a protocol that conforms to the existing string-as-path protocol.
Yup.
So are we imagining that future libs will be written that only take objects with a __fspath__ method? In which case, do we need to add it to str? In which case, this is all kind of pointless.
We are imagining that future libraries that have to muck about with paths will work with Path objects, either by accepting them or converting to them as the (possibly) stringified paths are passed in -- and when necessary those libs can pass either the Path obj or the stringified path to the stdlib.
Or maybe all future libs will continue to accept either an str or an object with __fspath__. In which case, this is pretty pointless, too.
The point is to allow future programs to work with Path and be able to work with the stdlib as seamlessly and painlessly as possible.
I guess what I'm wondering is if we are stuck with str-paths as the lingua-Franca for paths forever. In which case, we should embrace that and just call str() on anything passed in as a path argument.
Nah. That's inviting trouble and pain, and we're trying to get away from that.
Sure, then open(3.5) will give you a file not found error, or maybe create a file with a weird name, but really? Who's going to make that mistake and not figure it out really quickly?
Well, since the 3.5 was actually in my_var, and could have been written before it was read, it could easily be days, weeks, or even months -- probably after the last guy quit, you took the job, the server died, and you had to restore from backup -- at which point you'll see all the really, really strange file names and wonder what they are. And of course, whatever logic was determining those weird names is now out of sync because of the server swap. And, yeah, I've seen weirder things happen. -- ~Ethan~
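The hazard described here can be shown in a few lines. This is a minimal sketch, not anything proposed in the thread: path_via_str and path_via_check are hypothetical names, and the stricter check is just one possible alternative to calling str() on everything.

```python
import pathlib


def path_via_str(obj):
    # The "just call str()" approach: every object has some string form,
    # so nothing is ever rejected.
    return str(obj)


def path_via_check(obj):
    # Hypothetical stricter alternative: only str and pathlib paths pass.
    if isinstance(obj, (str, pathlib.PurePath)):
        return str(obj)
    raise TypeError("not a path: " + repr(obj))


my_var = 3.5  # a value that was never meant to be a file name
print(path_via_str(my_var))  # prints 3.5
```

With the str() approach the float quietly becomes the file name "3.5"; with the checked version the mistake surfaces as a TypeError at the call site.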

On Wed, Apr 6, 2016 at 5:57 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
But not a big deal. I think this is pretty much for occasional use by
library authors, so not a big deal what it is named.
It's mostly for the stdlib itself. I imagine that most libraries would just take what they are given and pass it along to open or os.stat or whatever.
Exactly -- so we really don't need a builtin shortcut.
Which makes me think: str() calls __str__ on an arbitrary object, and
creates a new string object.
But fspath(), if it exists, would call __fspath__ on an arbitrary object, and create a new string -- not a new Path. That may be confusing...
It would be more along the lines of pickle -- give me the standard serialized form of this Path, please.
well, give me the standard serialized-path of this arbitrary object, yes?
So are we imagining that future libs will be written that only take
objects with a __fspath__ method? In which case, do we need to add it to str? In which case, this is all kind of pointless.
We are imagining that future libraries that have to muck about with paths will work with Path objects, either by accepting them or converting to them as the (possibly) stringified paths are passed in -- and when necessary those libs can pass either the Path obj or the stringified path to the stdlib.
if that's the case, we don't need the __fspath__ protocol -- the reason for the protocol is that we imagine there may be any number of third-party objects to represent/work-with paths, that aren't strings or stdlib Path objects.
Or maybe all future libs will continue to accept either an str or an object with __fspath__. In which case, this is pretty pointless, too.
The point is to allow future programs to work with Path and be able to work with the stdlib as seamlessly and painlessly as possible.
again, we don't need a new protocol for that -- we only need the protocol if we want arbitrary future programs to work with arbitrary path implementations. which I suppose we do -- there are already other path implementations out there (though at least some are strings :-) )
I guess what I'm wondering is if we are stuck with str-paths as the
lingua-Franca for paths forever. In which case, we should embrace that and just call str() on anything passed in as a path argument.
Nah. That's inviting trouble and pain, and we're trying to get away from that.
Sure, then open(3.5) will give you a file not found error, or maybe
create a file with a weird name, but really? Who's going to make that mistake and not figure it out really quickly?
Well, since the 3.5 was actually in my_var, and could have been written before it was read, it could easily be days, weeks, or even months -- probably after the last guy quit, you took the job, the server died, and you had to restore from backup -- at which point you'll see all the really, really strange file names and wonder what they are. And of course, whatever logic was determining those weird names is now out of sync because of the server swap.
And, yeah, I've seen weirder things happen.
People can totally screw up path variables as strings or Path objects too -- I'm having trouble seeing that this is all that more likely -- after all, python is a dynamic language -- if we wanted full type safety, we wouldn't be using python...
Speaking of which, how is this going to work with the new type system? Do we need an ABC, rather than just a protocol?
But as long as we get to the stdlib taking Path objects, I'm happy :-)
-CHB
-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On 04/06/2016 08:50 PM, Chris Barker wrote:
On Wed, Apr 6, 2016 at 5:57 PM, Ethan Furman wrote:
It's mostly for the stdlib itself. I imagine that most libraries would just take what they are given and pass it along to open or os.stat or whatever.
Exactly -- so we really don't need a builtin shortcut.
Hey, we have to program the stdlib too! No need to make it harder for ourselves.
It would be more along the lines of pickle -- give me the standard serialized form of this Path, please.
well, give me the standard serialized-path of this arbitrary object, yes?
Yes. :)
We are imagining that future libraries that have to muck about with paths will work with Path objects, either by accepting them or converting to them as the (possibly) stringified paths are passed in -- and when necessary those libs can pass either the Path obj or the stringified path to the stdlib.
if that's the case, we don't need the __fspath__ protocol -- the reason for the protocol is that we imagine there may be any number of third-party objects to represent/work-with paths, that aren't strings or stdlib Path objects.
The purpose of the __os_path__ method is two-fold:
- its presence declares that the object is a path (or convertible to one)
- it does the conversion
Since we need it for ourselves there's no reason to prevent others from taking advantage of it.
The point is to allow future programs to work with Path and be able to work with the stdlib as seamlessly and painlessly as possible.
again, we don't need a new protocol for that -- we only need the protocol if we want arbitrary future programs to work with arbitrary path implementations.
I am certainly not opposed to that.
which I suppose we do -- there are already other path implementations out there (though at least some are strings :-) )
Right. And I'm already making changes to mine to work with this new stuff.
People can totally screw up path variables as strings or Path objects too -- I'm having trouble seeing that this is all that more likely -- after all, python is a dynamic language -- if we wanted full type safety, we wouldn't be using python...
Very True. ;)
Speaking of which, how is this going to work with the new type system? Do we need an ABC, rather than just a protocol?
I do not know, good question.
But as long as we get to the stdlib taking Path objects, I'm happy :-)
Excellent! -- ~Ethan~

Chris Barker writes:
which I suppose we do -- there are already other path implementations out there (though at least some are strings :-) )
Even so, their __fspath__ implementation might return syntactically canonicalized or realpath paths, rather than whatever is input. If cached and the path frequently accessed, the realpath implementation might be a significant win in some applications.

Chris Barker - NOAA Federal wrote:
But fspath(), if it exists, would call __fspath__ on an arbitrary object, and create a new string -- not a new Path. That may be confusing...
Maybe something like fspathstr/__fspathstr__ would be better? -- Greg

On 04/06/2016 11:15 PM, Greg Ewing wrote:
Chris Barker - NOAA Federal wrote:
But fspath(), if it exists, would call __fspath__ on an arbitrary object, and create a new string -- not a new Path. That may be confusing...
Maybe something like fspathstr/__fspathstr__ would be better?
As someone already said, we don't need to embed the type in the name. The point of the __os_path__ protocol is to return the serialized version of the Path the object represents. This would be somewhat similar to the various __reduce*__ protocols (which I thought had something to do with adding until I learned what they were for). -- ~Ethan~

On Apr 06 2016, Ethan Furman <ethan@stoneleaf.us> wrote:
On 04/06/2016 11:15 PM, Greg Ewing wrote:
Chris Barker - NOAA Federal wrote:
But fspath(), if it exists, would call __fspath__ on an arbitrary object, and create a new string -- not a new Path. That may be confusing...
Maybe something like fspathstr/__fspathstr__ would be better?
As someone already said, we don't need to embed the type in the name.
The point of the __os_path__ protocol is to return the serialized version of the Path the object represents. This would be somewhat similar to the various __reduce*__ protocols (which I thought had something to do with adding until I learned what they were for).
Does anyone anticipate any classes other than those from pathlib to come with such a method? It seems odd to me to introduce a special method (and potentially a built-in too) if it's only going to be used by a single module.
Why is:
path = getattr(obj, '__fspath__') if hasattr(obj, '__fspath__') else obj
better than
path = str(obj) if isinstance(obj, pathlib.Path) else obj
?
Yes, I know there are other pathlib-like modules out there. But isn't pathlib meant to replace them?
Best, Nikolaus
-- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F »Time flies like an arrow, fruit flies like a Banana.«

On Apr 7, 2016, at 6:48 AM, Nikolaus Rath <Nikolaus@rath.org> wrote:
Does anyone anticipate any classes other than those from pathlib to come with such a method?
It seems like it would be reasonable for pathlib.Path to call fspath on the path passed to pathlib.Path.__init__, which would mean that if other libraries implemented __fspath__ then you could pass their path objects to pathlib and it would just work (and similarly, if they also called fspath it would enable interoperation between all of the various path libraries). ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
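The interoperation described here might look roughly like the following. It is a sketch under assumptions, not pathlib's actual behaviour at the time of this thread: ThirdPartyPath and as_pathlib are hypothetical names, __fspath__ is assumed to be a method, and PurePosixPath is used only so the result is the same on every platform.

```python
import pathlib


class ThirdPartyPath:
    """Hypothetical path type from some other library."""

    def __init__(self, *parts):
        self._parts = parts

    def __fspath__(self):
        return "/".join(self._parts)


def as_pathlib(obj):
    """Hypothetical stand-in for what Path's constructor could do."""
    if isinstance(obj, pathlib.PurePath):
        return obj
    if isinstance(obj, str):
        return pathlib.PurePosixPath(obj)
    method = getattr(type(obj), "__fspath__", None)
    if method is None:
        raise TypeError("not a path: " + repr(obj))
    # Ask the foreign object for its string form, then wrap it.
    return pathlib.PurePosixPath(method(obj))


print(as_pathlib(ThirdPartyPath("usr", "lib")))  # prints usr/lib
```

If every path library both implemented the method and called the helper on its inputs, objects from any of them could be passed to any other.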

On Thu, Apr 7, 2016 at 4:03 AM, Donald Stufft <donald@stufft.io> wrote:
It seems like it would be reasonable for pathlib.Path to call fspath on the path passed to pathlib.Path.__init__, which would mean that if other libraries implemented __fspath__ then you could pass their path objects to pathlib and it would just work
and then any lib that needed a path, could simply wrap Path() around whatever was passed in. This is much like using np.array() if you want numpy arrays -- it works great. numpy is trickier because they are mutable and can be big, so you don't want to make a copy if you don't need to -- hence the np.asarray() function -- but Paths are immutable and far more lightweight. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Apr 07 2016, Donald Stufft <donald@stufft.io> wrote:
On Apr 7, 2016, at 6:48 AM, Nikolaus Rath <Nikolaus@rath.org> wrote:
Does anyone anticipate any classes other than those from pathlib to come with such a method?
It seems like it would be reasonable for pathlib.Path to call fspath on the path passed to pathlib.Path.__init__, which would mean that if other libraries implemented __fspath__ then you could pass their path objects to pathlib and it would just work (and similarly, if they also called fspath it would enable interoperation between all of the various path libraries).
Indeed, but my question is: is this actually going to happen? Are there going to be other libraries that will implement __fspath__, and will there be demand for pathlib to support them? Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F »Time flies like an arrow, fruit flies like a Banana.«

On 04/09/2016 07:32 AM, Nikolaus Rath wrote:
On Apr 07 2016, Donald Stufft <donald@stufft.io> wrote:
On Apr 7, 2016, at 6:48 AM, Nikolaus Rath <Nikolaus@rath.org> wrote:
Does anyone anticipate any classes other than those from pathlib to come with such a method?
It seems like it would be reasonable for pathlib.Path to call fspath on the path passed to pathlib.Path.__init__, which would mean that if other libraries implemented __fspath__ then you could pass their path objects to pathlib and it would just work (and similarly, if they also called fspath it would enable interoperation between all of the various path libraries).
Indeed, but my question is: is this actually going to happen? Are there going to be other libraries that will implement __fspath__, and will there be demand for pathlib to support them?
There will be at least one. :) -- ~Ethan~

On 7 April 2016 at 11:48, Nikolaus Rath <Nikolaus@rath.org> wrote:
Why is:
path = getattr(obj, '__fspath__') if hasattr(obj, '__fspath__') else obj
better than
path = str(obj) if isinstance(obj, pathlib.Path) else obj
One reason is that the former doesn't need you to import pathlib, which is good if you need to work with older versions of Python that don't have pathlib at all (yes, it's just some standard conditional import boilerplate, but it's additional messiness). Paul

On Thu, Apr 7, 2016 at 9:44 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
Excellent! Narrowing the field then to:
__fspath__
__os_path__
Step right up! Cast yer votes!
+0.9 for __fspath__; I'd prefer a one-word name, but with __path__ out of the running (which I agree with), there's no other obvious word. __fspath__ is a close second. -1 for __os_path__, unless it's reasonable to justify it as "most of the standard library uses Path objects, but os.path uses strings, so before you pass a Path to anything in os.path, you call path.ospath() on it, which calls __os_path__()". And that seems a bit hairy and roundabout; what it's _really_ doing is giving you back a string, and that has little to do with os.path. ChrisA

Chris Angelico wrote:
-1 for __os_path__, unless it's reasonable to justify it as "most of the standard library uses Path objects, but os.path uses strings, so before you pass a Path to anything in os.path, you call path.ospath() on it, which calls __os_path__()".
A less roundabout interpretation would be that it returns the path in a form that is directly acceptable to the OS. BTW, if __fspath__ is acceptable, __ospath__ (without the embedded _) should be as well. -- Greg

On Wed, Apr 6, 2016 at 3:46 PM, Brett Cannon <brett@python.org> wrote:
On Wed, 6 Apr 2016 at 15:22 Paul Moore <p.f.moore@gmail.com> wrote:
So I think we need a builtin.
Well, the ugliness shouldn't survive forever if the community shifts over to using pathlib while the built-in will. We also don't have a built-in for __index__() so it depends on whether we expect this sort of thing to be the purview of library authors or if normal people will be interacting with it (it's probably both during the transition, but I don't know afterwards).
For __index__ the "built-in" is:
from operator import index
-n
-- Nathaniel J. Smith -- https://vorpus.org

On Wed, 6 Apr 2016 at 16:25 Nathaniel Smith <njs@pobox.com> wrote:
On Wed, Apr 6, 2016 at 3:46 PM, Brett Cannon <brett@python.org> wrote:
On Wed, 6 Apr 2016 at 15:22 Paul Moore <p.f.moore@gmail.com> wrote:
So I think we need a builtin.
Well, the ugliness shouldn't survive forever if the community shifts over to using pathlib while the built-in will. We also don't have a built-in for __index__() so it depends on whether we expect this sort of thing to be the purview of library authors or if normal people will be interacting with it (it's probably both during the transition, but I don't know afterwards).
For __index__ the "built-in" is:
from operator import index
Which suggests perhaps we should have pathlib.fspath() instead of a built-in.
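For readers unfamiliar with the precedent being cited, this is how operator.index pairs with the __index__ special method. The Slot class is a hypothetical example type; operator.index itself is the real stdlib function.

```python
from operator import index


class Slot:
    """Hypothetical integer-like type, for illustration only."""

    def __init__(self, n):
        self._n = n

    def __index__(self):
        return self._n


items = ["a", "b", "c"]
print(index(Slot(2)))   # prints 2
print(items[Slot(1)])   # prints b
```

The helper lives in a module rather than in builtins, and it rejects objects that do not define the method (index(2.5) raises TypeError), which is the shape a pathlib.fspath() would follow.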

On 04/06/2016 04:26 PM, Brett Cannon wrote:
On Wed, 6 Apr 2016 at 16:25 Nathaniel Smith wrote:
For __index__ the "built-in" is:
from operator import index
Which suggests perhaps we should have pathlib.fspath() instead of a built-in.
+1 -- ~Ethan~

On Apr 6, 2016 6:31 PM, "Brett Cannon" <brett@python.org> wrote:
On Wed, 6 Apr 2016 at 16:25 Nathaniel Smith <njs@pobox.com> wrote:
On Wed, Apr 6, 2016 at 3:46 PM, Brett Cannon <brett@python.org> wrote:
On Wed, 6 Apr 2016 at 15:22 Paul Moore <p.f.moore@gmail.com> wrote:
So I think we need a builtin.
Well, the ugliness shouldn't survive forever if the community shifts over to using pathlib while the built-in will. We also don't have a built-in for __index__() so it depends on whether we expect this sort of thing to be the purview of library authors or if normal people will be interacting with it (it's probably both during the transition, but I don't know afterwards).
For __index__ the "built-in" is:
from operator import index
Which suggests perhaps we should have pathlib.fspath() instead of a built-in.
Would it make sense to instead have pathlib.Path.__init__?

On 04/06/2016 07:24 PM, Wes Turner wrote:
On Apr 6, 2016 6:31 PM, "Brett Cannon" wrote:
Which suggests perhaps we should have pathlib.fspath() instead of a built-in.
Would it make sense to instead have pathlib.Path.__init__?
We already have that -- it's what makes a Path. What we are looking for is a function that accepts a Path or a str and returns the Path as a str, or the str passed in. -- ~Ethan~

My mistake. On Wed, Apr 6, 2016 at 9:40 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
On 04/06/2016 07:24 PM, Wes Turner wrote:
On Apr 6, 2016 6:31 PM, "Brett Cannon" wrote:
Which suggests perhaps we should have pathlib.fspath() instead of a
built-in.
Would it make sense to instead have pathlib.Path.__init__?
We already have that -- it's what makes a Path.
What we are looking for is a function that accepts a Path or a str and returns the Path as a str, or the str passed in.
-- ~Ethan~

On 6 April 2016 at 23:46, Brett Cannon <brett@python.org> wrote:
str(path) will definitely work, path.__path__ will work if you're running the next set of bugfix releases. fspath(path) will only work in Python 3.6 and newer.
Ah, that was something I hadn't appreciated, that the builtin would be 3.6+ whereas the protocol would be added to current bugfix releases.
Maybe a compatibility library could add
try:
    fspath
except NameError:
    try:
        import pathlib
        def fspath(p):
            if isinstance(p, pathlib.Path):
                return str(p)
            return p
    except ImportError:
        def fspath(p):
            return p
It's messy, like all compatibility code, but it allows code to use fspath(p) in older versions.
I would tweak it to check for __fspath__ before it resorted to calling str(), but yes, that could be something people use.
Yeah, the above code assumes that if the builtin isn't available, nor will the protocol be (see my misunderstanding above). Paul
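A version of the shim with the suggested tweak, checking for the protocol before falling back to pathlib, could look like this. It is a sketch only: it assumes __fspath__ ends up being a method and that a future built-in, if any, is called fspath.

```python
try:
    fspath  # use the built-in if this Python provides one (an assumption)
except NameError:
    def fspath(p):
        # Prefer the protocol when the object's type defines it.
        method = getattr(type(p), "__fspath__", None)
        if method is not None:
            return method(p)
        # Otherwise fall back to recognising pathlib objects, if pathlib exists.
        try:
            import pathlib
        except ImportError:
            return p
        if isinstance(p, pathlib.PurePath):
            return str(p)
        return p
```

Code written against this shim can call fspath(p) everywhere and drop the shim once the oldest supported Python provides the real thing.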

On 04/06/2016 12:32 PM, Paul Moore wrote:
But I'm not one of the people who disliked using .path, so I'm probably not best placed to judge. It would be good if someone who *does* feel strongly could explain why fspath(pathobj) is better than pathobj.path.
fspath() would be useful because you can pass it a str or a Path and get a str back (or an exception if you pass the wrong thing in). Just like with Path you can pass a str or a Path and get a Path back (or an exception if ...). -- ~Ethan~

-- Ryan [ERROR]: Your autotools build scripts are 200 lines longer than your program. Something’s wrong. http://kirbyfan64.github.io/ On Apr 6, 2016 12:28 PM, "Brett Cannon" <brett@python.org> wrote:
With Ethan volunteering to do the work to help make a path protocol a thing -- and I'm willing to help along with propagating this through the stdlib where I think Serhiy might be interested in helping as well -- and a seeming consensus this is a good idea, it seems like this proposal has a chance of actually coming to fruition.
Now we need clear details. :) Some open questions are:
My votes:
Name: __path__, __fspath__, or something else?
__path__. Considering everything related to `pathlib` uses the word `path`, __fspath__ seems kind of odd.
Method or attribute? (changes what kind of one-liner you might use in libraries, but I think historically all protocols have been methods and the serialized string representation might be costly to build)
Method. Using an attribute would be needlessly inconsistent.
Built-in? (name is dependent on #1 if we add one) Add the method/attribute to str? (I assume so, much like __index__() is on int, but I have not seen it explicitly stated so I would rather clarify it)
I agree; this would avoid lots of excess complexity.
Expand the C API to have something like PyObject_Path()?
-1. PyFileObject was already removed from Python 3; it seems useless to add another one.
Some people have asked for the pathlib PEP to have a more fleshed out reasoning as to why pathlib doesn't inherit from str. If Antoine doesn't want to do it I can try to distill my blog post into a more succinct paragraph or two and update the PEP myself.
Is this going to require a PEP or if we can agree on the points here are
we just going to do it? If we think it requires a PEP I'm willing to write it, but I obviously have no issue if we skip that step either. :)
Oh, and we should resolve this before the next release of Python 3.4,
3.5, or 3.6 so that pathlib can be updated in those releases.
-Brett
On Wed, 6 Apr 2016 at 08:09 Ethan Furman <ethan@stoneleaf.us> wrote:
On 04/05/2016 11:57 PM, Nick Coghlan wrote:
On 6 April 2016 at 16:53, Nathaniel Smith <njs@pobox.com> wrote:
On Tue, Apr 5, 2016 at 11:29 PM, Nick Coghlan <ncoghlan@gmail.com>
wrote:
I'd missed the existing precedent in DirEntry.path, so simply taking that and running with it sounds good to me.
This makes me twitch slightly, because NumPy has had a whole set of problems due to the ancient and minimally-considered decision to assume a bunch of ad hoc non-namespaced method names fulfilled some protocol -- like all .sum methods will have a signature that's compatible with numpy's, and if an object has a .log method then surely that computes the logarithm (what else in computing could "log" possibly refer to?), etc. This experience may or may not be relevant, I'm not sure -- sometimes these kinds of twitches are good guides to intuition, and sometimes they are just knee-jerk responses to an old and irrelevant problem :-)
But you might want to at least think about how common it might be to have existing objects with unrelated attributes that happen to be called "path", and the bizarro problems that might be caused if someone accidentally passes one of them to a function that expects all .path attributes to be instances of this new protocol.
sys.path, for example.
That's why I'd actually prefer the implicit conversion protocol to be the more explicitly named "__fspath__", with suitable "__fspath__ = path" assignments added to DirEntry and pathlib. However, I'm also not offering to actually *do* the work here, and the casting vote goes to the folks pursuing the implementation effort.
If we decide upon __fspath__ (or __path__) I will do the work on pathlib and scandir to add those attributes.
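The concern about un-namespaced attribute names can be made concrete with the sys.path example mentioned above. This is an illustrative sketch: both helper names are hypothetical, and __fspath__ is treated as a plain attribute here, following the "__fspath__ = path" wording.

```python
import sys


def path_from_attribute(obj):
    # Naive duck typing on an un-namespaced attribute name.
    return getattr(obj, "path", obj)


def path_from_dunder(obj):
    # Namespaced lookup; unrelated objects are very unlikely to define it.
    return getattr(obj, "__fspath__", obj)


result = path_from_attribute(sys)
print(type(result).__name__)  # prints list
```

Passing the sys module by mistake yields a list of directories from the first helper, while the second simply hands the module back for the caller to reject.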

On Wed, 6 Apr 2016 at 12:29 Ryan Gonzalez <rymg19@gmail.com> wrote:
-- Ryan [ERROR]: Your autotools build scripts are 200 lines longer than your program. Something’s wrong. http://kirbyfan64.github.io/
On Apr 6, 2016 12:28 PM, "Brett Cannon" <brett@python.org> wrote:
With Ethan volunteering to do the work to help make a path protocol a thing -- and I'm willing to help along with propagating this through the stdlib where I think Serhiy might be interested in helping as well -- and a seeming consensus this is a good idea, it seems like this proposal has a chance of actually coming to fruition.
Now we need clear details. :) Some open questions are:
My votes:
Name: __path__, __fspath__, or something else?
__path__. Considering everything related to `pathlib` uses the word `path`, __fspath__ seems kind of odd.
Method or attribute? (changes what kind of one-liner you might use in libraries, but I think historically all protocols have been methods and the serialized string representation might be costly to build)
Method. Using an attribute would be needlessly inconsistent.
Built-in? (name is dependent on #1 if we add one) Add the method/attribute to str? (I assume so, much like __index__() is on int, but I have not seen it explicitly stated so I would rather clarify it)
I agree; this would avoid lots of excess complexity.
Expand the C API to have something like PyObject_Path()?
-1. PyFileObject was already removed from Python 3; it seems useless to add another one.
But that was removing a custom object, not a function that will implement whatever idiom we come up with for getting the string representation of a path. -Brett
Some people have asked for the pathlib PEP to have a more fleshed out reasoning as to why pathlib doesn't inherit from str. If Antoine doesn't want to do it I can try to distill my blog post into a more succinct paragraph or two and update the PEP myself.
Is this going to require a PEP or if we can agree on the points here are
we just going to do it? If we think it requires a PEP I'm willing to write it, but I obviously have no issue if we skip that step either. :)
Oh, and we should resolve this before the next release of Python 3.4,
3.5, or 3.6 so that pathlib can be updated in those releases.
-Brett
On Wed, 6 Apr 2016 at 08:09 Ethan Furman <ethan@stoneleaf.us> wrote:
On 04/05/2016 11:57 PM, Nick Coghlan wrote:
On 6 April 2016 at 16:53, Nathaniel Smith <njs@pobox.com> wrote:
On Tue, Apr 5, 2016 at 11:29 PM, Nick Coghlan <ncoghlan@gmail.com>
wrote:
I'd missed the existing precedent in DirEntry.path, so simply taking that and running with it sounds good to me.
This makes me twitch slightly, because NumPy has had a whole set of problems due to the ancient and minimally-considered decision to assume a bunch of ad hoc non-namespaced method names fulfilled some protocol -- like all .sum methods will have a signature that's compatible with numpy's, and if an object has a .log method then surely that computes the logarithm (what else in computing could "log" possibly refer to?), etc. This experience may or may not be relevant, I'm not sure -- sometimes these kinds of twitches are good guides to intuition, and sometimes they are just knee-jerk responses to an old and irrelevant problem :-)
But you might want to at least think about how common it might be to have existing objects with unrelated attributes that happen to be called "path", and the bizarro problems that might be caused if someone accidentally passes one of them to a function that expects all .path attributes to be instances of this new protocol.
sys.path, for example.
That's why I'd actually prefer the implicit conversion protocol to be the more explicitly named "__fspath__", with suitable "__fspath__ = path" assignments added to DirEntry and pathlib. However, I'm also not offering to actually *do* the work here, and the casting vote goes to the folks pursuing the implementation effort.
If we decide upon __fspath__ (or __path__) I will do the work on pathlib and scandir to add those attributes.
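The name-collision worry raised above can be made concrete. What follows is only a minimal sketch: `naive_path` and `dunder_path` are hypothetical helpers invented for illustration (nobody in the thread proposed them), and `dunder_path` assumes the attribute-style `__fspath__` that Nick describes.

```python
import sys


def naive_path(obj):
    """Hypothetical helper: trust any attribute that happens to be named ``path``."""
    return getattr(obj, "path", obj)


def dunder_path(obj):
    """Hypothetical helper: only trust the namespaced ``__fspath__`` name."""
    return getattr(obj, "__fspath__", obj)


# The sys module has an unrelated ``path`` attribute (the import search list),
# so the naive helper hands back a list instead of a file system path.
print(type(naive_path(sys)).__name__)

# With the dunder name, the unrelated object is passed through unchanged and
# the caller can reject it with an ordinary type check.
print(dunder_path(sys) is sys)
```

The point of the sketch is only that a plain `path` attribute is already in use for unrelated things, whereas a dunder name is very unlikely to collide by accident.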

On 04/06/2016 10:26 AM, Brett Cannon wrote:
2. Method or attribute? (changes what kind of one-liner you might use in libraries, but I think historically all protocols have been methods and the serialized string representation might be costly to build)
Having thought about this some more, it seems we have enough __dunder__ attributes that are plain strings that having this one also be a plain string should not be a problem:
- __name__
- __module__
- __file__
Since Paths are immutable the __os_path__ attribute isn't going to change and doesn't need to be a method.
-- ~Ethan~

On 04/06/2016 07:26 PM, Brett Cannon wrote:
With Ethan volunteering to do the work to help make a path protocol a thing -- and I'm willing to help along with propagating this through the stdlib where I think Serhiy might be interested in helping as well -- and a seeming consensus this is a good idea, it seems like this proposal has a chance of actually coming to fruition.
Now we need clear details. :) Some open questions are:
Throwing in my 2 bikesheds here, not having read all subthreads:
1. Name: __path__, __fspath__, or something else?
__path__ is already taken as a module attribute, so I would avoid it. __fspath__ is fine with me, although the more explicit variants are also ok. It's not like you need to read/write it constantly (that's the goal).
2. Method or attribute? (changes what kind of one-liner you might use in libraries, but I think historically all protocols have been methods and the serialized string representation might be costly to build)
An attribute would be somewhat inconsistent with the special-method lookup rules (looked up on the type, not the instance), so a method is probably a better choice.
3. Built-in? (name is dependent on #1 if we add one)
I don't think it warrants a builtin. I'd place it as a function in pathlib.
4. Add the method/attribute to str? (I assume so, much like __index__() is on int, but I have not seen it explicitly stated so I would rather clarify it)
+1.
5. Expand the C API to have something like PyObject_Path()?
+1 (with _Py_ at first) since you're going to need it in a lot of C functions. Georg
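The special-method lookup rule Georg refers to (implicit invocation looks on the type, not the instance) can be seen with an existing protocol. This is a small sketch using `__len__`; the `Demo` class is made up for illustration.

```python
class Demo:
    def __len__(self):
        return 1


d = Demo()

# Attach a replacement to the *instance*. Implicit invocation through len()
# ignores it, because special methods are looked up on type(d).
d.__len__ = lambda: 99

print(len(d))        # goes through type(d).__len__
print(d.__len__())   # explicit attribute access finds the instance attribute
```

A plain-attribute `__fspath__` read with `getattr(obj, '__fspath__', obj)` would be found on the instance first, which is the inconsistency being pointed out. A helper that wanted to follow the usual rule would presumably call `type(obj).__fspath__(obj)` instead.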

On Apr 7, 2016 1:22 AM, "Georg Brandl" <g.brandl@gmx.net> wrote:
On 04/06/2016 07:26 PM, Brett Cannon wrote:
1. Name: __path__, __fspath__, or something else?
__path__ is already taken as a module attribute, so I would avoid it. __fspath__ is fine with me, although the more explicit variants are also ok. It's not like you need to read/write it constantly (that's the goal).
+1 I also think that __ospath__ may be more correct since it is an OS-dependent representation, e.g. slash vs. backslash.
2. Method or attribute? (changes what kind of one-liner you might use in libraries, but I think historically all protocols have been methods and the serialized string representation might be costly to build)
An attribute would be somewhat inconsistent with the special-method lookup rules (looked up on the type, not the instance), so a method is probably a better choice.
I was just about to point this out. The deviation by pickle (lookup on instance rather than type) has been a source of pain.
3. Built-in? (name is dependent on #1 if we add one)
I don't think it warrants a builtin. I'd place it as a function in pathlib.
+1
4. Add the method/attribute to str? (I assume so, much like __index__() is on int, but I have not seen it explicitly stated so I would rather clarify it)
+1.
+1
5. Expand the C API to have something like PyObject_Path()?
+1 (with _Py_ at first) since you're going to need it in a lot of C functions.
+1
-eric

On 7 April 2016 at 03:26, Brett Cannon <brett@python.org> wrote:
With Ethan volunteering to do the work to help make a path protocol a thing -- and I'm willing to help along with propagating this through the stdlib where I think Serhiy might be interested in helping as well -- and a seeming consensus this is a good idea, it seems like this proposal has a chance of actually coming to fruition.
Now we need clear details. :) Some open questions are:
Name: __path__, __fspath__, or something else?
__fspath__
Method or attribute? (changes what kind of one-liner you might use in libraries, but I think historically all protocols have been methods and the serialized string representation might be costly to build)
Method, as long as there's a helper function somewhere
Built-in? (name is dependent on #1 if we add one)
os.fspath (alongside os.fsencode and os.fsdecode) (Putting this in a module low in the dependency stack makes it easy for other modules to access without pulling in all of pathlib's dependencies)
Add the method/attribute to str? (I assume so, much like __index__() is on int, but I have not seen it explicitly stated so I would rather clarify it)
Makes sense
Expand the C API to have something like PyObject_Path()?
PyUnicode_FromFSPath, perhaps? The return type is well-defined here, so it can be done as an alternate constructor, and the C API counterparts of os.fsdecode and os.fsencode are PyUnicode functions (specifically PyUnicode_DecodeFSDefault and PyUnicode_EncodeFSDefault)
Some people have asked for the pathlib PEP to have a more fleshed out reasoning as to why pathlib doesn't inherit from str. If Antoine doesn't want to do it I can try to distill my blog post into a more succinct paragraph or two and update the PEP myself.
Is this going to require a PEP or if we can agree on the points here are we just going to do it? If we think it requires a PEP I'm willing to write it, but I obviously have no issue if we skip that step either. :)
It's worth summarising in a PEP at least for communications purposes - very easy for folks that don't follow python-dev to miss otherwise. Plus my specific API suggestions are pretty different from Ethan's :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 04/08/2016 02:50 AM, Nick Coghlan wrote:
Built-in? (name is dependent on #1 if we add one)
os.fspath (alongside os.fsencode and os.fsdecode)
I like this better.
Add the method/attribute to str? (I assume so, much like __index__() is on int, but I have not seen it explicitly stated so I would rather clarify it)
Makes sense
What will this do? Return a Path or a str? I don't think we need either.
Expand the C API to have something like PyObject_Path()?
PyUnicode_FromFSPath, perhaps? The return type is well-defined here, so it can be done as an alternate constructor, and the C API counterparts of os.fsdecode and os.fsencode are PyUnicode functions (specifically PyUnicode_DecodeFSDefault and PyUnicode_EncodeFSDefault)
So this will do the same thing as os.fspath() at the C level, yes?
It's worth summarising in a PEP at least for communications purposes - very easy for folks that don't follow python-dev to miss otherwise. Plus my specific API suggestions are pretty different from Ethan's :)
*sigh* Okay. -- ~Ethan~

On Fri, 8 Apr 2016 at 08:33 Ethan Furman <ethan@stoneleaf.us> wrote:
On 04/08/2016 02:50 AM, Nick Coghlan wrote:
Built-in? (name is dependent on #1 if we add one)
os.fspath (alongside os.fsencode and os.fsdecode)
I like this better.
Add the method/attribute to str? (I assume so, much like __index__() is on int, but I have not seen it explicitly stated so I would rather clarify it)
Makes sense
What will this do? Return a Path or a str? I don't think we need either.
When I brought this up it was to return self.
Expand the C API to have something like PyObject_Path()?
PyUnicode_FromFSPath, perhaps? The return type is well-defined here, so it can be done as an alternate constructor, and the C API counterparts of os.fsdecode and os.fsencode are PyUnicode functions (specifically PyUnicode_DecodeFSDefault and PyUnicode_EncodeFSDefault)
So this will do the same thing as os.fspath() at the C level, yes?
Yes.
It's worth summarising in a PEP at least for communications purposes - very easy for folks that don't follow python-dev to miss otherwise. Plus my specific API suggestions are pretty different from Ethan's :)
*sigh* Okay
Chris Angelico and I have been asked by Guido to work together to come up with a proposal after all the discussions are finished and it will most likely be a patch to the pathlib PEP.

On 04/08/2016 08:41 AM, Brett Cannon wrote:
On Fri, 8 Apr 2016 at 08:33 Ethan Furman wrote:
Brett previously queried:
Add the method/attribute to str? (I assume so, much like __index__() is on int, but I have not seen it explicitly stated so I would rather clarify it)
What will this do? Return a Path or a str? I don't think we need either.
When I brought this up it was to return self.
Okay, thanks.
Chris Angelico and I have been asked by Guido to work together to come up with a proposal after all the discussions are finished and it will most likely be a patch to the pathlib PEP.
Cool. I wasn't looking forward to that part. -- ~Ethan~

On Fri, Apr 8, 2016 at 2:50 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 7 April 2016 at 03:26, Brett Cannon <brett@python.org> wrote:
Method or attribute? (changes what kind of one-liner you might use in libraries, but I think historically all protocols have been methods and the serialized string representation might be costly to build)
couldn't it be a property?
Method, as long as there's a helper function somewhere
what has the helper function got to do with whether it's a method or attribute (would we call a property an attribute here?)
Built-in? (name is dependent on #1 if we add one)
os.fspath (alongside os.fsencode and os.fsdecode)
(Putting this in a module low in the dependency stack makes it easy for other modules to access without pulling in all of pathlib's dependencies)
I like that -- though still -0.5 on having one at all -- this is only going to be used by the stdlib and other path-using libraries, not user code -- is that that hard to call obj.__fspath__() ?
Add the method/attribute to str? (I assume so, much like __index__() is on
int, but I have not seen it explicitly stated so I would rather clarify it)
I thought the whole point of all this is that not any old string can be a path! (whereas any int can be an index). Unless we go with Chris A's suggestion that this be a more generic lossless string protocol, rather than just for paths.
It's worth summarising in a PEP at least for communications purposes - very easy for folks that don't follow python-dev to miss otherwise.
I'd say add it to the existing pathlib PEP -- along with the extra discussion of why Path does not inherit from str. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On 04/08/2016 09:04 AM, Chris Barker wrote:
On Fri, Apr 8, 2016 at 2:50 AM, Nick Coghlan wrote:
Method, as long as there's a helper function somewhere
what has the helper function got to do with whether it's a method or attribute (would we call a property an attribute here?)
Built-in? (name is dependent on #1 if we add one)
os.fspath (alongside os.fsencode and os.fsdecode)
[...] this is only going to be used by the stdlib and other path-using libraries, not user code -- is that that hard to call obj.__fspath__() ?
1) user code may call it
2) folks who write libraries are still users ;)
3) using __dunder__s directly is usually poor form.
I thought the whole point of all this is that not any old string can be a path! (whereas any int can be an index). Unless we go with Chris A's suggestion that this be a more generic lossless string protocol, rather than just for paths.
That does seem to be a valid point against str.__fspath__. -- ~Ethan~

On Fri, 8 Apr 2016 at 09:39 Ethan Furman <ethan@stoneleaf.us> wrote:
On 04/08/2016 09:04 AM, Chris Barker wrote:
On Fri, Apr 8, 2016 at 2:50 AM, Nick Coghlan wrote:
Method, as long as there's a helper function somewhere
what has the helper function got to do with whether it's a method or attribute (would we call a property an attribute here?)
Built-in? (name is dependent on #1 if we add one)
os.fspath (alongside os.fsencode and os.fsdecode)
[...] this is only going to be used by the stdlib and other path-using libraries, not user code -- is that that hard to call obj.__fspath__() ?
1) user code may call it
2) folks who write libraries are still users ;)
3) using __dunder__s directly is usually poor form.
I thought the whole point of all this is that not any old string can be a path! (whereas any int can be an index). Unless we go with Chris A's suggestion that this be a more generic lossless string protocol, rather than just for paths.
That does seem to be a valid point against str.__fspath__.
Yep, and I'm expecting we won't want that at this point. The fact that paths need strings for low-level OS stuff is a historical and technical detail, so no need to drag the entire str type into it if we can provide a reasonable helper function (for either the ABC or magic method solution).

On Fri, Apr 8, 2016 at 8:34 PM, Brett Cannon <brett@python.org> wrote:
On Fri, 8 Apr 2016 at 09:39 Ethan Furman <ethan@stoneleaf.us> wrote:
I thought the whole point of all this is that not any old string can be a path! (whereas any int can be an index). Unless we go with Chris A's suggestion that this be a more generic lossless string protocol, rather than just for paths.
That does seem to be a valid point against str.__fspath__.
Yep, and I'm expecting we won't want that at this point. The fact that paths need strings for low-level OS stuff is a historical and technical detail, so no need to drag the entire str type into it if we can provide a reasonable helper function (for either the ABC or magic method solution).
I'm not sure I understand what these points are about. Anyway, disallowing str or bytes as pathnames will break backwards compatibility if done at some point in the future. There's no way around that.
But regarding all this talk of mine about bytes is because it has not been completely clear to me if something can break when converting a bytes path to str. I did originally propose guaranteeing a str, but I am so far only 85% convinced that that does not cause any problems.
I understand that fsencode(fsdecode(bytes_path)) should always be equal to bytes_path. But can some other path operations fail when there are surrogates in the strings? And again, not to forget DirEntry, which may have a byte string path.
Either way, I suppose os.fspath should accept anything that has __fspath__ or is a str or bytes (whether these have the dunder method or not).
Then the options are either to return Union[str, bytes] or to always return str. And if the latter does not cause any problems, I like it way better, and it seems others would do too.
And in that case it would probably be time to deprecate bytes paths on posix too (on Windows, this is already the case). But do we know that converting all paths to str does not cause any problems?
-Koos

On Fri, 8 Apr 2016 at 14:23 Koos Zevenhoven <k7hoven@gmail.com> wrote:
On Fri, 8 Apr 2016 at 09:39 Ethan Furman <ethan@stoneleaf.us> wrote:
I thought the whole point of all this is that not any old string can be a path! (whereas any int can be an index). Unless we go with Chris A's suggestion that this be a more generic lossless string protocol, rather than just for paths.
That does seem to be a valid point against str.__fspath__.
On Fri, Apr 8, 2016 at 8:34 PM, Brett Cannon <brett@python.org> wrote:
Yep, and I'm expecting we won't want that at this point. The fact that paths need strings for low-level OS stuff is a historical and technical detail, so no need to drag the entire str type into it if we can provide a reasonable helper function (for either the ABC or magic method solution).
I'm not sure I understand what these points are about.
It means we most likely won't add a new method to str in regards to this proposal.
Anyway, disallowing str or bytes as pathnames will break backwards compatibility if done at some point in the future. There's no way around that.
No one is proposing disallowing str or bytes for a pre-existing API that supports either. The whole point of this is to make APIs work with strings and pathlib.
But regarding all this talk of mine about bytes is because it has not been completely clear to me if something can break when converting a bytes path to str. I did originally propose guaranteeing a str, but I am so far only 85% convinced that that does not cause any problems.
Depends on your definition of "problem". If you somehow blindly converted a bytes object representing a path to a str without knowing its encoding you will definitely break someone silently (and even os.fsdecode() isn't fool-proof thanks to multiple encodings on a single file system).
I understand that fsencode(fsdecode(bytes_path)) should always be equal to bytes_path. But can some other path operations fail when there are surrogates in the strings? And again, not to forget DirEntry, which may have a byte string path.
At this point no one wants to touch bytes paths. If you need that level of control because of multiple encodings within a single file system then you will probably have to stick with managing bytes paths on your own to get the encoding right. And just because DirEntry supports bytes doesn't mean that any magic method it gains has to carry that forward (it can always raise a TypeError if necessary).
Either way, I suppose os.fspath should accept anything that has __fspath__ or is a str or bytes (whether these have the dunder method or not).
Maybe. I'm not sure if we will want to go down that route of both bytes and str being supported out of the same function as that gets messy quickly. The main reason os.scandir() supports it is so it can be a drop-in replacement for os.listdir(). It really depends on how we choose to structure the function in terms of just doing the right thing for objects that follow the protocol or if we want to introduce some required structure for the resulting path and implement some type guarantees so you have a better idea of what you will be working with after calling the function.
Then the options are either to return Union[str, bytes] or to always return str. And if the latter does not cause any problems, I like it way better, and it seems others would do too.
You don't have to convert byte paths to str, you can simply raise an exception in the face of them.
And in that case it would probably be time to deprecate bytes paths on posix too (on Windows, this is already the case).
Can't do that as Stephen Turnbull will tell you. :) At best we can marginalize the support of bytes-based paths to only low-level APIs exposed through the os package. -Brett

On Sat, Apr 9, 2016 at 12:53 AM, Brett Cannon <brett@python.org> wrote:
On Fri, 8 Apr 2016 at 14:23 Koos Zevenhoven <k7hoven@gmail.com> wrote:
At this point no one wants to touch bytes paths. If you need that level of control because of multiple encodings within a single file system then you will probably have to stick with managing bytes paths on your own to get the encoding right.
What does this mean? I assume you don't mean os.path.* would stop dealing with bytes? And if not, then you seem to mean that os.fspath would do nothing except call .__fspath__(). In that case, I think we should go back to it being an attribute (or property) and a variation of the now very famous idiom getattr(path, '__fspath__', path) and perhaps have os.fspath do exactly that.
And just because DirEntry supports bytes doesn't mean that any magic method it gains has to carry that forward (it can always raise a TypeError if necessary).
No, but what if some code gets pathnames from whatever other places and passes them on to os.scandir. Whenever it happens to get a bytes path, a TypeError gets raised, but only when it picks one of the DirEntry objects and for instance tries to open(...) it. Of course, I'm not sure how common this is.
It really depends on how we choose to structure the function in terms of just doing the right thing for objects that follow the protocol or if we want to introduce some required structure for the resulting path and implement some type guarantees so you have a better idea of what you will be working with after calling the function.
Do you have an example of potential 'required structure'?
Then the options are either to return Union[str, bytes] or to always return str. And if the latter does not cause any problems, I like it way better, and it seems others would do too.
You don't have to convert byte paths to str, you can simply raise an exception in the face of them.
I thought the point was for existing APIs to start supporting path objects, wouldn't raising an exception break the API? -Koos
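The DirEntry behaviour under discussion can be checked directly. This is a small sketch, assuming a Python version that ships os.scandir (3.5 or later); the file name `example.txt` is arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one file so the directory listing is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()

    # The type of DirEntry.path follows the type of the argument to scandir().
    str_entries = list(os.scandir(tmp))
    bytes_entries = list(os.scandir(os.fsencode(tmp)))

    print(type(str_entries[0].path).__name__)
    print(type(bytes_entries[0].path).__name__)
```

This is the case Koos describes: code that passes a bytes directory name through to os.scandir() gets DirEntry objects carrying bytes paths, so a str-only `__fspath__` on DirEntry would have to either decode them or raise.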

On 04/08/2016 04:05 PM, Koos Zevenhoven wrote:
On Sat, Apr 9, 2016 at 12:53 AM, Brett Cannon wrote:
On Fri, 8 Apr 2016 at 14:23 Koos Zevenhoven wrote:
At this point no one wants to touch bytes paths. If you need that level of control because of multiple encodings within a single file system then you will probably have to stick with managing bytes paths on your own to get the encoding right.
What does this mean? I assume you don't mean os.path.* would stop dealing with bytes?
No, it does not mean that. It means the stuff in place won't change, but the stuff we're adding now to integrate with Path will only support str (which is one reason why os.path isn't going to die).
And if not, then you seem to mean that os.fspath would do nothing except call .__fspath__().
Fair point. So it should be something like this:
    def fspath(thing):
        # look for path attribute
        string = getattr(thing, '__fspath__', None)
        if string is not None:
            return string
        # not found, do we have a str or bytes object?
        if isinstance(thing, (str, bytes)):
            return thing
        raise TypeError('`thing` must implement the __fspath__ protocol or be an instance of str or bytes')
And just because DirEntry supports bytes doesn't mean that any magic method it gains has to carry that forward (it can always raise a TypeError if necessary).
No, but what if some code gets pathnames from whatever other places and passes them on to os.scandir. Whenever it happens to get a bytes path, a TypeError gets raised, but only when it picks one of the DirEntry objects and for instance tries to open(...) it. Of course, I'm not sure how common this is.
Yeah, I don't think this is a good idea. Given that fspath() should be able to return bytes if bytes are passed in, DirEntry's __fspath__ could return bytes to no ill effect. I realize this may not be ideal, but throwing bytes to the wind is going to bite us in the end. After all, the idea is to make these things work with the stdlib, and the stdlib accepts bytes for path strings. -- ~Ethan~

Ethan Furman writes:
It means the stuff in place won't change, but the stuff we're adding now to integrate with Path will only support str (which is one reason why os.path isn't going to die).
I don't think this is a reason for keeping os.path. (Backward compatibility with existing code is sufficient, of course.) Support of str for all file names is provided by PEP 383. ISTM there's no big loss to using PEP 383's 'surrogateescape' handler to allow un-decode-able filenames in pathlib.Path: they're very rare. AFAIK pathlib doesn't care about surrogates -- after all, they're entirely "consenting adults" stuff. Of course that detracts a bit from the attractiveness of pathlib.Path vs. os.path or bytes methods, but only for a use case most people won't encounter in practice. We continue to support bytes at the os/io/open level for the same reasons you added formatting back to bytes: there are times when it's at least as natural to work with bytes as str (eg, when the path is passed around without manipulation) and more convenient (eg, you don't have to deal with encodings and UnicodeError handling).
After all, the idea is to make these things work with the stdlib, and the stdlib accepts bytes for path strings.
I don't see a problem. In dealing with legacy data (archives that include paths, such as .zips and .isos) we may find un-decode-able paths, or paths that are decode-able but by undetermined encoding, for a while to come (decades). For those, the bytes interfaces are preferable to unlovely expedients like decoding as 'iso8859-1'. But those are specialized use cases. Sane people dealing with current file systems won't need bytes in pathlib, and most "out of bounds" uses for pathlib I can think of in my own experience will be able to use surrogateescape.
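The PEP 383 round trip mentioned in this exchange can be sketched without touching the file system. The example below uses the 'surrogateescape' error handler explicitly with UTF-8 so that it behaves the same on every platform; the byte string `raw` is an invented file name, and os.fsdecode()/os.fsencode() may use a different encoding or handler depending on the system.

```python
raw = b"report-\xe9.txt"   # Latin-1 encoded name; not valid UTF-8

# Strict decoding fails on the stray byte.
try:
    raw.decode("utf-8")
except UnicodeDecodeError:
    strict_decode_ok = False
else:
    strict_decode_ok = True

# PEP 383: each undecodable byte becomes a lone surrogate, and encoding with
# the same handler restores the original bytes exactly.
text = raw.decode("utf-8", "surrogateescape")
round_tripped = text.encode("utf-8", "surrogateescape")

# The escaped string is only safe with the matching handler: a strict
# encode of the lone surrogate is rejected.
try:
    text.encode("utf-8")
except UnicodeEncodeError:
    strict_encode_ok = False
else:
    strict_encode_ok = True

print(strict_decode_ok, strict_encode_ok)
print(ascii(text))
print(round_tripped == raw)
```

The last check bears on Koos's question about surrogates: the round trip itself is lossless, but any operation that encodes such a string strictly (printing it to a UTF-8 terminal, for instance) can still fail.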

On Fri, 8 Apr 2016 at 09:05 Chris Barker <chris.barker@noaa.gov> wrote:
On Fri, Apr 8, 2016 at 2:50 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 7 April 2016 at 03:26, Brett Cannon <brett@python.org> wrote:
Method or attribute? (changes what kind of one-liner you might use in libraries, but I think historically all protocols have been methods and the serialized string representation might be costly to build)
couldn't it be a property?
A property is a method pretending to be an attribute, so yes. :)
Method, as long as there's a helper function somewhere
what has the helper function got to do with whether it's a method or attribute (would we call a property an attribute here?)
Yes, a property is an attribute in this instance. And it somewhat tweaks how simple of a one-liner is needed which in turn makes the function either nearly redundant or helpful.
With an attribute:
    getattr(path, '__ospath__', path)
With a method:
    path.__ospath__() if hasattr(path, '__ospath__') else path
Built-in? (name is dependent on #1 if we add one)
os.fspath (alongside os.fsencode and os.fsdecode)
(Putting this in a module low in the dependency stack makes it easy for other modules to access without pulling in all of pathlib's dependencies)
I like that -- though still -0.5 on having one at all -- this is only going to be used by the stdlib and other path-using libraries, not user code -- is that that hard to call obj.__fspath__() ?
With a function we can add some type checking so that you know you are getting back a string and not something else like a file descriptor int or something.
Add the method/attribute to str? (I assume so, much like __index__() is on
int, but I have not seen it explicitly stated so I would rather clarify it)
I thought the whole point of all this is that not any old string can be a path! (whereas any int can be an index). Unless we go with Chris A's suggestion that this be a more generic lossless string protocol, rather than just for paths.
The whole point is to not treat a path object like any old string. We still have to support a string someone created that is a valid path. Remember, what we're trying to avoid is people simply doing `str(path)` everywhere since that works with e.g. None.
It's worth summarising in a PEP at least for communications purposes - very easy for folks that don't follow python-dev to miss otherwise.
I'd say add it to the existing pathlib PEP -- along with the extra discussion of why Path does not inherit from str.
That's the plan. -Brett
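The two arguments in this message (a helper function can type-check its result, and str(path) is too permissive) can be put side by side. This is a rough sketch under stated assumptions, not the API being designed in the thread: `fspath` here is a hypothetical str-only helper, it assumes the method form of `__fspath__`, and `ToyPath` is a stand-in class invented for the example.

```python
def fspath(obj):
    """Hypothetical helper: return a str path or raise TypeError."""
    if isinstance(obj, str):
        return obj
    # Look the method up on the type, as is done for other special methods.
    method = getattr(type(obj), "__fspath__", None)
    if method is None:
        raise TypeError("expected str or an object with __fspath__, got "
                        + type(obj).__name__)
    result = method(obj)
    # Type-check the result so callers never get, say, a file descriptor int.
    if not isinstance(result, str):
        raise TypeError("__fspath__ returned " + type(result).__name__
                        + ", expected str")
    return result


class ToyPath:
    """Stand-in path class, for illustration only."""

    def __init__(self, text):
        self._text = text

    def __fspath__(self):
        return self._text


print(str(None))                    # str() accepts anything, including None
print(fspath(ToyPath("/tmp/x")))    # path-like objects are converted
try:
    fspath(None)                    # non-paths are rejected up front
except TypeError as exc:
    print("rejected:", exc)
```

Whether bytes should also be accepted, and whether str itself should grow the method, are the open questions discussed elsewhere in the thread; this sketch sidesteps both by special-casing str only.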

Please write a new PEP. The topic appears to have been discussed for many months by many different people on different mailing lists. A PEP is a good standard way to reach a decision, and it has become clear that a decision must be made for pathlib. Victor

I like __fspath__ because it looks like os.fsencode() and os.fsdecode(). Please no builtin function, we have enough of them, but make sure that the __fspath__ is accepted in all functions expecting a filename. If you consider that a function would make your change simpler, I suggest adding os.fspath():
    if isinstance(obj, str):
        return obj
    try:
        return obj.__fspath__
    except AttributeError:
        raise TypeError(...)
Victor

* +1 for __path__, __fspath__ (though I don't know what each does)
* why not Text(basestring / bytestring) and pathlib.Path(Text)?
* are there examples of cases where this cannot be?
* if not, +1 for subclassing str/Text
* where are the examples of method collisions between the str interface and the pathlib.Path interface?
* str.__div__ is nonsensical
* pathlib.Path.__div__ is super-useful
On Apr 6, 2016 10:10 AM, "Ethan Furman" <ethan@stoneleaf.us> wrote:
On 04/05/2016 11:57 PM, Nick Coghlan wrote:
On 6 April 2016 at 16:53, Nathaniel Smith <njs@pobox.com> wrote:
On Tue, Apr 5, 2016 at 11:29 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I'd missed the existing precedent in DirEntry.path, so simply taking
that and running with it sounds good to me.
This makes me twitch slightly, because NumPy has had a whole set of problems due to the ancient and minimally-considered decision to assume a bunch of ad hoc non-namespaced method names fulfilled some protocol -- like all .sum methods will have a signature that's compatible with numpy's, and if an object has a .log method then surely that computes the logarithm (what else in computing could "log" possibly refer to?), etc. This experience may or may not be relevant, I'm not sure -- sometimes these kinds of twitches are good guides to intuition, and sometimes they are just knee-jerk responses to an old and irrelevant problem :-)
But you might want to at least think about how common it might be to have existing objects with unrelated attributes that happen to be called "path", and the bizarro problems that might be caused if someone accidentally passes one of them to a function that expects all .path attributes to be instances of this new protocol.
sys.path, for example.
That's why I'd actually prefer the implicit conversion protocol to be the more explicitly named "__fspath__", with suitable "__fspath__ = path" assignments added to DirEntry and pathlib. However, I'm also not offering to actually *do* the work here, and the casting vote goes to the folks pursuing the implementation effort.
If we decide upon __fspath__ (or __path__) I will do the work on pathlib and scandir to add those attributes.
-- ~Ethan~

On Wed, 6 Apr 2016 at 10:41 Wes Turner <wes.turner@gmail.com> wrote:
* +1 for __path__, __fspath__ (though I don't know what each does)
Returns a string representing a file system path.
* why not Text(basestring / bytestring) and pathlib.Path(Text)?
See the points about next() vs __next__()
* are there examples of cases where this cannot be?
I don't understand what you think "cannot be".
* if not, +1 for subclassing str/Text
* where are the examples of method collisions between the str interface and the pathlib.Path interface?
There aren't any and that's partially why some people wanted the str subclass to begin with. Please consider this thread a str-subclass-free zone. This line of discussion is to flesh out the proposal for a path protocol as a proposal against subclassing str, not to settle the whole discussion outright. If you want to continue to debate the subclassing-str side of this please use the other thread. -Brett
* str.__div__ is nonsensical
* pathlib.Path.__div__ is super-useful

On Apr 6, 2016 12:47 PM, "Brett Cannon" <brett@python.org> wrote:
On Wed, 6 Apr 2016 at 10:41 Wes Turner <wes.turner@gmail.com> wrote:
* +1 for __path__, __fspath__ (though I don't know what each does)
Returns a string representing a file system path.
Why two methods? __uripath__?
(scheme, host (port), path, query, fragment) so, not __uripath__
what would be the difference between __path__ and __fspath__?
* why not Text(basestring / bytestring) and pathlib.Path(Text)?
See the points about next() vs __next__()
Path(b'123') / u'456'
similarly, Path(b'123') / UTF8 / UTF16
* are there examples of cases where this cannot be?
I don't understand what you think "cannot be".
What one recommends (path.py(str) / str(pathlib.Path()) + getattr) is distinct from what any given programmer chooses to do with their code.
* if not, +1 for subclassing str/Text
* where are the examples of method collisions between the str interface and the pathlib.Path interface?
There aren't any and that's partially why some people wanted the str subclass to begin with.
Please consider this thread a str-subclass-free zone. This line of discussion is to flesh out the proposal for a path protocol as a proposal against subclassing str, not to settle the whole discussion outright. If you want to continue to debate the subclassing-str side of this please use the other thread.
this seems to be a sudden, arbitrary distinction.
are these proposals necessarily disjoint?
so, adding getattr(path, '__path__', path) to stdlib and other code is going to prevent which edge cases (before os.path.normpath()* anyway) for which benefit?
when do I do getattr(path, '__fspath__', path)?
-Brett
* str.__div__ is nonsensical
* pathlib.Path.__div__ is super-useful
ah, not .__add__() but .append()
I suppose the request here is for the cases which would be prevented (that we need to learn to look for)

On Wed, 6 Apr 2016 at 14:03 Wes Turner <wes.turner@gmail.com> wrote:
On Apr 6, 2016 12:47 PM, "Brett Cannon" <brett@python.org> wrote:
On Wed, 6 Apr 2016 at 10:41 Wes Turner <wes.turner@gmail.com> wrote:
* +1 for __path__, __fspath__ (though I don't know what each does)
Returns a string representing a file system path.
Why two methods? __uripath__?
(scheme, host (port), path, query, fragment) so, not __uripath__
what would be the difference between __path__ and __fspath__?
There is no difference; we're trying to choose a name.
* why not Text(basestring / bytestring) and pathlib.Path(Text)?
See the points about next() vs __next__()
Path(b'123') / u'456'
similarly, Path(b'123') / UTF8 / UTF16
As other people pointed out on the other thread, while bytes paths do exist, we don't want to promote them as they are a mess to work with. -Brett
* are there examples of cases where this cannot be?
I don't understand what you think "cannot be".
What one recommends (path.py(str) / str(pathlib.Path()) + getattr) is distinct from what any given programmer chooses to do with their code.
* if not, +1 for subclassing str/Text
* where are the examples of method collisions between the str interface and the pathlib.Path interface?
There aren't any and that's partially why some people wanted the str subclass to begin with.
Please consider this thread a str-subclass-free zone. This line of discussion is to flesh out the proposal for a path protocol as a proposal against subclassing str, not to settle the whole discussion outright. If you want to continue to debate the subclassing-str side of this please use the other thread.
this seems to be a sudden, arbitrary distinction.
are these proposals necessarily disjoint?
so, adding getattr(path, '__path__', path) to stdlib and other code is going to prevent which edge cases (before os.path.normpath()* anyway) for which benefit?
when do I do getattr(path, '__fspath__', path)?
-Brett
* str.__div__ is nonsensical
* pathlib.Path.__div__ is super-useful
ah, not .__add__() but .append()
I suppose the request here is for the cases which would be prevented (that we need to learn to look for)

On 04/06/2016 08:53 AM, Nathaniel Smith wrote:
On Tue, Apr 5, 2016 at 11:29 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 6 April 2016 at 15:57, Serhiy Storchaka <storchaka@gmail.com> wrote:
On 06.04.16 05:44, Nick Coghlan wrote:
The most promising option for that is probably "getattr(path, 'path', path)", since the "path" attribute is being added to pathlib, and the given idiom can be readily adopted in Python 2/3 compatible code (since normal strings and any other object without a "path" attribute are passed through unchanged). Alternatively, since it's a protocol, double-underscores on the property name may be appropriate (i.e. "getattr(path, '__path__', path)")
This was already discussed. Current conclusion is using the "path" attribute. See http://bugs.python.org/issue22570 .
I'd missed the existing precedent in DirEntry.path, so simply taking that and running with it sounds good to me.
This makes me twitch slightly, because NumPy has had a whole set of problems due to the ancient and minimally-considered decision to assume a bunch of ad hoc non-namespaced method names fulfilled some protocol -- like all .sum methods will have a signature that's compatible with numpy's, and if an object has a .log method then surely that computes the logarithm (what else in computing could "log" possibly refer to?), etc. This experience may or may not be relevant, I'm not sure -- sometimes these kinds of twitches are good guides to intuition, and sometimes they are just knee-jerk responses to an old and irrelevant problem :-). But you might want to at least think about how common it might be to have existing objects with unrelated attributes that happen to be called "path", and the bizarro problems that might be caused if someone accidentally passes one of them to a function that expects all .path attributes to be instances of this new protocol.
-n
Python was in a similar situation with the .next method on iterators, which changed to __next__ in Python 3. PEP 3114 (which explains this change) says:
Code that nowhere contains an explicit call to a next method can nonetheless be silently affected by the presence of such a method. Therefore, this PEP proposes that iterators should have a __next__ method instead of a next method (with no change in semantics).
How well does that apply to path/__path__? That PEP also introduced the next() builtin. This suggests that a protocol with __path__/__fspath__ would need a corresponding path()/fspath() builtin.
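To make the next()/__next__ analogy concrete, here is a minimal sketch of what a helper paired with a dunder attribute might look like. The function name `fspath`, the `Entry` class, and the assumption that `__fspath__` is a plain attribute holding a str are all hypothetical; nothing in this thread has settled on a name or a shape for the protocol.

```python
class Entry:
    """Hypothetical path-like object; not a real stdlib class."""

    def __init__(self, location):
        # Assumption: the protocol attribute simply holds a str.
        self.__fspath__ = location


def fspath(obj):
    """Hypothetical helper: to __fspath__ what next() is to __next__."""
    if isinstance(obj, str):
        # Plain strings pass through unchanged.
        return obj
    value = getattr(obj, "__fspath__", None)
    if not isinstance(value, str):
        # Unlike str(obj), refuse anything that does not opt in.
        raise TypeError(
            "expected str or an object with a str __fspath__, got "
            + type(obj).__name__
        )
    return value


plain = fspath("/tmp/data.txt")
rich = fspath(Entry("/tmp/data.txt"))
```

The point of the sketch is only that the helper can reject objects that never opted in, which a bare str() call cannot do.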

On Wed, Apr 06, 2016 at 11:30:32AM +0200, Petr Viktorin wrote:
Python was in a similar situation with the .next method on iterators, which changed to __next__ in Python 3. PEP 3114 (which explains this change) says:
Code that nowhere contains an explicit call to a next method can nonetheless be silently affected by the presence of such a method. Therefore, this PEP proposes that iterators should have a __next__ method instead of a next method (with no change in semantics).
How well does that apply to path/__path__?
I think it's potentially the same. Possibly there are fewer existing uses of "obj.path" out there which conflict with this use, but there's at least one in the std lib: sys.path.
That PEP also introduced the next() builtin. This suggests that a protocol with __path__/__fspath__ would need a corresponding path()/fspath() builtin.
Not necessarily. Take a look at (say) dir(object()) and you'll see a few dunders that don't correspond to built-ins:
__reduce__ and __reduce_ex__ are used by pickle;
__sizeof__ is used by sys.getsizeof;
__subclasshook__ is used by the ABC system;
Another example is __trunc__ used by math.trunc().
So any such fspath function should stand on its own as a useful feature, not just because there's a dunder method __fspath__.
-- Steve
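A short illustration of two of the dunders named above, each reached through a module-level function instead of a builtin. The `Reading` class and the value returned from `__sizeof__` are made up for the example.

```python
import math
import sys


class Reading:
    """Toy class for illustration only."""

    def __init__(self, value):
        self.value = value

    def __trunc__(self):
        # Called by math.trunc(); there is no trunc() builtin.
        return int(self.value)

    def __sizeof__(self):
        # Consulted by sys.getsizeof(); there is no sizeof() builtin.
        # The number is arbitrary, chosen only to be recognizable.
        return 64


r = Reading(3.9)
truncated = math.trunc(r)   # dispatches to Reading.__trunc__
size = sys.getsizeof(r)     # __sizeof__ result plus any GC overhead
```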

On Apr 6, 2016 07:44, "Steven D'Aprano" <steve@pearwood.info> wrote:
On Wed, Apr 06, 2016 at 11:30:32AM +0200, Petr Viktorin wrote:
Python was in a similar situation with the .next method on iterators, which changed to __next__ in Python 3. PEP 3114 (which explains this change) says:
Code that nowhere contains an explicit call to a next method can nonetheless be silently affected by the presence of such a method. Therefore, this PEP proposes that iterators should have a __next__ method instead of a next method (with no change in semantics).
How well does that apply to path/__path__?
I think it's potentially the same. Possibly there are fewer existing uses of "obj.path" out there which conflict with this use, but there's at least one in the std lib: sys.path.
That PEP also introduced the next() builtin. This suggests that a protocol with __path__/__fspath__ would need a corresponding path()/fspath() builtin.
Not necessarily. Take a look at (say) dir(object()) and you'll see a few dunders that don't correspond to built-ins:
__reduce__ and __reduce_ex__ are used by pickle; __sizeof__ is used by sys.getsizeof; __subclasshook__ is used by the ABC system;
Another example is __trunc__ used by math.trunc().
So any such fspath function should stand on its own as a useful feature, not just because there's a dunder method __fspath__.
An even more precise analogy is provided by __index__, whose semantics are to provide safe casting to integer (the name is a historical accident), as opposed to __int__'s tendency to cast things to integer willy-nilly, including things that really shouldn't be silently accepted as integers. Basically __index__ is to __int__ as __(fs)path__ would be to __str__. There's an operator.index but no builtins.index. -n
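The difference between the strict and the permissive conversion can be seen directly. This is a small sketch; the `Slot` class is hypothetical and exists only to show an object opting in through `__index__`.

```python
import operator


class Slot:
    """Toy integer-like type for illustration."""

    def __init__(self, n):
        self.n = n

    def __index__(self):
        # Opt in to being treated as a true integer.
        return self.n


items = ["a", "b", "c"]
picked = items[Slot(1)]            # list indexing goes through __index__
strict = operator.index(Slot(2))   # operator.index, not a builtin

loose = int(3.7)                   # int() silently truncates a float
try:
    operator.index(3.7)            # floats do not define __index__
    rejected = False
except TypeError:
    rejected = True
```

By the same reasoning, a path protocol would accept only objects that explicitly declare themselves to be paths, whereas str() accepts nearly anything.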

On Wednesday, April 06, 2016 07:39, Steven D'Aprano wrote:
How well does that apply to path/__path__?
I think it's potentially the same. Possibly there are fewer existing uses of "obj.path" out there which conflict with this use, but there's at least one in the std lib: sys.path.
Somewhat ironically, also os.
>>> import os.path
>>> getattr(os, "path")
<module 'ntpath' from 'C:\\Python35\\lib\\ntpath.py'>

On Tue, Apr 05, 2016 at 11:53:05PM -0700, Nathaniel Smith wrote:
This makes me twitch slightly, because NumPy has had a whole set of problems due to the ancient and minimally-considered decision to assume a bunch of ad hoc non-namespaced method names fulfilled some protocol -- like all .sum methods will have a signature that's compatible with numpy's, and if an object has a .log method then surely that computes the logarithm (what else in computing could "log" possibly refer to?), etc.
It's the down-side of duck-typing. It's all well and good accepting anything with a quack method, but not everything is that straightforward:
    artist.draw()
    gunslinger.draw()
I think that file system paths are important enough, and tricky enough, to justify their own protocol. I like Nick's suggestion of a special dunder method for converting path-like objects into paths, without the problems that str(x) has, or the risk of assuming that anything with a .path attribute refers to a file system path. (maze.path, garden.path, career.path perhaps?)
-- Steve
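The hazard is easy to reproduce with the `getattr(path, 'path', path)` idiom discussed earlier in the thread. In this sketch the `Maze` class and the `to_path` helper are hypothetical; `sys.path` is the real stdlib attribute already mentioned above.

```python
import sys


class Maze:
    """Hypothetical class whose .path has nothing to do with the file system."""

    def __init__(self):
        self.path = ["north", "east", "east"]


def to_path(obj):
    # The idiom under discussion: trust any attribute named "path".
    return getattr(obj, "path", obj)


from_str = to_path("/tmp/data.txt")   # str has no .path, passes through
from_maze = to_path(Maze())           # silently yields a list of moves
from_sys = to_path(sys)               # silently yields the import search path
```

Neither of the last two calls raises; the wrong object only surfaces later, wherever the result is finally used as a file name.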

On 04/05/2016 11:53 PM, Nathaniel Smith wrote:
On Tue, Apr 5, 2016 at 11:29 PM, Nick Coghlan wrote:
I'd missed the existing precedent in DirEntry.path, so simply taking that and running with it sounds good to me.
This makes me twitch slightly, because NumPy has had a whole set of problems due to the ancient and minimally-considered decision to assume a bunch of ad hoc non-namespaced method names fulfilled some protocol -- like all .sum methods will have a signature that's compatible with numpy's, and if an object has a .log method then surely that computes the logarithm (what else in computing could "log" possibly refer to?), etc. This experience may or may not be relevant, I'm not sure -- sometimes these kinds of twitches are good guides to intuition, and sometimes they are just knee-jerk responses to an old and irrelevant problem :-). But you might want to at least think about how common it might be to have existing objects with unrelated attributes that happen to be called "path", and the bizarro problems that might be caused if someone accidentally passes one of them to a function that expects all .path attributes to be instances of this new protocol.
A very good point, thank you. -- ~Ethan~

Nick Coghlan wrote:
I'd missed the existing precedent in DirEntry.path, so simply taking that and running with it sounds good to me.
It's not quite the same thing, though. DirEntry.path takes something that is not a path (a DirEntry instance) and gives you a path representing it, so the name makes sense. But a Path instance is already "a path", so Path.path is weird. Path.str would make more sense. -- Greg

On Apr 06, 2016, at 12:44 PM, Nick Coghlan wrote:
The next challenge would then be to make a list of APIs to be updated for 3.6 to implicitly accept "rich path" objects via the agreed convention, with pathlib.PurePath used as a test class:
* open()
* codecs.open() (et al)
* io.*
* os.path.*
* other os functions
* shutil.*
* tempfile.*
* shelve.*
* csv.*
Aside from the name of the attribute (though I'm partial to __path__), I think this would go a long way toward making path objects nicer to work with. And right, it doesn't have to be 100% but this would be a big improvement. Cheers, -Barry

On 6 April 2016 at 00:45, Guido van Rossum <guido@python.org> wrote:
This does sound like it's the crucial issue, and it is worth writing up clearly the pros and cons. Let's draft those lists in a thread (this one's fine) and then add them to the PEP. We can then decide to:
- keep the status quo
- change PurePath to inherit from str
- decide it's never going to be settled and kill pathlib.py
(And yes, I'm dead serious about the latter, rather Solomonic option.)
By the way, even if there's no solution that satisfies everyone to the "inherit from str" question, I'd still be unhappy if pathlib disappeared from the stdlib. It's useful for quick admin scripts that don't justify an external dependency. Those typically do quite a bit of path manipulation, and as such benefit from the improved API of pathlib over os.path.
+1 on making (and documenting) a final decision on the "inherit from str" question
-1 on removing pathlib just because that decision might not satisfy everyone
Paul

On Tue, 5 Apr 2016 at 15:55 Guido van Rossum <guido@python.org> wrote:
It's been provisional since 3.4. I think if it is still there in 3.6.0 it should be considered no longer provisional. But this may indeed be a test case for the ultimate fate of provisional modules -- should we remove it?
I have to admit I got tired of the discussions and muted them all.
:) I figured. I was close myself until I decided to be the "not inheriting from str is a sane decision" camp because people weren't understanding where the design decision probably came from, hence http://www.snarky.ca/why-pathlib-path-doesn-t-inherit-from-str .
Personally I am not worried about the light use (I always expected it would take a long time to get adoption)
Ditto. My expectation/hope is that once we stop having it be provisional and we start using it in the stdlib then usage will pick up, especially if libraries pick up the `getattr(path, 'path', path)` idiom as an easy transition technique until they decide to drop support for str-based paths. The main motivation of this email is actually to have newcomers to the sprints at PyCon US sprint on adding support for pathlib (after we add "path-like object" to the glossary to say something like "a `str` object or an object that has a `path` attribute that itself is a `str`").
but I am worried about the hostility towards the module. My last/only comment in the discussion was about there possibly being a dichotomy between people who use Python for scripting and those who use it to write more substantial programs (I'm trying not to judge one group more important than another -- I'm just observing there seem to be these two groups). But I didn't stick around long enough to watch for responses to this idea.
Nope, no response (as Alexander pointed out).
Would making it inherit from str cause most hostility to disappear?
Probably. Most people were upset with pathlib because they couldn't use it immediately with all of the third-party libraries out there on top of the stdlib because adoption has been so low. Now if we make a concerted effort to accept pathlib in the stdlib then this may be the kick in the pants that it takes to start getting people to accept it externally and the transition band-aid of inheriting from str may not be needed.
To me it seems to basically be a question of whether people can be patient during a transition and embrace pathlib over time or if they will simply refuse to add support in libraries and refuse to use `getattr(path, 'path', path)` or `str(path)` in the mean time. Personally, if we can wait out the Python 3 transition I have no issue waiting on a transition like this that has no backward-compatibility issues and has a one-liner solution for adding shallow support (and thus is ripe for quick patches to projects).
After the whole str thing the only other major topic was coming up with some easier way to produce pathlib.Path instances (e.g. the p-string suggestion). Nothing really came of those discussions that seemed concrete or reached consensus, though (I think that may have been where your scripting/substantial programming comment came from).
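As a sketch of the "one-liner solution for adding shallow support" mentioned above, a library function could normalise its argument on entry and leave the rest of its body untouched. The `RichPath` class and the `file_extension` function are hypothetical names invented for this example; only `os.path.splitext` is a real stdlib call.

```python
import os.path


class RichPath:
    """Hypothetical stand-in for any object exposing a str `path` attribute."""

    def __init__(self, path):
        self.path = path


def file_extension(path):
    # The one-line shim: unwrap rich path objects, pass strings through.
    path = getattr(path, "path", path)
    # Everything below is the pre-existing, str-based implementation.
    return os.path.splitext(path)[1]


ext_from_str = file_extension("conf/settings.cfg")
ext_from_obj = file_extension(RichPath("conf/settings.cfg"))
```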
I'm sure there was a discussion about this when PEP 428 was originally proposed, and I recall I was strongly in the camp of "it should not inherit from str", but unfortunately the PEP has no mention of this discussion or even the stated reason.
https://www.python.org/dev/peps/pep-0428/#no-confusion-with-builtins is the best you get in the PEP. -Brett
--Guido

I haven't really been following this discussion, but a couple of comments...
On Tue, Apr 05, 2016 at 11:47:32PM +0000, Brett Cannon wrote:
http://www.snarky.ca/why-pathlib-path-doesn-t-inherit-from-str
Nice write-up, thanks. [...]
To me it seems to basically be a question of whether people can be patient during a transition and embrace pathlib over time or if they will simply refuse to add support in libraries and refuse to use `getattr(path, 'path', path)` or `str(path)` in the mean time.
Wait, what? Is that what the whole fuss is about? That some people refuse to call str(path) when passing a path object to a function that expects a string? Really? That's it? The mind boggles. -- Steve

On 04/05/2016 07:40 PM, Steven D'Aprano wrote:
On Tue, Apr 05, 2016 at 11:47:32PM +0000, Brett Cannon wrote:
To me it seems to basically be a question of whether people can be patient during a transition and embrace pathlib over time or if they will simply refuse to add support in libraries and refuse to use `getattr(path, 'path', path)` or `str(path)` in the mean time.
Wait, what? Is that what the whole fuss is about? That some people refuse to call str(path) when passing a path object to a function that expects a string?
No, Stephen, that is not what this is about. This is about the ugliness of code with str(path) this and str(path) that and let's not forget the Path(this_returned_string) and Path(that_returned_string), not to mention the frustrations of forgetting to cast a str to Path or a Path to str. It's about the horror of boiler-plate infecting our otherwise beautiful Python code. -- ~Ethan~

Ethan Furman writes:
No, Stephen, that is not what this is about.
Wrong Steven. Spelling matters in email too. And he's more worth paying attention to than I am. But I'll have my say anyway. ;-)
This is about the ugliness of code with str(path) this and str(path) that
-1 Not good enough. I wouldn't do it that often that "ugly" overrides the reasoning Brett presented, and if you do, I bet one or two personal helpers would clean up 95% of your cases. But see Nick's comment that "str(var)" is too permissive. I'll have to think about that, but my first take is he's right, and we need to do something about making use of Path more straightforward within the stdlib. Whatever that is, preferably would make life easier for 3rd party usage too, of course. Is error-checking within Path sufficiently robust in the light of "too permissive"? (I don't know exactly what I mean by that, but something like if "str(var_purporting_to_be_Path)" is too permissive, are we sure that "str(really_is_Path_var)" is "safe"? Apparently we haven't had a lot of beta testing.)
and let's not forget the Path(this_returned_string) and Path(that_returned_string),
But we don't object to (de)serializing dicts to (from) str (as JSON or pickle). I think Path vs. string is similarly different to justify saying so (especially when treating user input). Note, too, that based on discussion in that thread it seems likely that Path is likely to be inappropriate as an internal representation of URL.RFC3986.Path. Thus, strings that look like paths (as strings) actually will have multiple internal representations, similarly to the way that a dict can have multiple serializations. If representation transformation is not invertible, EIBTI says we need the "boilerplate". YMMV, but that's my take.

On 04/05/2016 10:40 PM, Stephen J. Turnbull wrote:
Ethan Furman writes:
No, Stephen, that is not what this is about.
Wrong Steven. Spelling matters in email too.
Yes, it absolutely does. My apologies.
-1 Not good enough. I wouldn't do it that often that "ugly" overrides the reasoning Brett presented [...]
But we don't object to (de)serializing dicts to (from) str (as JSON or pickle).
Amusingly enough, I don't have to deal with serializing dicts. :) However, as a comparison: imagine you had to transform your dict to JSON every time some function wanted a dict as input. And had to transform returned JSON strings into dicts.
I think Path vs. string is similarly different to justify saying so (especially when treating user input). [...] Thus, strings that look like paths (as strings) actually will have multiple internal representations, similarly to the way that a dict can have multiple serializations.
I don't follow. When dealing with the file system one passes a string* representing the path of the object one wants -- pretty much the same string that was passed in to Path.
-- ~Ethan~
* or bytes, but the same sameness, really.

Brett Cannon <brett <at> python.org> writes:
:) I figured. I was close myself until I decided to be the "not inheriting from str is a sane decision" camp because people weren't understanding where the design decision probably came from, hence http://www.snarky.ca/why-pathlib-path-doesn-t-inherit-from-str
That's a good write-up, thank you. Paths don't have to inherit str any more than IP addresses or any other thing that happens to be passed as a string in traditional APIs.
On a concrete point, inheriting str would make the API a horrible, confusing, dangerous mess mixing regular string semantics (concatenation with +, for example, or indexing) with path-specific semantics and various grey areas (should .split() have path semantics or str semantics? what is the rule and how are people supposed to remember it?).
(of course, for PHP or Javascript programmers it may not sound like a problem. Let "adding" two IP addresses return the concatenation of their string representations...)
Regards
Antoine.
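The grey areas are easy to see with a toy str subclass. This is a sketch only: `StrPath` is a hypothetical class written for this example and is not how pathlib is implemented.

```python
class StrPath(str):
    """Hypothetical str-based path; not how pathlib is implemented."""

    def __truediv__(self, other):
        # Path-style joining with "/".
        return StrPath(self.rstrip("/") + "/" + str(other))


p = StrPath("/home/user") / "notes.txt"

joined = p + ".bak"    # inherited str concatenation; the result is a plain str
parts = p.split("/")   # inherited str.split, including the leading empty string
first = p[0]           # inherited indexing returns a character, not a component
```

Every inherited str method keeps working, so each one needs a decision about whether it should keep str semantics, gain path semantics, or be disabled.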

On 04/06/2016 02:41 AM, Antoine Pitrou wrote:
On a concrete point, inheriting str would make the API a horrible, confusing, dangerous mess mixing regular string semantics (concatenation with +, for example, or indexing) with path-specific semantics and various grey areas (should .split() have path semantics or str semantics? what is the rule and how are people supposed to remember it?).
While I agree in principle..
(of course, for PHP or Javascript programmers it may not sound like a problem. Let "adding" two IP addresses return the concatenation of their string representations...)
Like if I had a subnet of '192.168' and a host of '.11.16' and adding them together gave you '192.168.11.16'? (yeah, a bit weak)
Or, more appropriately: a path of '/home/ethan/mystuff' + '_bak' so I can make a copy? Actually, that would be
    stuff = pathlib.Path('/home/ethan/mystuff')  # no issue here
    backup_stuff = stuff.with_name(stuff.name + '_bak')  # eww
Sure, you can make the argument that `with_suffix('.bak')` is cleaner, but it is not up to the stdlib to micromanage my code.
Oh, and I do not consort with PHP, and only do so with Javascript when forced.
-- ~Ethan~
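For reference, the two spellings compared above behave as follows. The sketch uses `PurePosixPath` so that it gives the same result on any platform; the variable names are made up for the example.

```python
from pathlib import PurePosixPath

stuff = PurePosixPath("/home/ethan/mystuff")

# Rename the final component by hand.
backup_by_name = stuff.with_name(stuff.name + "_bak")

# Or attach a suffix; the name has none yet, so one is appended.
backup_by_suffix = stuff.with_suffix(".bak")

# Plain concatenation is not defined for path objects.
try:
    stuff + "_bak"
    concat_works = True
except TypeError:
    concat_works = False
```

Note that the two calls do not produce the same name: one yields `mystuff_bak`, the other `mystuff.bak`.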

On 04/05/2016 03:55 PM, Guido van Rossum wrote:
It's been provisional since 3.4. I think if it is still there in 3.6.0 it should be considered no longer provisional. But this may indeed be a test case for the ultimate fate of provisional modules -- should we remove it?
We should either remove it or make the rest of the stdlib work with it. Currently, pathlib.*Paths are second-class citizens, and working with them is not significantly better than working with os.path.* simply because we have to cast to str every time we want to deal with any other part of the stdlib.
Would making it inherit from str cause most hostility to disappear?
I don't think that is necessary. The hostility (of which I have some) is because we can't do:

    app_root = Path(...)
    config = app_root/'settings.cfg'
    with open(config) as blah:
        # whatever

It feels like instead of addressing this basic disconnect, the answer has instead been: add that to pathlib! Which works great -- until a user or a library gets this path object and tries to use something from os on it. To come at this from a different angle: Python now has Enum; it is arguable that Path is more important, or at least much more useful. We have IntEnum whose sole purpose in life is to make it possible to (mostly) seamlessly work with the stdlib and other libraries where ints are being used to represent enumerations; and in pathlib we have . . . absolutely nothing. We have the promise of great things and wonderful usability, but in reality we have just as much pain as before -- or more if we forget to str(path) somewhere. I said that pathlib.Path does not need to inherit from str, and I still think that; however, to be a good stepping stone / transitional library I think the pathlib backport does need to have its Paths inherit from str. -- ~Ethan~
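
As a rough sketch of the casting described above, the following shows the two spellings that work whether or not the builtin `open()` and `os` functions accept Path objects directly: the pathlib method, and an explicit `str()` cast. The temporary directory and the file contents are stand-ins invented for the example.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    app_root = Path(tmp)                 # stand-in for the real application root
    config = app_root / "settings.cfg"

    # Spelling 1: the pathlib method.
    with config.open("w") as fh:
        fh.write("[main]\n")

    # Spelling 2: cast to str before handing the path to the builtin or to os.
    with open(str(config)) as fh:
        contents = fh.read()
    size = os.path.getsize(str(config))

print(repr(contents), size)
```

Every call outside pathlib needs its own `str(config)`, which is the friction the message is complaining about.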

On Tue, Apr 5, 2016 at 9:29 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
[...] we can't do:
    app_root = Path(...)
    config = app_root/'settings.cfg'
    with open(config) as blah:
        # whatever
It feels like instead of addressing this basic disconnect, the answer has instead been: add that to pathlib! Which works great -- until a user or a library gets this path object and tries to use something from os on it.
I agree that asking for config.open() isn't the right answer here (even if it happens to work). But in this example, once 3.5.2 is out, the solution would be to use open(config.path), and that will also work when passing it to a library. Is it still unacceptable then? -- --Guido van Rossum (python.org/~guido)

On 04/05/2016 10:00 PM, Guido van Rossum wrote:
On Tue, Apr 5, 2016 at 9:29 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
[...] we can't do:
    app_root = Path(...)
    config = app_root/'settings.cfg'
    with open(config) as blah:
        # whatever
It feels like instead of addressing this basic disconnect, the answer has instead been: add that to pathlib! Which works great -- until a user or a library gets this path object and tries to use something from os on it.
I agree that asking for config.open() isn't the right answer here (even if it happens to work). But in this example, once 3.5.2 is out, the solution would be to use open(config.path), and that will also work when passing it to a library. Is it still unacceptable then?
On the one hand that is definitely more palatable. On the other hand it doesn't address having the stdlib itself directly support Path. On the gripping hand this feels reminiscent of the arguments over bytes vs unicode, but without any of the "This is why unicode is better!" bits.

Why is pathlib better than plain strings?
- attribute access to different parts such as the dirname, the filename, the extension (suffix)
- easy access to on-disk answers such as .exists(), .stat(), .chdir
- easy creation/modification of Path objects

What problem is it solving that makes the pain worth dealing with?
- no idea

This is an especially important point considering the str-derived Path libraries already out there that have the same advantages as pathlib, but none of the pain. -- ~Ethan~
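
The advantages listed above can be sketched in a few lines. The directory and file names here (`reports`, `summary.txt`, `details.txt`) are made up for the illustration, and `write_text()` assumes Python 3.5 or later.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    report = Path(tmp) / "reports" / "summary.txt"

    # Attribute access to the parts of the path.
    parts = (report.parent.name, report.name, report.stem, report.suffix)

    # On-disk answers.
    existed_before = report.exists()
    report.parent.mkdir()
    report.write_text("ok")
    size = report.stat().st_size

    # Deriving a new path from an existing one.
    sibling_name = report.with_name("details.txt").name

print(parts, existed_before, size, sibling_name)
```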

On 6 April 2016 at 06:00, Guido van Rossum <guido@python.org> wrote:
On Tue, Apr 5, 2016 at 9:29 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
[...] we can't do:
    app_root = Path(...)
    config = app_root/'settings.cfg'
    with open(config) as blah:
        # whatever
It feels like instead of addressing this basic disconnect, the answer has instead been: add that to pathlib! Which works great -- until a user or a library gets this path object and tries to use something from os on it.
I agree that asking for config.open() isn't the right answer here (even if it happens to work). But in this example, once 3.5.2 is out, the solution would be to use open(config.path), and that will also work when passing it to a library. Is it still unacceptable then?
My sense is that this will remain unacceptable to those people who have a problem here. The issue is not so much the ugliness of the code (in spite of the fact that this is what people focus on) but rather the disconnect between the mental model people have and the reality of the code they have to write. The basic idea behind pathlib.Path objects is that they represent a *path*. And when you call open, you should pass it a path. So (the argument goes) why should you have to convert the path you have (a Path object) to pass it to a function (like open) that requires a path argument? Making stdlib functions work with Path objects would fix a lot of the conceptual difficulties here. And it would also mean that (thanks to duck typing) a lot of 3rd party code would work without change, further alleviating the issue. But ultimately, there will still be code that needs changing to be aware of Path objects. The change is simple enough (patharg = str(patharg), or the getattr('path') approach) but it's a change in mental model (this time by library authors) and the benefit of the change is not sufficiently obvious. Inheriting from str is the commonly-proposed solution, because in practical terms it works. But it does so by mixing layers of abstraction in a way that is difficult to explain to someone who thinks of a "path" as an abstract object rather than as a (text? byte?) string. Ultimately, all that's happening is that the burden of keeping the abstractions separate is placed on the design, rather than being explicit in the code. But while I have no evidence that this is a problem, it does leave me with a nagging feeling that it "seems similar to the bytes/text issue".

My feelings:
- I'd *like* to push for the cleaner separation of abstractions that a "pure" Path object provides.
- It does need library writers (and in particular the stdlib) to "buy into" the model and make changes to support Path objects
- I don't have a huge problem with using str(p) or p.path as a workaround during the transition, but that's from the POV of throwaway scripting. I'm not sure I'd be so happy using the workaround in code that would need to be supported for a long time.
- I'd rather compromise on principles than abandon the idea of a stdlib Path object
- In practical terms, inheriting from str is probably fine. At least evidence from 3rd party path libraries indicates so.

Paul
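
The one-line change for library authors mentioned above (patharg = str(patharg), or the getattr('path') approach) might look roughly like the sketch below. `as_path_string` is a hypothetical helper name, and `.path` is the attribute proposed in this thread; the code does not assume that attribute exists on any given Python release and falls back to `str()` when it is absent.

```python
from pathlib import PurePath, PurePosixPath


def as_path_string(patharg):
    """Hypothetical helper: return a plain str for str or pathlib inputs."""
    # Prefer the attribute proposed in the thread, if the object has one.
    candidate = getattr(patharg, "path", patharg)
    # Otherwise fall back to an explicit cast for pathlib objects.
    if isinstance(candidate, PurePath):
        candidate = str(candidate)
    return candidate


print(as_path_string("/etc/hosts"))
print(as_path_string(PurePosixPath("/etc") / "hosts"))
```

A library that calls such a helper at its entry points can keep using os.path internally while accepting either kind of argument.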

On 06.04.2016 07:00, Guido van Rossum wrote:
On Tue, Apr 5, 2016 at 9:29 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
[...] we can't do:
    app_root = Path(...)
    config = app_root/'settings.cfg'
    with open(config) as blah:
        # whatever
It feels like instead of addressing this basic disconnect, the answer has instead been: add that to pathlib! Which works great -- until a user or a library gets this path object and tries to use something from os on it. I agree that asking for config.open() isn't the right answer here (even if it happens to work).
How come?
But in this example, once 3.5.2 is out, the solution would be to use open(config.path), and that will also work when passing it to a library. Is it still unacceptable then?
I think so. Although in this example I would prefer the shorter config.open alternative as I am lazy. I still cannot remember what the concrete issue was why we dropped pathlib the same day we gave it a try. It was something really stupid and although I hoped to reduce the size of the code, it was less readable. But it was not the path->str issue but something more mundane. It was something that forced us to use os[.path] as Path didn't provide something equivalent. Cannot remember..... Best, Sven

On 04/06/2016 01:47 PM, Sven R. Kunze wrote:
I still cannot remember what the concrete issue was why we dropped pathlib the same day we gave it a try. It was something really stupid and although I hoped to reduce the size of the code, it was less readable. But it was not the path->str issue but something more mundane. It was something that forced us to use os[.path] as Path didn't provide something equivalent. Cannot remember.....
I'm willing to guess that if you had been able to just call os.whatever(your_path_obj) it would have been at most a minor annoyance. -- ~Ethan~

Yeah, sure. But it was more like this on a single line: os.missing1(str(our_path.something1)) *** os.missing2(str(our_path.something1)) *** os.missing1(str(our_path.something1)) And then it started to get messy because you need to work on a single long line or you need to open more than one line. It was a simple thing actually. Like repeating the same calls to pathlib just because we need to switch to os.path.... I will ask my colleague if he remembers or if we can recover the code tomorrow... Best, Sven NOTE to myself: getting old, need to write down everything On 06.04.2016 23:03, Ethan Furman wrote:
On 04/06/2016 01:47 PM, Sven R. Kunze wrote:
I still cannot remember what the concrete issue was why we dropped pathlib the same day we gave it a try. It was something really stupid and although I hoped to reduce the size of the code, it was less readable. But it was not the path->str issue but something more mundane. It was something that forced us to use os[.path] as Path didn't provide something equivalent. Cannot remember.....
I'm willing to guess that if you had been able to just call
os.whatever(your_path_obj)
it would have been at most a minor annoyance.
-- ~Ethan~

Le 06/04/2016 22:47, Sven R. Kunze a écrit :
On 06.04.2016 07:00, Guido van Rossum wrote:
On Tue, Apr 5, 2016 at 9:29 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
[...] we can't do:
    app_root = Path(...)
    config = app_root/'settings.cfg'
    with open(config) as blah:
        # whatever
It feels like instead of addressing this basic disconnect, the answer has instead been: add that to pathlib! Which works great -- until a user or a library gets this path object and tries to use something from os on it. I agree that asking for config.open() isn't the right answer here (even if it happens to work).
How come?
But in this example, once 3.5.2 is out, the solution would be to use open(config.path), and that will also work when passing it to a library. Is it still unacceptable then?
I think so. Although in this example I would prefer the shorter config.open alternative as I am lazy.
I still cannot remember what the concrete issue was why we dropped pathlib the same day we gave it a try. It was something really stupid and although I hoped to reduce the size of the code, it was less readable. But it was not the path->str issue but something more mundane. It was something that forced us to use os[.path] as Path didn't provide something equivalent. Cannot remember.....
Path objects don't have splitext() and don't allow "string" / path. Those are the ones bugging me the most.
Best, Sven

On 04/07/2016 03:50 AM, Michel Desmoulin wrote:
Path objects don't have splitext() and don't allow "string" / path. Those are the ones bugging me the most.

--> Path('README.md')
--> p = Path('README.md')  # PosixPath('README.md')
--> '/home/ethan' / p  # PosixPath('/home/ethan/README.md')
--> p.splitext()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'PosixPath' object has no attribute 'splitext'

So, yeah, no .splitext() -- ~Ethan~

On Thu, Apr 7, 2016 at 5:50 AM, Michel Desmoulin <desmoulinmichel@gmail.com> wrote:
Path objects don't have splitext() and don't allow "string" / path. Those are the ones bugging me the most.

>>> import pathlib
>>> p = '/some/test' / pathlib.Path('path') / 'file_with.ext'
>>> p
PosixPath('/some/test/path/file_with.ext')
>>> p.parent, p.stem, p.suffix
(PosixPath('/some/test/path'), 'file_with', '.ext')
-- Zach
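
Building on the attributes shown above, a `splitext`-style helper can be sketched as follows. The `splitext` function here is a hypothetical helper written for this illustration, not part of pathlib, and it has only been compared with `os.path.splitext` on the simple names shown; unusual names may behave differently.

```python
import os.path
from pathlib import PurePosixPath


def splitext(path):
    """Hypothetical helper: (path without its final suffix, final suffix)."""
    return path.with_suffix(""), path.suffix


p = PurePosixPath("/some/test/path/file_with.ext")
root, ext = splitext(p)

# The os.path equivalent, for comparison, needs a str cast first.
os_root, os_ext = os.path.splitext(str(p))

print(root, ext)
print(os_root, os_ext)
```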

Fair enough, I stand corrected for both points. Le 07/04/2016 18:13, Zachary Ware a écrit :
On Thu, Apr 7, 2016 at 5:50 AM, Michel Desmoulin <desmoulinmichel@gmail.com> wrote:
Path objects don't have splitext() and don't allow "string" / path. Those are the ones bugging me the most.

>>> import pathlib
>>> p = '/some/test' / pathlib.Path('path') / 'file_with.ext'
>>> p
PosixPath('/some/test/path/file_with.ext')
>>> p.parent, p.stem, p.suffix
(PosixPath('/some/test/path'), 'file_with', '.ext')

On Thu, Apr 7, 2016 at 3:50 AM, Michel Desmoulin <desmoulinmichel@gmail.com> wrote:
Path objects don't have splitext()
that is useful -- let's add it. (and others if need be) -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Apr 05, 2016, at 09:29 PM, Ethan Furman wrote:
We should either remove it or make the rest of the stdlib work with it. Currently, pathlib.*Paths are second-class citizens, and working with them is not significantly better than working with os.path.* simply because we have to cast to str every time we want to deal with any other part of the stdlib.
This. I've tried to use them in a couple of projects and in many ways pathlib objects are nice to work with. But rarely can they be used exclusively. There are just too many other packages and APIs that use os.path and the two do not interoperate very well. That makes practical use of pathlib objects just too unwieldy for project-wide adoption. I don't know if inheriting them from str would fix this problem. I'm +0 on removing the provisional status of pathlib and in trying to figure out ways for them to work better with other libraries (both stdlib and 3rd party) that will continue to be os.path based for the foreseeable future. Cheers, -Barry

On Apr 5, 2016, at 3:55 PM, Guido van Rossum <guido@python.org> wrote:
It's been provisional since 3.4. I think if it is still there in 3.6.0 it should be considered no longer provisional. But this may indeed be a test case for the ultimate fate of provisional modules -- should we remove it?
I lean slightly towards removal. Having worked through the API when it was first released, I find it to be highly forgettable (i.e. I have to re-read the docs each time I've revisited it). While I haven't seen any uptake in real code, there are occasional questions about it on StackOverflow, so we do know that there is at least some interest. I'm not sure that it needs to live in the standard library though. Raymond

On 06.04.16 01:41, Brett Cannon wrote:
After a rather extensive discussion on python-ideas about pathlib.PurePath not inheriting from str, another point that came up was that the use of pathlib has been rather light. Unfortunately even the stdlib doesn't really use pathlib because it's currently marked as provisional (or at least that's why I haven't tried to use it where possible in importlib).
Do we have a plan of what is required to remove the provisional label from pathlib?
The behavior of the Path.resolve() method likely should be changed in a way that breaks backward compatibility. There is an open issue about this.
participants (36)
- Alexander Belopolsky
- Alexander Walters
- Antoine Pitrou
- Barry Warsaw
- Brett Cannon
- Chris Angelico
- Chris Barker
- Chris Barker - NOAA Federal
- Donald Stufft
- Eric Fahlgren
- Eric Snow
- Ethan Furman
- Georg Brandl
- Glenn Linderman
- Greg Ewing
- Gregory P. Smith
- Guido van Rossum
- INADA Naoki
- Koos Zevenhoven
- Michel Desmoulin
- Nathaniel Smith
- Nick Coghlan
- Nikolaus Rath
- Oleg Broytman
- Paul Moore
- Petr Viktorin
- Raymond Hettinger
- Ryan Gonzalez
- Serhiy Storchaka
- Stephen J. Turnbull
- Steven D'Aprano
- Sven R. Kunze
- Terry Reedy
- Victor Stinner
- Wes Turner
- Zachary Ware