[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

Koos Zevenhoven k7hoven at gmail.com
Mon Apr 18 17:58:59 EDT 2016


On Mon, Apr 18, 2016 at 5:03 PM, Ethan Furman <ethan at stoneleaf.us> wrote:
> On 04/18/2016 12:41 AM, Nick Coghlan wrote:
>
>> Given the variant you [Koos] suggested, what if we defined the API
>> semantics like this:
>>
>>      # Offer the simplest possible API as the public version
>>      def fspath(pathlike) -> str:
>>          return os._raw_fspath(pathlike)
>>
>>      # Expose the complexity in the "private" variant
>>      def _raw_fspath(pathlike, *, output_types = (str,)) -> (str, bytes):
>>          # Short-circuit for instances of the output type
>>          if isinstance(pathlike, output_types):
>>              return pathlike
>>          # We'd have a tidier error message here for non-path objects
>>          result = pathlike.__fspath__()
>>          if not isinstance(result, output_types):
>>              raise TypeError("argument is not and does not provide an "
>>                              "acceptable pathname")
>>          return result
>
> My initial reaction was that this was overly complex, but after thinking
> about it a couple days I /really/ like it.  It has a reasonable default for
> the 99% real-world use-case, while still allowing for custom and exact
> tailoring (for the 99% stdlib use-case ;) .
>

While it does seem we finally might be nearly there :), this still
seems to need some further discussion.

As described in that long post of mine, I suppose some third-party
code may need the variations (A-C), while it seems that in the stdlib,
most places need (str, bytes), i.e. (A), except in pathlib, which
needs (str,), i.e. (B). I'm not sure what I think about making the
variations private, even if "hiding" the bytes version is, as I said,
an important role of the public function.

Except for that type hint, there is *nothing* in the function that
might mislead the user into thinking that bytes paths are something
important in Python 3. Whether it "supports" bytes or not is a matter
of documentation. In fact, that function (assuming the name os.fspath)
could now even be documented to support this:

    patharg = os.fspath(patharg, output_types = (str, pathlib.PurePath))  # :-)

So are we still going to end up with two functions, or can we deal with one?
What should the type hint be? Something new in typing.py? How about
FSPath[...] as follows:

FSPath[bytes]  # bytes-based pathlike, including bytes
FSPath[str]    # str-based pathlike, including str

# could be extended with PurePath or some path ABC
pathstring = typing.TypeVar('pathstring', str, bytes)

So the above variation might become:

def fspathname(pathlike: FSPath[pathstring],
               *, output_types: tuple = (str,)) -> pathstring:
    # Short-circuit for instances of the output type
    if isinstance(pathlike, output_types):
        return pathlike
    # We'd have a tidier error message here for non-path objects
    result = pathlike.__fspath__()
    if not isinstance(result, output_types):
        raise TypeError("valid output type not provided via __fspath__")
    return result
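
To make the variation concrete, here is a minimal runnable sketch. It
drops the FSPath[...] annotation, since nothing in typing provides it,
and DemoPath is a hypothetical path-like class invented purely for
illustration. Looking up __fspath__ on the type and turning a missing
attribute into TypeError is one possible take on the "tidier error
message", not something settled in this thread.

```python
from typing import Tuple, Union


class DemoPath:
    """Hypothetical path-like object, used only for this illustration."""

    def __init__(self, raw: Union[str, bytes]) -> None:
        self._raw = raw

    def __fspath__(self) -> Union[str, bytes]:
        return self._raw


def fspathname(pathlike, *, output_types: Tuple[type, ...] = (str,)):
    # Short-circuit for instances of the output type
    if isinstance(pathlike, output_types):
        return pathlike
    # Assumed behaviour: objects without __fspath__ get a TypeError
    try:
        getter = type(pathlike).__fspath__
    except AttributeError:
        raise TypeError(
            "expected a path-like object, got " + type(pathlike).__name__
        ) from None
    result = getter(pathlike)
    if not isinstance(result, output_types):
        raise TypeError("valid output type not provided via __fspath__")
    return result
```

With the default output_types, fspathname(DemoPath("spam/eggs")) returns
the str it wraps, while a bytes-returning object raises TypeError unless
the caller explicitly passes output_types=(str, bytes).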

And similar type hints would apply to os.path functions. For instance,
os.path.dirname:

def dirname(p: FSPath[pathstring]) -> pathstring:
    ...
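
As a rough sketch of how such a hint could be spelled, assuming a Python
with typing.Protocol available: FSPath here is a hypothetical generic
protocol (nothing of that name exists in typing), pathstring_co is a
helper TypeVar introduced only because protocols want a covariant
parameter, and "including str" is expressed with an explicit Union
rather than by FSPath itself. The body simply delegates to
os.path.dirname.

```python
import os.path
from typing import Protocol, TypeVar, Union

pathstring = TypeVar("pathstring", str, bytes)
# Covariant twin of pathstring, needed for the protocol parameter
pathstring_co = TypeVar("pathstring_co", str, bytes, covariant=True)


class FSPath(Protocol[pathstring_co]):
    """Hypothetical stand-in for the proposed FSPath[...] hint."""

    def __fspath__(self) -> pathstring_co:
        ...


def dirname(p: Union[pathstring, FSPath[pathstring]]) -> pathstring:
    # Plain str/bytes pass straight through; path-like objects are unwrapped
    if isinstance(p, (str, bytes)):
        return os.path.dirname(p)
    return os.path.dirname(type(p).__fspath__(p))
```

The signature mentions only pathstring, so a str-based argument is
annotated as giving str back and a bytes-based one as giving bytes back.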

This would say pathstring all over and not give anyone any ideas about
bytes, unless they know what they're doing.

Complicated? Yes, typing is. But I think we will need these kinds of
hints for os.path functions anyway.

-Koos