[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

Ethan Furman ethan at stoneleaf.us
Mon Apr 11 14:28:22 EDT 2016

On 04/11/2016 10:36 AM, Brett Cannon wrote:
> On Mon, 11 Apr 2016 at 10:13 Ethan Furman wrote:

>> I'm not saying that bytes paths are common -- and if this was a
>> brand-new feature I wouldn't be pushing for it so hard;  however, bytes
>> paths are already supported and it seems to me to be much less of a
>> headache to continue the support in this new protocol instead of drawing
>> an artificial line in the sand.
> Headache for you? The stdlib? Library authors? Users of libraries? There
> are a lot of users of this who have varying levels of pain for this.

Yes, yes, maybe, maybe.  :)

>> Asked another way, what are we gaining by disallowing bytes in this new
>> way of getting paths versus the pain caused when bytes are needed and/or
>> accepted?
> Type consistency. E.g. if I pass in a DirEntry object into os.fspath()
> and I don't know what the heck I'm getting back then that can lead to
> subtle bugs [...]

> How about we take something from the "explicit is better than implicit"
> playbook and add a keyword argument to os.fspath() to allow bytes to
> pass through?
>    def fspath(path, *, allow_bytes=False):
>        if isinstance(path, str):
>            return path
>        # Allow bytearray?
>        elif allow_bytes and isinstance(path, bytes):
>            return path
>        try:
>            protocol_path = path.__fspath__()
>        except AttributeError:
>            pass
>        else:
>            # Explicit type check worth it, or better to rely on duck typing?
>            if isinstance(protocol_path, str):
>                return protocol_path
>        raise TypeError("expected a path-like object, str, or bytes (if "
>                        f"allowed), not {type(path)}")

I think that might work.  We currently have four path-related things:
bytes, str, Path, DirEntry -- two are str-only (str and Path), one is
bytes-only (bytes), and one can be either (DirEntry).

I would write the above as:

   def fspath(path, *, allow_bytes=False):
      try:
         path = path.__fspath__()
      except AttributeError:
         pass
      if isinstance(path, str):
         return path
      elif allow_bytes and isinstance(path, bytes):
         return path
      else:
         raise SomeError()

> For DirEntry users who use bytes, they will simply have to pass around
> DirEntry.path which is not as nice as simply passing around DirEntry,

If we go with the above, we allow DirEntry.__fspath__ to return bytes and
still get the type consistency of str unless the user explicitly declares
they're okay with getting either (and even then the field is narrowed
from four possible source types, or more as time goes on, to two).
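
A minimal sketch of how that would behave, for illustration only.  The
StrEntry and BytesEntry classes are hypothetical stand-ins for path
objects whose __fspath__ returns str or bytes, and TypeError is an
assumed substitute for the still-undecided SomeError:

```python
def fspath(path, *, allow_bytes=False):
    """Sketch of the proposed helper (TypeError is an assumed choice)."""
    try:
        path = path.__fspath__()
    except AttributeError:
        # Not a path object; fall through and check for plain str/bytes.
        pass
    if isinstance(path, str):
        return path
    elif allow_bytes and isinstance(path, bytes):
        return path
    else:
        raise TypeError(
            f"expected str or path object, got {type(path).__name__}"
        )


class StrEntry:
    """Hypothetical path object whose __fspath__ returns str."""

    def __init__(self, path):
        self._path = path

    def __fspath__(self):
        return self._path


class BytesEntry:
    """Hypothetical path object whose __fspath__ returns bytes."""

    def __init__(self, path):
        self._path = path

    def __fspath__(self):
        return self._path


def accepts(value, **kwargs):
    """Return True if fspath() lets the value through, False if it raises."""
    try:
        fspath(value, **kwargs)
    except TypeError:
        return False
    return True
```

With these definitions, accepts(StrEntry("spam")) is true by default,
while accepts(BytesEntry(b"spam")) is only true when allow_bytes=True is
passed, so callers that never opt in only ever see str.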

To recap, this would allow both str & bytes in __fspath__, but the 
fspath() function defaults to only allowing str through.

I can live with that.

