[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

Brett Cannon brett at python.org
Mon Apr 11 13:36:33 EDT 2016

On Mon, 11 Apr 2016 at 10:13 Ethan Furman <ethan at stoneleaf.us> wrote:

> On 04/11/2016 09:32 AM, Zachary Ware wrote:
> > On Mon, Apr 11, 2016 at 11:18 AM, Ethan Furman wrote:
> >> If those examples are anywhere close to accurate, an fspath protocol
> that
> >> supported both bytes and str seems a lot easier to work with.
> >
> > But why are you working with bytes paths in the first place? Where did
> > you get them from, and why couldn't you decode them at that boundary?
> > In 7ish years of working with Python (almost exclusively Python 3) on
> > Windows and UNIX, I have never used bytes paths on any platform.
> I'm not saying that bytes paths are common -- and if this was a
> brand-new feature I wouldn't be pushing for it so hard;  however, bytes
> paths are already supported and it seems to me to be much less of a
> headache to continue the support in this new protocol instead of drawing
> an artificial line in the sand.

Headache for you? The stdlib? Library authors? Users of libraries? There
are a lot of users of this who have varying levels of pain for this.

> Also, let me be clear that the new protocol will not adversely affect my
> own library is it directly subclasses bytes and strings (bPath and
> uPath), so they will pass through either way (or be appropriately
> rejected if the function only supports str -- are there any?) .

Well, technically it depends on whether we prefer the protocol or explicit
type checking and how we define the protocol. If we say __ospath__ has to
return str and we check for that first then that would be bad for you. If
we do isinstance() checks before calling the protocol or allow both str and
bytes then we open it up.

> This kind of feels like PEP 361 again -- the vast majority of Python
> programmers do not need %-interpolation for bytes, but what a pain in
> the rear for those that did!  (Yes, I was one of those.)  Admittedly,
> the pain from this will not be nearly as severe as that was, but why
> should we have any unnecessary pain at all?
> Asked another way, what are we gaining by disallowing bytes in this new
> way of getting paths versus the pain caused when bytes are needed and/or
> accepted?

Type consistency. E.g. if I pass in a DirEntry object into os.fspath() and
I don't know what the heck I'm getting back then that can lead to subtle
bugs, especially when you didn't check ahead of time what DirEntry.path
was. To me, that bumps up against "In the face of ambiguity, refuse the
temptation to guess". Having the type vary even when the type doesn't can
get messy if you don't expect to always vary (i.e. this isn't getattr()).

>  From my point of view the pain of simply implementing this without
> bytes support in the existing os and os.path modules is not worth
> excluding bytes.

How about we take something from the "explicit is better than implicit"
playbook and add a keyword argument to os.fspath() to allow bytes to pass

  def fspath(path, *, allow_bytes=False):
      if isinstance(path, str):
          return path
      # Allow bytearray?
      elif allow_bytes and isinstance(path, bytes):
          return path
          protocol = path.__fspath__()
      except AttributeError:
          # Explicit type check worth it, or better to rely on duck typing?
          if isinstance(protocol_path, str):
              return protocol_path
      raise TypeError("expected a path-like object, str, or bytes (if
allowed), not {type(path)}")

For DirEntry users who use bytes, they will simply have to pass around
DirEntry.path which is not as nice as simply passing around DirEntry, but
it does allow them to continue to operate without having to decode the
bytes if allow_bytes is True. We get type consistency in the protocol fas
we can continue to expect people to return strings for __fspath__. And for
those APIs where supporting bytes won't be an issue, they can explicitly
choose to support bytes or not and then not have to juggle support for both
str and bytes if they choose not to. IOW consenting adults to bytes paths
can not get cut out and have a ton of hoops to jump through as long as they
opt-in, but those adults who don't consent to bytes paths have their lives
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20160411/753a24ae/attachment-0001.html>

More information about the Python-Dev mailing list