[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

Ethan Furman ethan at stoneleaf.us
Thu Apr 14 11:25:04 EDT 2016

On 04/14/2016 07:52 AM, Stephen J. Turnbull wrote:
> Nick Coghlan writes:

>> The use case for returning bytes from __fspath__ is DirEntry, so you
>> can write things like this in low level code:
>>     def myscandir(dirpath):
>>         for entry in os.scandir(dirpath):
>>             if entry.is_file():
>>                 with open(entry) as f:
>>                     pass  # do something with f
> Excuse me, but that is *not* a use case for returning bytes from
> DirEntry.__fspath__.  open() is perfectly happy taking str (including
> surrogate-encoded raw bytes).

Replace open() with code that sends those bytes somewhere else: why
should I have to re-encode this str back to bytes, when bytes are what
I asked for in the first place?
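
To make that concrete, here is a minimal sketch of bytes-in, bytes-out
code.  The helper name send_paths and the in-memory sink are
hypothetical stand-ins for "somewhere else"; the only library behaviour
relied on is that os.scandir() called with a bytes path yields entries
whose .path is bytes.

```python
import io
import os
import tempfile


def send_paths(dirpath, sink):
    """Write the path of each regular file in dirpath to a binary sink."""
    for entry in os.scandir(dirpath):
        if entry.is_file():
            # dirpath is bytes, so entry.path is bytes too: nothing has to
            # be re-encoded before it goes out.
            sink.write(entry.path + b"\n")


with tempfile.TemporaryDirectory() as tmp:
    # Create one empty file so there is something to list.
    open(os.path.join(tmp, "data.bin"), "wb").close()

    sink = io.BytesIO()  # hypothetical stand-in for a socket or pipe
    send_paths(os.fsencode(tmp), sink)

    sent = sink.getvalue()
    expected = os.fsencode(os.path.join(tmp, "data.bin")) + b"\n"
```

If the path protocol only ever handed back str, the write above would
need an os.fsencode() call to recover the bytes the caller started
with.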

> If the trivial thing is for __fspath__
> to return bytes, then implicitly applying os.fsencode to the value
> being returned is almost as trivial, and just as safe.  A low price to
> pay for ensuring that text applications don't crash just because a
> bytes-oriented object decides to implement __fspath__.

How did this application get a bytes path object to begin with?  Either
it explicitly used bytes when calling scandir and friends (in which case
it shouldn't be surprised to be working with bytes), or it got that
bytes object from a database, over the wire, from a library written in
another language, etc.  Those are the boundaries where bytes should be
transformed to str if the app doesn't want to deal with bytes (whether
for path manipulation or other text manipulation).  os.fspath() is not
a boundary function and shouldn't be used as if it were.
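
A rough sketch of the boundary idea, under stated assumptions: the
function names below are hypothetical, and os.fsdecode() is used here
as one reasonable way to do the conversion, not the only one.  The
point is that the decode happens once, where the bytes enter the
application, and the path-handling code underneath stays type-neutral.

```python
import os
import tempfile


def list_names(dirpath):
    """Return entry names in the same type (str or bytes) as dirpath."""
    return sorted(entry.name for entry in os.scandir(dirpath))


def names_from_external_source(raw_dirpath):
    """Hypothetical boundary: bytes arrive from outside, str goes inward."""
    return list_names(os.fsdecode(raw_dirpath))


with tempfile.TemporaryDirectory() as tmp:
    # Create one empty file so there is something to list.
    open(os.path.join(tmp, "a.txt"), "w").close()

    # Bytes asked for explicitly: bytes come back.
    bytes_names = list_names(os.fsencode(tmp))

    # Bytes decoded at the boundary: the rest of the app only sees str.
    str_names = names_from_external_source(os.fsencode(tmp))
```

A text-only application would call something like
names_from_external_source() and never see bytes; a bytes-oriented one
calls list_names() directly and never pays for a round trip through
str.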

> If there's any cost to defining __fspath__ as str-only, it's some
> other use case.  What consumer of __fspath__ that expects bytes but
> not str do you envision?  Is it generalizable, so that applying
> fsencode to the value of __fspath__ would lead to "unacceptably"
> widespread sprinkling of fsencode all over bytes-oriented code?

If I'm working with bytes, why would I want to work with str?  Python is 
a glue language, and Python practitioners don't always have the luxury 
of working only with text.

