[Python-Dev] pathlib - current status of discussions

Thu Apr 14 13:22:55 EDT 2016

On 14 April 2016 at 17:46, Ethan Furman <ethan at stoneleaf.us> wrote:
> On 04/14/2016 08:59 AM, Michael Mysinger via Python-Dev wrote:
>
>> I am saying that if os.path.join now accepts RichPath objects, and those
>> objects can return either str or bytes, then it's much harder to reason
>> about when I have all bytes or all strings. In essence, you will force me
>> to pre-wrap all RichPath objects in either os.fsencode(os.fspath(path)) or
>> os.fsdecode(os.fspath(path)), just so I can reason about the type. And if
>> I have to always do that wrapping then os.path.join doesn't need to accept
>> RichPath objects and call fspath at all.
>
>
> What many folks seem to be missing is that *you* (generic you) have control
> of your data.
>
> If you are not working at the bytes layer, you shouldn't be getting bytes
> objects because:
>
> - you specified str when asking for data from the OS, or
> - you transformed the incoming bytes from whatever external source
>   to str when you received them.

My experience is that (particularly with code that was originally
written for Python 2) "you have control of your data" is often an
illusion - bytes can appear in code from unexpected sources, and when
they do I'd rather see an error if I'm using code where I expect a
string. Certainly that's a bug in the code - all I'm saying is that it
should fail early rather than late.
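
As a rough sketch of what I mean by failing early (the helper name
require_str_path is made up for illustration, and I'm assuming os.fspath
behaves as currently proposed, returning str or bytes):

```python
import os


def require_str_path(path):
    """Return the str form of path, or raise TypeError right away for bytes.

    Hypothetical helper, not a proposed API.
    """
    result = os.fspath(path)  # accepts str, bytes, or objects with __fspath__
    if isinstance(result, bytes):
        raise TypeError(f"expected a str path, got bytes: {result!r}")
    return result


# Without a check, an all-bytes join succeeds and the bytes travel onward,
# so any problem only shows up wherever the result is finally used as text.
late = os.path.join(b"data", b"file.txt")

# With the check, the stray bytes value is rejected at the boundary.
try:
    require_str_path(b"file.txt")
except TypeError as exc:
    early_error = str(exc)
```

The point is only where the error surfaces, not what the final API should
look like.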

Having said this, I don't have an actual use case - but equally it
seems to me that our problem is that *nobody* does (yet) because
uptake of pathlib has been slow, thanks to limited stdlib support. My
view remains that we should get the (relatively simple and
uncontroversial) str support in place, and defer bytes support until
we have experience with that.
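
To make "str first, bytes later" concrete, here is a minimal sketch. The
names RichPath, to_path_v1 and to_path_v2 are hypothetical, and the sketch
assumes the protocol is spelled __fspath__ and consumed via os.fspath. It
only shows that every input accepted by the str-only version gives the same
answer under the relaxed version; it does not settle whether relaxing later
is practical in general.

```python
import os


class RichPath:
    """Hypothetical path object that implements the __fspath__ protocol."""

    def __init__(self, value):
        self._value = value

    def __fspath__(self):
        return self._value


def to_path_v1(obj):
    """Str-only stage: reject anything that resolves to bytes."""
    result = os.fspath(obj)
    if not isinstance(result, str):
        raise TypeError(f"str path required, got {type(result).__name__}")
    return result


def to_path_v2(obj):
    """Possible later stage: str and bytes are both accepted."""
    return os.fspath(obj)


# Inputs that were valid under v1 behave identically under v2.
same = to_path_v1(RichPath("a/b")) == to_path_v2(RichPath("a/b"))

# Inputs that v1 rejected start working under v2 instead of raising.
relaxed = to_path_v2(RichPath(b"a/b"))
```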

I'd appreciate it if anyone can clarify why "gracefully extending" the
protocol to include bytes support at a later date isn't practical.

Paul