[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
brett at python.org
Mon Apr 11 17:43:01 EDT 2016
On Mon, 11 Apr 2016 at 14:11 Ethan Furman <ethan at stoneleaf.us> wrote:
> On 04/11/2016 01:42 PM, Victor Stinner wrote:
> > 2016-04-11 21:00 GMT+02:00 Brett Cannon:
> >> I'm -0 on allowing __fspath__ to return bytes, but we can see what
> >> others think.
> > With PEP 383, a bytes filename can be stored as str using the
> > surrogateescape error handler. So DirEntry can convert a bytes path to
> > str using os.fsdecode().
> I am far from a unicode expert, but if I understand this correctly you
> are proposing that DirEntry.__whatever__ can always return a str using
> the surrogateescape (SE) method.
> However, before this SE string can be used, it would need to be
> converted back to bytes, and with the same SE method, yes? And this has
> already been implemented in the stdlib?
> So my concern in such a case is what happens if we pass this SE string
> somewhere else: a UTF-8 file, or over a socket, or into a database?
> Does this have issues that we wouldn't face if we just used bytes?
This is my worry as well, and it is why I have not proposed this kind of
universal normalizing of bytes paths using os.fsdecode() with
surrogateescape. Doing this sort of thing at the system boundary, and
documenting it as such as PEP 383 proposed, makes a bit more sense: the
expectation is more controlled and there is a clear input boundary.
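
To make the concern concrete, here is a minimal sketch of what happens to
a surrogateescape'd string once it leaves the filesystem APIs. The file
name is hypothetical, and the sketch spells out the "utf-8" codec
explicitly so it behaves the same everywhere; on POSIX, os.fsdecode() and
os.fsencode() apply the same error handler with the filesystem encoding.

```python
# Hypothetical file name encoded as Latin-1, so it is not valid UTF-8.
raw = b"caf\xe9.txt"

# Decoding with surrogateescape never fails: the undecodable byte 0xE9
# is smuggled into the str as the lone surrogate U+DCE9.
text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same error handler recovers the original bytes.
round_tripped = text.encode("utf-8", "surrogateescape")

# A strict encode, as a UTF-8 file, socket, or database layer would
# typically do, rejects the lone surrogate.
try:
    text.encode("utf-8")
except UnicodeEncodeError:
    strict_encode_failed = True
else:
    strict_encode_failed = False
```

The round trip only works when both ends agree on the error handler, which
is why keeping it at a documented system boundary seems safer than
applying it everywhere.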