[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
Nick Coghlan
ncoghlan at gmail.com
Sat Apr 16 21:28:09 EDT 2016
On 16 April 2016 at 21:21, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Nick Coghlan writes:
>
> > On 15 April 2016 at 00:52, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> > > Nick Coghlan writes:
> > >
> > > > The use case for returning bytes from __fspath__ is DirEntry, so you
> > > > can write things like this in low level code:
> > > >
> > > > def myscandir(dirpath):
> > > > for entry in os.scandir(dirpath):
> > > > if entry.is_file():
> > > > with open(entry) as f:
> > > > # do something
> > >
> > > Excuse me, but that is *not* a use case for returning bytes from
> > > DirEntry.__fspath__. open() is perfectly happy taking str (including
> > > surrogate-encoded rawbytes).
> >
> > That results in a different type for the file object's name:
> >
> > >>> open("README.md").name
> > 'README.md'
> > >>> open(b"README.md").name
> > b'README.md'
>
> OK, you win, __fspath__ needs to be polymorphic.
>
> But you've just shifted me to -1 on "os.fspath": it's an attractive
> nuisance.
>
> EIBTI, applications and high-level library functions should
> use os.fsdecode or os.fsencode. Functions that take a polymorphic
> argument and want preserve type should invoke __fspath__ on the
> argument. That will visually signal that the caller is not merely
> low-level, but is explicitly a boundary function.
str and bytes aren't going to implement __fspath__ (since they're only
*sometimes* path objects), so asking people to call the protocol
method directly for any purpose would be a pain.
> (You could rename
> the generic function as "os._fspath", I guess, but I *really* want to
> deprecate calling the polymorphic version in user code. _fspath can
> be added if experience shows that polymorphic usage is very desireable
> outside the stdlib. This remark is in my not-so-Dutch opinion, of
> course.)
You may have missed my email where I agreed os.fspath() itself needs
to ensure the output is a str object and throw an exception otherwise.
The remaining API design debate relates to whether the polymorphic
version should be "os.fspath(obj, allow_bytes=True)" or
"os._raw_fspath(obj)" (with Ethan favouring the former, and me the
latter).
>
> > The guarantee we want to provide those folks is that if they're
> > operating in the binary domain they'll stay there.
>
> Et tu, Nick? "Guarantee"?! You can't guarantee any such thing with
> an implicitly invoked polymorphic API like this one -- unless you
> consider a crashed program to be in the binary domain. ;-)
I do, as one of the core changes in design philosophy between Python 2
and 3 is attempting to remove the implicit level shifting between the
binary and text domains, and instead throw exceptions in those cases.
Pragmatism requires us to keep some of them (e.g. the codecs module is
officially object<->object in both Python 2 and Python 3, and string
formatting codes can still do unexpected things), but a great many of
them are already gone, and we don't want to add any new ones if
alternative designs are available.
> Note that
> the current proposala don't even do that for the binary domain, only
> for the text domain!
Folks that want to ensure they're working in the binary domain can
already do "memoryview(obj)" to ensure they have a bytes-like object
without constraining it to a specific type.
Cheers,
Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
More information about the Python-Dev
mailing list