[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
Brett Cannon
brett at python.org
Mon Apr 18 17:40:37 EDT 2016
On Mon, 18 Apr 2016 at 12:26 Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Brett Cannon writes:
>
> > If we continue with the "str is an encoding of file paths",
>
> It's not. It's a representation, but not an encoding. In Python 3,
> encoding means a representation of a character string using bytes.
> It's using "encoding" generically for "representation" that makes your
> head hurt.
>
Well, it makes *your* head hurt; for me it helped clarify some things. :)
>
> > you can then build from "bytes is an encoding of str" to get a
> > pyramid of file path encodings: Path -> str -> bytes. I don't think
> > this is in any way a controversial view.
>
> Perhaps not. But it's not particularly useful. ;-) Here's the
> pyramid I think about:
>
>         Path
>        /    \
>       /      \
>      V        V
>     str <-> bytes
>
> That is, str and bytes are interchangeable *without* any knowledge of
> paths, which are on a higher level of complexity and abstraction.
> Although in pathlib, there's an assumption that paths are serialized
> to str which is (implicitly) serialized to bytes when talking to the
> OS, this is not necessarily true for other structured path classes, in
> particular it is not true for DirEntry (which is an "enhanced
> degenerate" path containing only one path segment but also other
> useful information about the filesystem object addressed).
>
> I haven't looked at Antipathy, but I would guess from Ethan's
> promotion of bytes paths and concern with efficiency that "bytes
> antipaths" do *not* "go through" str to get to bytes, they already are
> bytes (in the sense of class inheritance).
>
> > But that's when I realized that adding __fspath__ support to
> > os.fsdecode() and os.fsencode(), they become more coercion functions
> > rather than encoding/decoding functions. It also means that
> > os.fspath() has a place when you want to say "I only want to encode a
> > file path to str" and avoid the decode bit that os.fsdecode() would do
>
> I don't understand what you're trying to say here. fsdecode currently
> does not promise to decode anything, because it's polymorphic,
> accepting str and bytes. fsdecode and fsencode already *are* coercion
> functions.
>
And they will continue to be coercion functions. My point is that because
they coerce, you cannot use them to guarantee that no str/bytes
encoding/decoding happens unless you check the arguments going into the
function yourself (i.e. "no guessing about encodings, please"). By providing
os.fspath() I can say that I do not, under any circumstances, want someone to
guess at the encoding of some bytes path to get me a string; instead I want
to start and end entirely in a world of strings. IOW os.fspath() lets me work
in such a way that the instant bytes are introduced into my code for file
paths, it triggers a TypeError.
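
To make the distinction concrete, here is a minimal sketch of the two
behaviours. The name strict_str_path is hypothetical, not a stdlib function;
it only illustrates the str-only semantics described above and is not
necessarily what os.fspath() does in any released Python. It assumes a Python
version where pathlib objects implement __fspath__.

```python
import os
from pathlib import PurePosixPath


def strict_str_path(path):
    """Hypothetical helper: return a str path or raise TypeError.

    Illustrates the "strings only, no guessing" behaviour discussed
    above; this is an assumption for illustration, not os.fspath().
    """
    if isinstance(path, str):
        return path
    # Look the method up on the type, as special methods usually are.
    fspath = getattr(type(path), "__fspath__", None)
    if fspath is not None:
        result = fspath(path)
        if isinstance(result, str):
            return result
        raise TypeError(
            "__fspath__ returned %s, expected str" % type(result).__name__
        )
    raise TypeError("expected str or path object, not %s" % type(path).__name__)


# os.fsdecode() coerces: str passes through, bytes are decoded using
# the filesystem encoding, so the caller never learns bytes were involved.
print(os.fsdecode("spam/eggs"))
print(os.fsdecode(b"spam/eggs"))

# The strict helper accepts str and path objects...
print(strict_str_path("spam/eggs"))
print(strict_str_path(PurePosixPath("spam/eggs")))

# ...but bytes fail immediately instead of being silently decoded.
try:
    strict_str_path(b"spam/eggs")
except TypeError as exc:
    print("TypeError:", exc)
```

With the coercing function the bytes path comes back as a string and nothing
tells you a decode happened; with the strict one the same input fails at the
point where bytes first enter the code.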
>
> It's this kind of semantic confusion and broken nomenclature that is
> *why* I dislike these polymorphic functions and objects so much. It
> is impossible to reason correctly about them. We're stuck with
> invoking "practicality" and muddling through. And the names mislead
> even experienced Pythonistas.
>
Yep, we are stuck with the names unless you want to propose a new name and
deprecate the old one.