[Python-ideas] Making pathlib paths inherit from str
ethan at stoneleaf.us
Wed Apr 6 13:05:47 EDT 2016
On 04/06/2016 09:04 AM, Koos Zevenhoven wrote:
> Recently, I have spent quite a lot of time thinking about what should
> be done to improve the situation of pathlib. Last week, after all
> kinds of wild ideas and trying hard every night to prove to myself
> that duck-typing-like compatibility with other path-representing
> objects is sufficient, I noticed that I had failed to prove that and
> that I had gone around a full circle back to where I started: "why
> can't paths subclass ``str``?"
> In the "Working with Path objects: p-strings?" thread, I said I was
> working on a proposal. Since it's been several days already, I think i
> should post it here and get some feedback before going any further.
> Maybe I should have done that even earlier. Anyway, there are some
> rough edges, and I will need to add links to references etc.
> So, do not hesitate to give feedback or criticism, which is especially
> appreciated it you take the time to read through the whole thing first
Fair enough, and done! :)
Overall: excellent research and interesting ideas. However, before
spending too much more time on this realize that the fate of pathlib in
the stdlib is being discussed on Python-Dev, so you may want to move
your efforts over there.
> Proposal: Making path objects inherit from ``str``.
> There is prior art in subclassing the Python ``str`` type to build a
> path object type. Packages on PyPI (TODO: list more?) that do this
> include ``path.py`` and ``antipathy``.
An honorable mention for antipathy! :)
> Specification of standard library changes
> Overriding all ``str``-specific methods
Have to be careful here -- the os module uses some of those string
methods to work with string-paths.
> Overriding ``.__add__`` to disable adding two path objects together
> Optional enabling of string methods
> Since many APIs currently have functions or methods that return paths as
> strings, existing code may expect to have all string functionality
> available on the returned objects. While most users are unlikely to use
> much of the ``str`` functionality, a library function may want to
> explicitly allow these operations on a path object that it returns.
> Therefore, the overridden ``str`` methods can be enabled by setting a
> ``._enable_str_functionality`` method on a path object as follows:
> - ``pathobj._enable_str_functionality = True #`` -- Enable ``str``
> - ``pathobj._enable_str_functionality = 'warn' #`` -- Enable ``str``
> methods, but emit a ``FutureWarning`` with the message
> ``"str method '<name>' may be disabled on paths in future versions."``
> The warning will help the API users notice that the return value is no
> longer a plain path.
This would be good for the pathlib backport, which I think should
inherit from str/unicode.
> Helping interactive python tools and IDEs
> Interactive Python tools such as Jupyter are growing in popularity. When
> they use ``dir(...)`` to give suggestions for code completion, it is
> harmful to have all the disabled ``str`` methods show up in the list,
> even if they typically would raise exceptions. Therefore, the
> ``__dir__`` method should be overridden on ``PurePath`` to only show the
> methods that are meaningful for paths.
> Older Python versions
> The ``pathlib2`` module, which provides ``pathlib`` to pre-3.4 python
> versions, can also subclass ``str``, but it should by default have
> ``._enable_str_functionality = 'warn'`` or
> ``.enable_str_functionality = True``, because the stdlib in the older
> Python-versions is not compatible with paths that have ``str``
> functionality disabled.
The 'warn' setting would be preferable.
> The future of plain string paths and ``os.path``?
> It is possible to imagine having both ``os.path`` and ``pathlib``
> coexist, as long as they co-operate well. Potentially, things like
> ``open("filename.txt")`` with a plain string argument will always be
> accepted. However, if regardless of what people use Python for, they
> slowly adopt path objects as the way to represent a path, the support
> for plain string paths may be deprecated and eventually dropped.
I don't see the stdlib ever /not/ accepting string paths.
> Literal syntax for paths: p-strings?
I don't see this as necessary.
> Another way of turning a string into a path could be to have a ``.path``
> property on ``str`` objects that instantiates and returns a path object.
And definitely not this.
> Generalized Resource Locator addresses: a-strings? l-strings?
> A generalized concept may be valuable in the future, because the
> distinction between local and remote is getting more and more vague. As
> discussed in the python-ideas thread "URLs/URIs + pathlib.Path + literal
> syntax = ?", it is possible to quite reliably distinguish common types
> of URLs from filesystem paths.
This was disproved. Just about anything can be a posix-path.
More information about the Python-ideas