On Sun, Nov 22, 2020 at 5:46 PM Chris Angelico <rosuav@gmail.com> wrote:

On Mon, Nov 23, 2020 at 6:54 AM Todd <toddrjen@gmail.com> wrote:
>
> I know enhancements to pathlib get brought up occasionally, but it doesn't look like anyone has been willing to take the initiative and see things through to completion. I am willing to keep the ball rolling here and even implement these myself. I have some suggestions and I would like to discuss them. I don't think any of them are significant enough to require a PEP. These can be split into independent threads if anyone prefers.
>

Keep 'em in one thread for now, but if any of them become too
controversial, it's probably worth narrowing the scope and spinning
off the debatable ones in their own threads.

General principle, by the way: The operations that currently exist are
the fundamental primitives, and you're asking for higher-level
operations to be made available. That might be a good summary for the
proposal. (For example, renaming one thing to another is a primitive,
but copying a file generally means opening both names, reading and
writing, and then closing.)

I think even that is debatable. I would say "read_text", "read_bytes", "write_text", and "write_bytes" are higher-level operations on top of "open" in much the same way "copy" is. "glob" and "rglob" are also higher-level operations on top of "iterdir".
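
To illustrate the comparison, here is a rough sketch of both operations written directly on top of "open". The helper names (read_text_via_open, copy_via_open) and the chunk size are made up for this example and are not part of pathlib. The real read_text and any real copy would handle more cases (errors, metadata, and so on).

```python
from pathlib import Path


def read_text_via_open(path, encoding="utf-8"):
    # Hypothetical helper: roughly the shape of Path.read_text,
    # i.e. open the file, read everything, close it again.
    with open(path, "r", encoding=encoding) as f:
        return f.read()


def copy_via_open(src, dst, chunk_size=64 * 1024):
    # Hypothetical helper: a naive copy built from the same primitive.
    # Open both names, read and write in chunks, then close.
    # No permissions or other metadata are copied.
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
```

Both are a few lines of convenience layered over the same primitive, which is the sense in which "copy" seems no higher-level than "read_text".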

And as far as I can see that only really applies to "copy". "owner" and "group" are really higher-level routines on top of the primitive "gid" and "uid", and the rest are meant to be counterparts of operations that already exist.

A few specifics:

> 1. copy
>
> The big one people keep bringing up that I strongly agree on is a "copy" method. This is really the only common file manipulation task that currently isn't possible. You can make files, read them, move them, delete them, create directories, even do less common operations like change owners or create symlinks or hard links.
>
> A common objection is that pathlib doesn't work on multiple paths. But that isn't the case. There are a ton of methods that do that, including:
>
> * symlink_to
> * link_to
> * rename
> * replace
> * glob
> * rglob
> * iterdir
> * is_relative_to
> * relative_to
> * samefile
>
> I think this is really the only common file operation that someone would need to switch to a different module to do, and it seems pretty strange to me to be able to make symbolic or hard links to a file but not straight up copy one.
>

I don't think it's so very strange (see above about primitive vs high
level), but it does seem a reasonable enhancement. (It'd need the same
caveats as on shutil.copy.)

As I said, I don't think this is any less primitive than "read_text", "read_bytes", "write_text", "write_bytes", "glob", or "rglob".

> 2. recursive remove
>
> This could be a "recursive" option to "rmdir" or a "rmtree" method (I prefer the option). The main reason for this is symmetry. It is possible to create a tree of folders (using "mkdir(parents=True)"), but once you do that you cannot remove it again in a straightforward way.
>

Absolutely agree, but not for the same reason: pruning a branch off a
directory tree is VERY easy to naively get wrong, and shutil.rmtree
has a lot of code in it to protect itself.

Another good point. The question is whether it should be its own method or an argument.
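
For comparison, the two spellings could look roughly like this. This is only a sketch written as free functions. The names and the "recursive" keyword are assumptions for discussion, not existing pathlib API, and both simply delegate to shutil.rmtree.

```python
import shutil
from pathlib import Path


def rmdir(path, recursive=False):
    # Hypothetical "option" spelling: one entry point, with the
    # dangerous behaviour only enabled by an explicit keyword.
    path = Path(path)
    if recursive:
        shutil.rmtree(path)  # removes the directory and everything below it
    else:
        path.rmdir()  # existing behaviour: fails if the directory is not empty


def rmtree(path):
    # Hypothetical "separate method" spelling: the name itself
    # signals that a whole tree is being removed.
    shutil.rmtree(path)
```

With either spelling, a tree created with mkdir(parents=True) can be removed again in one call, e.g. rmdir(top, recursive=True).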

> 4. uid and gid
>
> You can get the owner and group name of a file (with the "owner" and "group" methods), but there is no easy way to get the corresponding number.

That does seem a strange omission. If the other proposals get bogged
down in controversy, spin this one off as its own thread, as I think
it shouldn't be difficult to add it.

Sure.
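
For reference, the numbers are already reachable through "stat", so the addition would be thin. A minimal sketch follows, with the function names uid and gid being placeholders for whatever the methods end up being called:

```python
from pathlib import Path


def uid(path):
    # Numeric owner ID, taken from the stat result.
    return Path(path).stat().st_uid


def gid(path):
    # Numeric group ID, taken from the stat result.
    return Path(path).stat().st_gid
```

On Windows the st_uid and st_gid fields exist but are not meaningful, so any real method would need to decide what to do there.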

It might be worth looking at this as "making shutil support Path
objects", and then have the Path objects grow methods that delegate to
shutil. That'd avoid duplicating logic, e.g. for rmtree and copyfile.

shutil already supports Path objects. And yes, I was planning to delegate the logic to existing functions there or in "os".
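
A rough sketch of what that delegation could look like, written as a subclass purely for illustration. The class name CopyablePath and the "copy" signature are assumptions, and an actual patch would add the method to Path itself:

```python
import shutil
from pathlib import Path


# Subclass the concrete class for this platform (PosixPath or WindowsPath)
# so the sketch also works on Python versions where Path itself
# cannot be subclassed directly.
class CopyablePath(type(Path())):
    def copy(self, target):
        # Delegate to shutil.copy, which accepts path-like objects and
        # returns the path of the newly created file.
        return type(self)(shutil.copy(self, target))
```

Usage would then be along the lines of CopyablePath("a.txt").copy("b.txt"), with all of the actual copying logic and its caveats staying in shutil.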