Hi Sven,

Thanks for your support and feedback.


On Thu, Dec 31, 2020, 07:23 Sven R. Kunze <srkunze@mail.de> wrote:

Hi Todd,

my comments below. Also would offer my time for reviewing/testing if wanted.


On 22.11.20 20:53, Todd wrote:
I know enhancements to pathlib get brought up occasionally, but it doesn't look like anyone has been willing to take the initiative and see things through to completion.  I am willing to keep the ball rolling here and even implement these myself.  I have some suggestions and I would like to discuss them.  I don't think any of them are significant enough to require a PEP.  These can be split into independent threads if anyone prefers.

1. copy

The big one people keep bringing up that I strongly agree on is a "copy" method.  This is really the only common file manipulation task that currently isn't possible.  You can make files, read them, move them, delete them, create directories, even do less common operations like change owners or create symlinks or hard links. 

I really would appreciate that one. If I could throw in another detail which we needed a lot:

- atomic_copy or copy(atomic=True) whatever form you prefer

It is not as easy to achieve as it may look at first sight, especially when it comes to tempfiles and permissions. The use cases for atomic copy included scenarios with multiple parallel accesses to files, like caches in web development.


Is there already support for atomic writes in the standard library?  I am not planning on implementing anything new, only exposing existing functionality.  Adding atomic operations to the stdlib would likely require a PEP and substantial discussion of API and implementation.  I don't really have the background to do that.
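
For context, the usual way to approximate an atomic copy with what the standard library already offers is to copy into a temporary file in the destination directory and then rename it over the target. The following is only a rough sketch under that assumption; "atomic_copy" is a hypothetical name, not an existing function, and the atomicity of os.replace is only documented as a POSIX guarantee.

```python
import os
import shutil
import tempfile
from pathlib import Path


def atomic_copy(src, dst):
    """Hypothetical helper: copy src to dst via a temporary file and a rename."""
    src, dst = Path(src), Path(dst)
    # Create the temporary file next to dst so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=dst.parent, prefix=dst.name + ".", suffix=".tmp")
    os.close(fd)
    try:
        # copy2 copies the data and then the permission bits and timestamps of src.
        shutil.copy2(src, tmp_name)
        # os.replace overwrites dst if it exists; the rename is atomic on POSIX.
        os.replace(tmp_name, dst)
    except BaseException:
        # Remove the temporary file if the copy or the rename failed.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
    return dst
```

Readers of dst see either the old content or the complete new content, which is the property the cache use case needs. Details such as fsync and ownership are deliberately left out here.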

A common objection is that pathlib doesn't work on multiple paths.  But that isn't the case.  There are a ton of methods that do that, including:

   * symlink_to
   * link_to
   * rename
   * replace
   * glob
   * rglob
   * iterdir
   * is_relative_to
   * relative_to
   * samefile
 
I think this is really the only common file operation that someone would need to switch to a different module to do, and it seems pretty strange to me to be able to make symbolic or hard links to a file but not straight up copy one.
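
To illustrate how little would be involved, here is a minimal sketch of a copy operation that only wraps existing shutil functionality. It is written as a free function because the final method name and signature are still open; "copy_path" is a hypothetical name, and the choice of shutil.copy2 (which also copies metadata) over shutil.copy is an assumption.

```python
import shutil
from pathlib import Path


def copy_path(src, target):
    """Hypothetical stand-in for a Path.copy method, built on shutil.copy2."""
    # shutil.copy2 returns the path of the new file; if target is an existing
    # directory, the file is placed inside it under the source's name.
    return Path(shutil.copy2(src, target))
```

Returning the new path mirrors what rename and replace already do.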

2. recursive remove

This could be a "recursive" option to "rmdir" or a "rmtree" method (I prefer the option).  The main reason for this is symmetry.  It is possible to create a tree of folders (using "mkdir(parents=True)"), but once you do that you cannot remove it again in a straightforward way.

Importing shutil does not seem to be a big deal but I agree that it's somehow weird to be missing.

Correct me if I'm wrong, but os.path somehow is closer to OS-level operations whereas shutil basically provides all the missing convenience features that sh provided.

So, to me it boils down to the question of whether pathlib is a completely new paradigm. If so, then sure, let's add it. Additionally, I like the "batteries included" theme of Python.


Pathlib already has a number of higher-level operations besides what is in os, 

Last but not least, I tend more towards the "rmtree" method just to make it crystal clear to everyone. Maybe docs could cross-refer both methods. Tree manipulations are inherently complicated and a lot can go wrong. Symmetry is not 100% given as you might delete more than what you've created (which was a single node path).


We already have tree removal functionality that this can use internally.

As for the name, one thing to consider is that creating a tree of directories already uses an argument ("mkdir(parents=True)") rather than a separate method.

And I think the argument would need to be keyword-only to avoid accidentally invoking it.
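
A minimal sketch of the option form, assuming it simply delegates to the existing shutil.rmtree. It is written as a free function for illustration; the name "rmdir" and the keyword "recursive" are the ones suggested above, not an existing API.

```python
import shutil
from pathlib import Path


def rmdir(path, *, recursive=False):
    """Hypothetical rmdir with a keyword-only 'recursive' option."""
    path = Path(path)
    if recursive:
        # Delegate to the existing tree removal in shutil.
        shutil.rmtree(path)
    else:
        # Current behavior: only succeeds on an empty directory.
        path.rmdir()
```

Because of the bare "*", a call like rmdir(p, True) raises TypeError, so the recursive behavior cannot be triggered positionally by accident.
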

5. Stem with no suffixes

The stem property only takes off the last suffix, but even in the example given ('my/library.tar.gz') it isn't really useful because the suffix has two parts ('.tar' and '.gz').  I suggest another property, probably called "rootstem" or "basestem", that takes off all the suffixes, using the same logic as the "suffixes" property.  This is another symmetry issue: it is possible to extract all the suffixes, but not remove them.

+1

Does anybody rely on this behavior of ".stem"? It always seemed odd to me but that might be because of the use-cases I work with.

So, another possibility would be to fix "stem" to do what makes sense.


This is a backwards compatibility break and I don't want to get into the complications of doing that.  There is really no benefit to breaking backwards compatibility.  I would strongly suspect renaming a method then making a new, completely different method with the same name is not going to happen.  The burden is just too high relative to the benefits.
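
For illustration, a separate helper along the lines of the proposal could look like the sketch below, leaving "stem" untouched. It reuses the existing "suffixes" property, so it follows the same rules; "rootstem" is one of the candidate names from the proposal, not an existing attribute.

```python
from pathlib import PurePath


def rootstem(path):
    """Hypothetical helper: the final path component with all suffixes removed."""
    p = PurePath(path)
    # Total length of everything the 'suffixes' property reports, e.g. '.tar.gz'.
    suffix_len = sum(len(s) for s in p.suffixes)
    return p.name[: len(p.name) - suffix_len]


print(PurePath("my/library.tar.gz").stem)  # library.tar
print(rootstem("my/library.tar.gz"))       # library
```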

7. exist_ok for is_* methods

Currently all the is_* methods (such as is_file) return False if the file doesn't exist or if it is a broken symlink.  This can be dangerous, since it is not trivially easy to tell if you are dealing with the wrong type of file vs. a missing file.  And it isn't obvious behavior just from the method name.  I suggest adding an "exist_ok" argument to all of these, with the default being "True" for backwards-compatibility.  This argument name is already in use elsewhere in pathlib.  If this is False and the file is not present, a "FileNotFoundError" is raised.


+1

Maybe missing_ok could help more to make people understand what the parameter actually does.

exist_ok is used for creation methods (mkdir and touch). So, the name makes more sense in those contexts.

Yes, you are right. Someone else pointed out this issue too.
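
With that name, the behavior could look roughly like the following sketch for is_file. It is a free function for illustration only, and the "missing_ok" parameter is the hypothetical one discussed here, not something the is_* methods accept today.

```python
import stat
from pathlib import Path


def is_file(path, *, missing_ok=True):
    """Hypothetical is_file that can raise instead of returning False when missing."""
    try:
        # stat() follows symlinks, so a broken symlink also raises FileNotFoundError.
        st = Path(path).stat()
    except FileNotFoundError:
        if missing_ok:
            return False  # current behavior, kept as the default
        raise
    # Note: unlike the real method, other OSErrors are not swallowed in this sketch.
    return stat.S_ISREG(st.st_mode)
```

The default of True keeps existing callers working, while missing_ok=False lets a caller tell "wrong type of file" apart from "nothing there".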