[Python-Dev] pathlib+os/shutil feedback

Sven R. Kunze srkunze at mail.de
Sun Apr 10 10:07:50 EDT 2016


I talked to my colleague. He didn't remember the concrete use case, but 
he instantly mentioned three possible things (in no particular order):

1) pathlib + mtime
2) os.walk and pathlib
3) creation/removal of paths

He wasn't too sure, but I checked the docs and his memory seems to be 
correct:


-----

1) https://docs.python.org/3/library/pathlib.html#pathlib.Path.stat

High-level path objects should return high-level [insert type here] 
objects. Put differently, an API for retrieving time-stats as real 
date/time objects would be nice. I think that can be expanded to other 
pathlib methods as well, to make them less "os-wrapper"-like and provide 
added value.
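
To illustrate what I mean, here is a minimal sketch. The helper name 
"mtime_as_datetime" is made up for this example and is not an existing 
pathlib API; it only wraps what Path.stat() already returns:

```python
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def mtime_as_datetime(path):
    """Hypothetical helper: modification time as an aware UTC datetime."""
    # Path.stat().st_mtime is a plain float (seconds since the epoch).
    return datetime.fromtimestamp(Path(path).stat().st_mtime, tz=timezone.utc)


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.touch()
    # Give the file a known mtime so the result is predictable.
    os.utime(str(sample), (1000000000, 1000000000))
    stamp = mtime_as_datetime(sample)
    print(stamp.isoformat())
```

Whether such a method should return UTC or local time is an open design 
question; UTC is just an assumption of this sketch.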


-----

2) I remember a discussion on python-ideas about using "glob" or 
"rglob". However, when searching the docs for "walk" (as in "os.walk") 
or for "iter", I don't find "glob"/"rglob". I can imagine that we 
(pathlib newbies back then) simply didn't discover them.

It would be great if the docs could be improved like the following:

"""
Path.rglob(pattern)
Walk down a given path; a wrapper for "os.scandir"/"os.listdir". This is 
like calling glob() with “**” added in front of the given pattern:
"""

I think it would make "glob" and "rglob" more discoverable to new users.

NOTE: """Using the “**” pattern in large directory trees may consume an 
inordinate amount of time.""" does not sound very encouraging. This is 
especially true for "rglob", as it is defined as "like calling glob() 
with “**”".

That leads one to wonder whether "rglob" performs slow globbing instead 
of an "os.scandir"/"os.listdir" traversal.

https://docs.python.org/3/library/pathlib.html#basic-use even promotes 
"glob" with "**" right at the beginning, which rather discourages using 
"rglob" as a fast alternative to "os.walk/scandir/listdir". Renaming 
"rglob" or adding a "scan" method would definitely help here.

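For comparison, a small sketch of the os.walk idiom next to the pathlib 
one. The directory layout and file names are invented for the example, 
and it says nothing about relative speed; it only shows that the three 
spellings find the same files:

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pkg" / "sub").mkdir(parents=True)
    for rel in ("a.py", "pkg/b.py", "pkg/sub/c.py", "pkg/notes.txt"):
        (root / rel).touch()

    # os.walk: visit every directory and filter the file names by hand.
    walked = sorted(
        Path(dirpath, name).relative_to(root).as_posix()
        for dirpath, dirnames, filenames in os.walk(str(root))
        for name in filenames
        if name.endswith(".py")
    )

    # pathlib: rglob("*.py") is documented as glob() with "**" in front.
    globbed = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.py"))
    starred = sorted(p.relative_to(root).as_posix() for p in root.glob("**/*.py"))

    print(walked == globbed == starred)
```
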

-----

3) Again searching the docs for "create", "delete" (nothing found) and 
"remove", I found "Path.touch", "Path.rmdir" and "Path.unlink".

It would be great if we had an easy way to remove a complete subtree as 
with "shutil.rmtree". We mostly don't care if a directory is empty. We 
need the system to be in a state of "this path does not exist anymore".

Moreover, touching a file is good enough to "create" it if you don't 
care about changing its mtime. If you do care about its mtime, "touch" 
is a problem.
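
What we usually end up writing looks roughly like the sketch below. Both 
helper names ("remove_path" and "create_if_missing") are hypothetical and 
not part of pathlib; they combine shutil.rmtree, Path.unlink and the 
exclusive-creation open mode "x":

```python
import os
import shutil
import tempfile
from pathlib import Path


def remove_path(path):
    """Hypothetical helper: afterwards, `path` does not exist anymore."""
    path = Path(path)
    if path.is_dir() and not path.is_symlink():
        shutil.rmtree(str(path))   # directory, empty or not
    elif path.exists() or path.is_symlink():
        path.unlink()              # regular file or (possibly broken) symlink


def create_if_missing(path):
    """Hypothetical helper: create an empty file, leave an existing one alone."""
    try:
        Path(path).open("x").close()   # "x" fails if the file already exists
    except FileExistsError:
        pass                           # existing file keeps its mtime


with tempfile.TemporaryDirectory() as tmp:
    tree = Path(tmp) / "tree"
    (tree / "nested").mkdir(parents=True)
    (tree / "nested" / "file.txt").touch()
    remove_path(tree)
    remove_path(tree)                  # already gone: no error
    tree_gone = not tree.exists()

    keep = Path(tmp) / "keep.txt"
    keep.touch()
    os.utime(str(keep), (1000000000, 1000000000))
    create_if_missing(keep)
    mtime_preserved = keep.stat().st_mtime == 1000000000

    fresh = Path(tmp) / "fresh.txt"
    create_if_missing(fresh)
    fresh_created = fresh.exists()
```

In contrast, Path.touch() with its default exist_ok=True updates the 
mtime of a file that already exists.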

-----


That's it for our issues with pathlib from the past. Additionally, I 
have two further observations:

A) pathlib tries to mimic/publish some low-level APIs to its users. 
"unlink" is not something people would expect to use when they want to 
"delete" or to "remove" a file or a directory. I know where the term 
stems from but it's the wrong layer of abstraction IMHO. Same for 
"touch" or "chmod".

B) "rename" vs "replace". The difference is not really clear from the 
docs. You need to read "Path.replace" in order to understand 
"Path.rename" completely. (One raises an exception, the other doesn't, 
if I read it correctly.)
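
As far as I understand the docs, the difference only shows up when the 
target already exists. A small sketch with made-up file names; the 
outcome of the rename() call depends on the platform, so it is recorded 
rather than assumed:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src.txt"
    dst = Path(tmp) / "dst.txt"
    src.write_text("new")
    dst.write_text("old")

    # replace() overwrites an existing target file unconditionally.
    src.replace(dst)
    replaced_content = dst.read_text()
    src_still_exists = src.exists()

    # rename() onto an existing file: documented as a silent replacement
    # on Unix and as FileExistsError on Windows.
    src.write_text("newer")
    try:
        src.rename(dst)
        rename_outcome = "overwrote"
    except FileExistsError:
        rename_outcome = "raised FileExistsError"

    print(rename_outcome)
```
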


If there's some agreement to change things with respect to those 5 
points, I am willing to put some time into it.


Best,
Sven