[Python-Dev] PEP 471: scandir(fd) and pathlib.Path(name, dir_fd=None)
victor.stinner at gmail.com
Tue Jul 1 15:01:26 CEST 2014
2014-07-01 14:26 GMT+02:00 Ben Hoyt <benhoyt at gmail.com>:
> Thanks, Victor.
> I don't have any experience with dir_fd handling, so unfortunately
> can't really comment here.
> What advantages does it bring? I notice that even os.listdir() on
> Python 3.4 doesn't have anything related to file descriptors, so I'd
> be in favour of not including support.
The idea is to make sure that you get files from the same directory.
Problems occur when a directory is moved or a symlink is modified
while you are working on it. For example:
- you're browsing /tmp/copy as root (!); /tmp/copy/passwd is owned
by the www user (website)
- you would like to remove the file "passwd": call unlink("/tmp/copy/passwd")
- ... but just before that, an attacker replaces the /tmp/copy
directory with a symlink to /etc
- you will remove /etc/passwd instead of /tmp/copy/passwd, uh-oh
Using unlink("passwd", dir_fd=tmp_copy_fd), you don't have this issue.
You are sure that you are working in the directory that you opened
as /tmp/copy, even if the path has since been replaced.
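A minimal sketch of the scenario above, assuming a Unix platform where os.unlink() supports dir_fd. The temporary "copy" and "etc" directories are hypothetical stand-ins for /tmp/copy and /etc, and the rename/symlink calls play the attacker:

```python
import os
import tempfile

def demo():
    with tempfile.TemporaryDirectory() as root:
        copy = os.path.join(root, "copy")  # stand-in for /tmp/copy
        etc = os.path.join(root, "etc")    # stand-in for /etc
        for d in (copy, etc):
            os.mkdir(d)
            with open(os.path.join(d, "passwd"), "w") as f:
                f.write("x")

        # Open the directory once; O_NOFOLLOW refuses a symlink as the
        # final path component.
        dir_fd = os.open(copy, os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW)
        try:
            # The "attacker" swaps the directory for a symlink to etc.
            os.rename(copy, os.path.join(root, "moved"))
            os.symlink(etc, copy)
            # The name is resolved relative to the directory we opened,
            # not relative to whatever the path points to now.
            os.unlink("passwd", dir_fd=dir_fd)
        finally:
            os.close(dir_fd)

        victim_survived = os.path.exists(os.path.join(etc, "passwd"))
        original_removed = not os.path.exists(
            os.path.join(root, "moved", "passwd"))
        return victim_survived, original_removed
```

Calling unlink() with the full path after the swap would have removed the file in "etc" instead.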
You can imagine many other scenarios in which files get overwritten
or read this way.
Fortunately, the Linux rm command knows the unlinkat() syscall ;-)
haypo@selma$ mkdir -p a/b/c
haypo@selma$ strace -e unlinkat rm -rf a
unlinkat(5, "c", AT_REMOVEDIR) = 0
unlinkat(4, "b", AT_REMOVEDIR) = 0
unlinkat(AT_FDCWD, "a", AT_REMOVEDIR) = 0
+++ exited with 0 +++
We should implement a similar thing in shutil.rmtree().
See also os.fwalk(), which is a version of os.walk() providing dir_fd.
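As a rough illustration of the unlinkat() pattern in the trace, here is a sketch of a bottom-up removal built on os.fwalk(). The name rmtree_fd is hypothetical, this is not how shutil.rmtree() is implemented, and it assumes a Unix platform with dir_fd support:

```python
import os
import tempfile

def rmtree_fd(top):
    """Remove a tree, addressing each entry relative to its directory fd."""
    # topdown=False yields the deepest directories first, like rm -rf.
    for dirpath, dirnames, filenames, dir_fd in os.fwalk(top, topdown=False):
        for name in filenames:
            os.unlink(name, dir_fd=dir_fd)
        for name in dirnames:
            try:
                os.rmdir(name, dir_fd=dir_fd)
            except NotADirectoryError:
                # A symlink to a directory is listed in dirnames but is
                # not descended into: remove the link itself.
                os.unlink(name, dir_fd=dir_fd)
    # The top directory is still removed by path in this sketch.
    os.rmdir(top)

def demo():
    with tempfile.TemporaryDirectory() as root:
        top = os.path.join(root, "a")
        os.makedirs(os.path.join(top, "b", "c"))
        with open(os.path.join(top, "b", "file.txt"), "w") as f:
            f.write("data")
        rmtree_fd(top)
        return os.path.exists(top)
```

The final os.rmdir(top) mirrors the unlinkat(AT_FDCWD, "a", AT_REMOVEDIR) call at the end of the trace.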
> We can always add it later.
I would prefer to discuss that right now. My proposal is to accept
an int for scandir() and copy the int into DirEntry.dir_fd. It's not
that complex :-)
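To make the proposal concrete, here is a sketch of what it could look like from the caller's side. FdDirEntry, scandir_fd() and entry_stat() are hypothetical names invented for this illustration, not the PEP 471 API; the sketch relies only on os.listdir() and os.stat() accepting a directory descriptor on Unix:

```python
import os
import tempfile
from collections import namedtuple

# Hypothetical stand-in for a DirEntry that carries the directory fd.
FdDirEntry = namedtuple("FdDirEntry", ["name", "dir_fd"])

def scandir_fd(dir_fd):
    """Yield entries of an open directory, copying the fd into each one."""
    for name in os.listdir(dir_fd):
        yield FdDirEntry(name, dir_fd)

def entry_stat(entry):
    """Stat an entry relative to the directory it was listed from."""
    return os.stat(entry.name, dir_fd=entry.dir_fd, follow_symlinks=False)

def demo():
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "spam"), "w") as f:
            f.write("hello")
        dir_fd = os.open(root, os.O_RDONLY | os.O_DIRECTORY)
        try:
            return [(e.name, entry_stat(e).st_size)
                    for e in scandir_fd(dir_fd)]
        finally:
            os.close(dir_fd)
```

Each entry stays valid only while the descriptor remains open, so the caller owns the lifetime of the fd.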
The enhancement of the pathlib module can be done later. By the way, I
know that Antoine Pitrou wanted to implement file descriptors in
pathlib, but the feature was rejected or at least delayed.