[Python-Dev] PEP 471: scandir(fd) and pathlib.Path(name, dir_fd=None)

Victor Stinner victor.stinner at gmail.com
Tue Jul 1 15:01:26 CEST 2014


2014-07-01 14:26 GMT+02:00 Ben Hoyt <benhoyt at gmail.com>:
> Thanks, Victor.
>
> I don't have any experience with dir_fd handling, so unfortunately
> can't really comment here.
>
> What advantages does it bring? I notice that even os.listdir() on
> Python 3.4 doesn't have anything related to file descriptors, so I'd
> be in favour of not including support.

See https://docs.python.org/dev/library/os.html#dir-fd

The idea is to make sure that you get files from the same directory.
Problems occur when a directory is moved or a symlink is modified.
Example:

- you're browsing /tmp/test/x as root (!), /tmp/copy/passwd is owned
by www user (website)
- you would like to remove the file "x": call unlink("/tmp/copy/passwd")
- ... but just before that, an attacker replaces the /tmp/copy
directory with a symlink to /etc
- you will remove /etc/passwd instead of /tmp/copy/passwd, oh oh

Using unlink("passwd", dir_fd=tmp_copy_fd), you don't have this issue.
You are sure that you are working in /tmp/copy directory.

You can imagine a lot of other scenarios to override files and read
sensitive files.

Hopefully, the Linux rm commands knows unlinkat() sycall ;-)

haypo at selma$ mkdir -p a/b/c
haypo at selma$ strace -e unlinkat rm -rf a
unlinkat(5, "c", AT_REMOVEDIR)          = 0
unlinkat(4, "b", AT_REMOVEDIR)          = 0
unlinkat(AT_FDCWD, "a", AT_REMOVEDIR)   = 0
+++ exited with 0 +++

We should implement a similar think in shutil.rmtree().

See also os.fwalk() which is a version of os.walk() providing dir_fd.

> We can always add it later.

I would prefer to discuss that right now. My proposition is to accept
an int for scandir() and copy the int into DirEntry.dir_fd. It's not
that complex :-)

The enhancement of the pathlib module can be done later. By the way, I
know that Antoine Pitrou wanted to implemented file descriptors in
pathlib, but the feature was rejected or at least delayed.

Victor


More information about the Python-Dev mailing list