[Python-Dev] PEP 471: scandir(fd) and pathlib.Path(name, dir_fd=None)

Akira Li 4kir4.1i at gmail.com
Tue Jul 1 17:58:03 CEST 2014


Ben Hoyt <benhoyt at gmail.com> writes:

> Thanks, Victor.
>
> I don't have any experience with dir_fd handling, so unfortunately
> can't really comment here.
>
> What advantages does it bring? I notice that even os.listdir() on
> Python 3.4 doesn't have anything related to file descriptors, so I'd
> be in favour of not including support. We can always add it later.
>
> -Ben

FYI, os.listdir does support file descriptors in Python 3.3+. Try:

  >>> import os
  >>> os.listdir(os.open('.', os.O_RDONLY))

NOTE: os.supports_fd and os.supports_dir_fd are different sets.
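
A rough sketch of the difference, assuming a POSIX platform. The helper
names list_via_fd and stat_via_dir_fd are made up for illustration, and
both fall back to plain paths where the platform lacks support:

```python
import os


def list_via_fd(dirpath="."):
    """List a directory by passing an open descriptor *as the path*."""
    if os.listdir not in os.supports_fd:
        # No fd support for listdir on this platform: use the path.
        return sorted(os.listdir(dirpath))
    fd = os.open(dirpath, os.O_RDONLY)
    try:
        return sorted(os.listdir(fd))
    finally:
        os.close(fd)  # the caller owns the descriptor


def stat_via_dir_fd(dirpath, name):
    """Stat `name` *relative to* an open directory descriptor."""
    if os.stat not in os.supports_dir_fd:
        # No dir_fd support for stat on this platform: join the paths.
        return os.stat(os.path.join(dirpath, name))
    fd = os.open(dirpath, os.O_RDONLY)
    try:
        return os.stat(name, dir_fd=fd)
    finally:
        os.close(fd)
```

os.listdir has no dir_fd parameter, so it can only ever appear in
os.supports_fd; a function such as os.stat can appear in both sets.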

See also,
https://mail.python.org/pipermail/python-dev/2014-June/135265.html


--
Akira


P.S. Please, don't put your answer on top of the message you are
replying to.

>
> On Tue, Jul 1, 2014 at 3:44 AM, Victor Stinner <victor.stinner at gmail.com> wrote:
>> Hi,
>>
>> IMO we must decide whether scandir() should support file descriptors.
>> It's an important decision with a significant impact on the API.
>>
>>
>> To support scandir(fd), the minimum is to store dir_fd in DirEntry:
>> dir_fd would be None for scandir(str).
>>
>>
>> scandir(fd) must not close the file descriptor; that should be done by
>> the caller. Handling the lifetime of the file descriptor is a
>> difficult problem, so it's better to let the user decide how to handle
>> it.
>>
>> There is also the limit on open file descriptors, usually 1024 but it
>> can be lower. It *can* be an issue for very deep file
>> hierarchies.
>>
>> If we choose to support scandir(fd), it's probably safer not to use
>> scandir(fd) by default in os.walk() (use scandir(str) instead) and to
>> wait until the feature is well tested, corner cases are well known, etc.
>>
>>
>> The second step is to enhance pathlib.Path to support an optional file
>> descriptor. Path already has methods on filenames like chmod(),
>> exists(), rename(), etc.
>>
>>
>> Example:
>>
>> fd = os.open(path, os.O_DIRECTORY)
>> try:
>>     for entry in os.scandir(fd):
>>         # ... use entry to benefit from the entry cache: is_dir(), lstat_result ...
>>         path = pathlib.Path(entry.name, dir_fd=entry.dir_fd)
>>         # ... use path, which uses dir_fd ...
>> finally:
>>     os.close(fd)
>>
>> Problem: if the path object is stored somewhere and used after the
>> loop, Path methods will fail because dir_fd was closed. It's even
>> worse if a new directory reuses the same file descriptor :-/ (security
>> issue, or at least tricky bugs!)
>>
>> Victor
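
The lifetime problem described above can be reproduced today with the
existing dir_fd parameter of os.stat, without any new API. This is only
a sketch, assuming a POSIX platform and a single-threaded program; the
function name stat_after_close is made up for illustration:

```python
import errno
import os


def stat_after_close(dirpath="."):
    """Return the errno raised when a closed dir_fd is used, or None."""
    if os.stat not in os.supports_dir_fd:
        return None  # no dir_fd support here; nothing to demonstrate
    fd = os.open(dirpath, os.O_RDONLY)
    os.close(fd)  # simulates leaving the try/finally block
    try:
        os.stat(os.curdir, dir_fd=fd)  # fd is stale at this point
    except OSError as exc:
        return exc.errno
    # Reaching this line would mean the descriptor number was reused by
    # another open directory, which is the worse case mentioned above.
    return None
```

If nothing else has opened a file in between, the stale descriptor
should be rejected with errno.EBADF. If the number has been reused, the
call silently operates on whatever that descriptor now refers to.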
