[Python-Dev] Remaining decisions on PEP 471 -- os.scandir()

Akira Li 4kir4.1i at gmail.com
Mon Jul 14 07:51:24 CEST 2014


Nick Coghlan <ncoghlan at gmail.com> writes:

> On 13 Jul 2014 20:54, "Tim Delaney" <timothy.c.delaney at gmail.com> wrote:
>>
>> On 14 July 2014 10:33, Ben Hoyt <benhoyt at gmail.com> wrote:
>>>
>>>
>>>
>>> If we go with Victor's link-following .is_dir() and .is_file(), then
>>> we probably need to add his suggestion of a follow_symlinks=False
>>> parameter (defaults to True). Either that or you have to say
>>> "stat.S_ISDIR(entry.lstat().st_mode)" instead, which is a little bit
>>> less nice.
>>
>>
>> Absolutely agreed that follow_symlinks is the way to go, disagree on the
>> default value.
>>
>>>
>>> Given the above arguments for symlink-following is_dir()/is_file()
>>> methods (have I missed any, Victor?), what do others think?
>>
>>
>> I would say whichever way you go, someone will assume the opposite. IMO
>> not following symlinks by default is safer. If you follow symlinks by
>> default then everyone has the following issues:
>>
>> 1. Crossing filesystems (including onto network filesystems);
>>
>> 2. Recursive directory structures (symlink to a parent directory);
>>
>> 3. Symlinks to non-existent files/directories;
>>
>> 4. Symlink to an absolutely huge directory somewhere else (very annoying
>> if you just wanted to do a directory sizer ...).
>>
>> If follow_symlinks=False by default, only those who opt-in have to deal
>> with the above.
>
> Or the ever popular symlink to "." (or a directory higher in the tree).
>
> I think os.walk() is a good source of inspiration here: call the flag
> "followlink" and default it to False.
>

Let's not multiply entities beyond necessity.

There is already a well-defined *follow_symlinks* parameter:
https://docs.python.org/3/library/os.html#follow-symlinks
os.access, os.chown, os.link, os.stat, os.utime and many other
functions in the os module support it; see
os.supports_follow_symlinks.
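
To illustrate the existing parameter, here is a minimal sketch using
os.stat. It assumes a platform where os.symlink works without extra
privileges and where os.stat supports follow_symlinks (typical on POSIX
systems); the file names are made up for the example.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "target_dir")   # hypothetical names
    link = os.path.join(tmp, "link_to_dir")
    os.mkdir(target)
    os.symlink(target, link)  # may require extra privileges on Windows

    # Default: follow_symlinks=True, so the link is resolved to its target.
    followed = os.stat(link)
    # follow_symlinks=False stats the link itself, like os.lstat(link).
    not_followed = os.stat(link, follow_symlinks=False)

    followed_is_dir = stat.S_ISDIR(followed.st_mode)
    unfollowed_is_link = stat.S_ISLNK(not_followed.st_mode)

# The set of functions accepting the parameter is platform dependent.
stat_supported = os.stat in os.supports_follow_symlinks
print(followed_is_dir, unfollowed_is_link, stat_supported)
```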

os.walk is an exception: it uses *followlinks*. That may be because it
is an old function; the newer os.fwalk uses follow_symlinks.
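
The naming difference can be checked directly from the function
signatures. This is only a sketch: os.fwalk is not available on every
platform, so the check is guarded.

```python
import inspect
import os

# os.walk spells the flag "followlinks".
walk_params = inspect.signature(os.walk).parameters
print("os.walk:", list(walk_params))

# os.fwalk (not available everywhere) spells it "follow_symlinks".
fwalk_params = (
    inspect.signature(os.fwalk).parameters if hasattr(os, "fwalk") else None
)
if fwalk_params is not None:
    print("os.fwalk:", list(fwalk_params))
```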

------------------------------------------------------------

As has been said: os.path.isdir and pathlib.Path.is_dir in Python,
File.directory? in Ruby, System.Directory.doesDirectoryExist in Haskell,
and `test -d` in the shell all follow symlinks, i.e.,
follow_symlinks=True as the default is the more familiar behavior for
an .is_dir method.
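
For the Python entries in that list, a small sketch shows the behavior
(assuming symlinks can be created on the platform; the names are
illustrative only):

```python
import os
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "target_dir")   # hypothetical names
    link = os.path.join(tmp, "link_to_dir")
    os.mkdir(target)
    os.symlink(target, link)

    # Both checks resolve the symlink and report on its target.
    isdir_via_os_path = os.path.isdir(link)
    isdir_via_pathlib = pathlib.Path(link).is_dir()
    # The link itself is still detectable with an explicit check.
    is_link = os.path.islink(link)

print(isdir_via_os_path, isdir_via_pathlib, is_link)
```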

`cd path` in shell, os.chdir(path), `ls path`, os.listdir(path), and
os.scandir(path) itself follow symlinks (even on Windows:
http://bugs.python.org/issue13772 ). GUI file managers such as
`nautilus` also treat symlinks to directories as directories -- you may
click on them to open corresponding directories.

Only *recursive* functions such as os.walk and os.fwalk do not follow
symlinks by default, to avoid symlink loops. Note: this is consistent
with coreutils: commands such as `cp` follow symlinks for non-recursive
actions, whereas an inherently recursive utility such as `du` does not
follow symlinks by default.
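
A sketch of the loop case with os.walk, assuming symlinks can be
created; the directory layout is invented for the example. The symlink
back to the root is still listed among the directory names, but os.walk
does not descend into it unless followlinks=True is passed.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "root")           # hypothetical layout
    sub = os.path.join(root, "sub")
    os.makedirs(sub)
    os.symlink(root, os.path.join(sub, "loop"))  # symlink to a parent dir

    visited = []
    listed_in_sub = []
    for dirpath, dirnames, filenames in os.walk(root):  # followlinks=False
        rel = os.path.relpath(dirpath, root)
        visited.append(rel)
        if rel == "sub":
            listed_in_sub = list(dirnames)

print(visited)
print(listed_in_sub)
```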

Making follow_symlinks=True the default for the DirEntry.is_dir method
avoids easy-to-introduce bugs when replacing old
os.listdir/os.path.isdir code, or when writing new code with the same
mental model.
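
A sketch of such a replacement, written against os.scandir as it exists
in Python 3.6 and later, where DirEntry.is_dir takes a keyword-only
follow_symlinks argument that defaults to True. It assumes symlinks can
be created; the entry names are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "real_dir"))
    os.symlink(os.path.join(tmp, "real_dir"), os.path.join(tmp, "link_dir"))
    open(os.path.join(tmp, "file.txt"), "w").close()

    # Old style: os.listdir plus os.path.isdir (follows symlinks).
    old_style = sorted(
        name for name in os.listdir(tmp)
        if os.path.isdir(os.path.join(tmp, name))
    )

    with os.scandir(tmp) as it:
        entries = list(it)
    # New style with the default: same answer as the old code.
    new_style = sorted(e.name for e in entries if e.is_dir())
    # Opting out of link following drops the symlinked directory.
    no_follow = sorted(
        e.name for e in entries if e.is_dir(follow_symlinks=False)
    )

print(old_style, new_style, no_follow)
```

With the link-following default, a mechanical port of the old code gives
the same result; only callers who ask for follow_symlinks=False see the
difference.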


--
Akira


