[Python-ideas] shutil.symlink to allow non-race replacement of existing link targets
Steven D'Aprano
steve at pearwood.info
Mon May 13 13:38:03 EDT 2019
On Mon, May 13, 2019 at 12:31:08PM +0200, Anders Hovmöller wrote:
>
>
> > On 13 May 2019, at 11:38, Tom Hale <tom at hale.ee> wrote:
> >
> > * os.symlink() - the new link name is expected to NOT exist
> > * shutil.symlink() - the new symlink replaces an existing file
>
> This seems error prone to me. There is nothing about those names that
> hint at the difference in behavior.
One of them is in the `os` module, and therefore we should expect that
it will be a thin wrapper around the OS functionality.
The other is in the `shutil` module, and therefore we should expect that
it will be a shell utility function of arbitrary complexity.
Being found in different modules should always be treated as a sign that
the functions may be different. (If they're the same, why do we need
both?)
> An optional "overwrite_if_exists=False" flag seems much nicer.
Aside from the argument name being too verbose, that violates the rule
of thumb "avoid constant bool flags" design principle.
(Note that this is a *rule of thumb*, not a hard law: there are times
that it can and should be violated; its also not a well-known principle
and so lots of APIs violate it even when they shouldn't.)
If you have a function which takes a boolean argument to swap between
two different modes, and the usual calling pattern is to pass a
constant:
function(arg, x, y, flag=True)
rather than a flag which is not know until runtime, then it is (usually)
better to use two distinct functions rather than one function with a
parameter that swaps modes.
Another way to put it: in general, if you have two modes, you should
have two functions.
For example:
- str.find and rfind, not str.find with a from_right parameter;
- bisect.insort_left and insort_right, not bisect.insort with a
from_left parameter;
- statistics.stdev and pstdev rather than population parameter;
- os.setuid and setgid rather than a change_group parameter;
- zip and itertools.zip_longest, not zip with longest parameter.
This principle doesn't apply when the flag is typically not known until
runtime. For example, this is unfortunate:
if get_some_flag(*args):
result = str.find(spam)
else:
result = str.rfind(spam)
but rare. If it were very common, then it would be justified to provide
a single API with a mode flag switching behaviours.
The idea of "no constant bool arguments" is that if you typically
know which mode you want at edit-time (that includes runtime in
interactive environments), the two modes should be distinguished by
function name rather than by a parameter.
I expect that, typically, users of this will know ahead of time whether
they want to overwrite symlinks or not.
--
Steven
More information about the Python-ideas
mailing list