[Python-ideas] Implement POSIX ln via shutil.link and shutil.symlink

Tom Hale tom at hale.ee
Sun Jun 2 09:08:33 EDT 2019


On 1/6/19 2:06 pm, Serhiy Storchaka wrote:
> 30.05.19 00:22, Barry wrote:
>> Serhiy, I think, is conflating two things.
>> 1. How to write software robust against attack.
>> 2. How to replace a symlink atomically.
> 
> Why do you need to replace a symlink atomically? This is a solution, 
> what problem it solves?

Note that the issue is not limited to symlinks: the analogous problem 
occurs with hard links.

Atomicity is generally considered to be a Good Thing. Atomicity 
eliminates race conditions.

Race conditions are sometimes difficult to enumerate, but here are three 
I can think of which would not occur given an atomic replace:

1) Unnecessarily Inconsistent Post-conditions
2) Where the link must always exist
3) Unhandled exception attacks

1) Unnecessarily Inconsistent Post-conditions
==============================================

Atomicity ensures that an operation has a consistent post-condition 
where that is possible. Greg did a great job of explaining this:

On 1/6/19 2:29 pm, Greg Ewing wrote:
 > Process A wants to symlink f1 --> f2, replacing any existing f1.
 >
 > Process B wants to create f1 if it doesn't already exist, or update
 > it if it does. If f1 is a symlink, the file it's linked to should
 > be updated.
 >
 > The end result should be that f1 exists and is a symlink to f2.
 >
 > If the symlink is not atomic, this can happen:
 >
 > 1. Process A sees that f1 already exists and deletes it.
 > 2. Process B sees that f1 does not exist and creates a new file
 >     called f1.
 > 3. Process A tries to symlink f1 to f2, which fails because there
 >     is now an existing file called f1.
 >
 > This violates the postcondition, because f1 is not a symlink
 > to f2.
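
To make the interleaving above concrete, here is a minimal sketch that 
replays the three steps one after another in a scratch directory. It 
assumes a POSIX system. The scratch directory and the "old-target" name 
are made up for illustration, and the two "processes" are simply 
consecutive statements rather than real concurrent processes.

```python
import os
import tempfile

# Replay Greg's interleaving sequentially. f1 and f2 follow his example;
# the scratch directory and "old-target" are illustrative assumptions.
with tempfile.TemporaryDirectory() as d:
    f1 = os.path.join(d, "f1")
    f2 = os.path.join(d, "f2")
    open(f2, "w").close()
    os.symlink("old-target", f1)   # f1 starts out as a symlink

    os.remove(f1)                  # step 1: Process A deletes f1
    open(f1, "w").close()          # step 2: Process B creates a new file f1
    try:
        os.symlink(f2, f1)         # step 3: Process A tries to symlink
        outcome = "symlink created"
    except FileExistsError:
        outcome = "FileExistsError"

    # Record whether f1 ended up as a symlink, as A intended.
    f1_is_symlink = os.path.islink(f1)
```

Here step 3 fails and f1 is left as a regular file, which is exactly 
the violated postcondition Greg describes.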


2) Where the link must always exist
====================================

This is an example of atomicity ensuring that a condition remains valid 
*during* the operation.

I gave an example of this in my initial post:

On 13/5/19 4:38 pm, Tom Hale wrote:
 > It would be tempting to do:
 >
 > while True:
 >      try:
 >          os.symlink(target, link_name)
 >          break
 >      except FileExistsError:
 >          os.remove(link_name)
 >
 > But this has a race condition when replacing a symlink that should
 > *always* exist, eg:
 >
 >      /lib/critical.so -> /lib/critical.so.1.2
 >
 > When upgrading by:
 >
 >      symlink('/lib/critical.so.2.0', '/lib/critical.so')
 >
 > There is a point in time when /lib/critical.so doesn't exist.

The way I know to ensure that the well-known symlink exists at all times 
is to replace it atomically, rather than removing and recreating it.
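
A minimal sketch of such a replacement, assuming a POSIX system where 
rename() replaces the destination in a single step: create the new 
symlink under a temporary name in the same directory, then rename it 
over the well-known name. The function name replace_symlink and the 
".tmp-" naming scheme are hypothetical, chosen only for this example.

```python
import os
import secrets

def replace_symlink(target, link_name):
    """Point link_name at target with no moment where link_name is missing.

    Hypothetical helper for illustration; not an existing shutil function.
    """
    directory = os.path.dirname(link_name) or "."
    while True:
        # Random name in the same directory, so that the final rename
        # stays on one filesystem (assumed naming scheme).
        temp_link_name = os.path.join(
            directory, ".tmp-" + secrets.token_hex(8))
        try:
            os.symlink(target, temp_link_name)
            break
        except FileExistsError:
            continue  # name collision: try another random name
    try:
        # os.replace() renames over any existing link_name.
        os.replace(temp_link_name, link_name)
    except BaseException:
        os.remove(temp_link_name)  # don't leave the temporary link behind
        raise
```

With this shape, an upgrade such as 
replace_symlink('/lib/critical.so.2.0', '/lib/critical.so') leaves the 
old link in place until the moment the new one takes over.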


3) Unhandled exception attacks
===============================

Most people replacing a link or symlink will naively just unlink the 
existing link if it exists, then create the replacement. Most people 
won't think that an exception could occur if somehow the destination is 
recreated after unlink and before link.

Possible DoS:

Someone who doesn't have permission to kill a process, but has write 
access to the directory containing a link or symlink, could loop 
indefinitely trying to create a new file at the location of the 
(sym)link. Such a loop would cause high CPU load, but it could be hidden 
if there were a signal that a link replacement was imminent (eg the 
creation of a lockfile). Alternatively, if the link updates were known 
to occur within a very small time window (eg a cron job), this attack 
could also be feasible.


Other objections
=================

On 16/5/19 5:05 pm, Serhiy Storchaka wrote:
 >
 > Somebody can replace tmp_symlink between os.symlink() and os.rename().

I raised this in my initial post:

On 13/5/19 4:38 pm, Tom Hale wrote:
 > One issue I see with my suggested code is that the file at
 > temp_link_name could be changed before target is replaced with it.
 > This is mitigated by the randomness introduced by mktemp().
 >
 > While it is far less likely that a file is accessed with a random and
 > unknown name than with an existing known name, I seek input on a
 > solution if this is an unacceptable risk.

My solution reduces risk greatly. I am still open to suggestions to 
totally eliminate it.
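
One possible direction, offered as a sketch under assumptions rather 
than an agreed design: create the temporary link inside a private 
directory made by tempfile.mkdtemp(), which is readable, writable and 
searchable only by the creating user. Other users then cannot touch the 
temporary link before it is renamed into place. This does not protect 
against processes running as the same user or as root. The function 
name replace_symlink_private is hypothetical.

```python
import os
import tempfile

def replace_symlink_private(target, link_name):
    """Replace link_name via a temporary link in a private directory.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(link_name) or "."
    # The private directory lives next to link_name so that the rename
    # below stays on one filesystem.
    temp_dir = tempfile.mkdtemp(prefix=".tmp-", dir=directory)
    temp_link_name = os.path.join(temp_dir, "link")
    try:
        os.symlink(target, temp_link_name)
        os.replace(temp_link_name, link_name)
    except BaseException:
        # Clean up the temporary link if it was created but not renamed.
        if os.path.lexists(temp_link_name):
            os.remove(temp_link_name)
        raise
    finally:
        os.rmdir(temp_dir)  # empty by now in both the success and error paths
```

The cost is one extra mkdir and rmdir per replacement.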


Wrap-up
========

It's easier to program when one doesn't constantly need to be aware of 
edge and corner cases, instead having them handled by the standard 
library.  POSIX dictates a certain behaviour for link() and symlink(), 
but we have the opportunity to make life easier for programmers via shutil.

Atomicity is a venerable problem in computing science. We have a 
solution which massively reduces the risk of race conditions.


The larger picture
===================

Atomicity is only ONE point proposed in this post:

https://code.activestate.com/lists/python-ideas/56054/

The linked post steps back and proposes to support POSIX's ln 
interface, ie allowing multiple links to be created in a directory with 
a single invocation. Atomicity is optional in that proposal, but so 
far, there has been no discussion of the wider proposal.

In that post I also propose documentation updates (also no response).


Concerning the bug report
==========================
Please note that I have changed the bug:

https://bugs.python.org/issue36656

so that it is no longer classed as "security".  If I could change it to 
reference shutil rather than os, I would.

-- 

Regards,

Tom Hale

