Re: [Python-Dev] [Python-checkins] cpython (2.7): Fix closes issue10761: tarfile.extractall failure when symlinked files are

Hi,
I’m still educating myself about concurrency and race conditions, so I hope my naïve question won’t be just a waste of time. Here it is:
http://hg.python.org/cpython/rev/0c8bc3a0130a user: Senthil Kumaran orsenthil@gmail.com summary: Fix closes issue10761: tarfile.extractall failure when symlinked files are present.
diff --git a/Lib/tarfile.py b/Lib/tarfile.py --- a/Lib/tarfile.py +++ b/Lib/tarfile.py @@ -2239,6 +2239,8 @@ if hasattr(os, "symlink") and hasattr(os, "link"): # For systems that support symbolic and hard links. if tarinfo.issym():
if os.path.exists(targetpath):
os.unlink(targetpath)
Is there a race condition here?
Thanks Regards

On Thu, Apr 28, 2011 at 04:20:06PM +0200, Éric Araujo wrote:
if hasattr(os, "symlink") and hasattr(os, "link"): # For systems that support symbolic and hard links. if tarinfo.issym():
if os.path.exists(targetpath):
os.unlink(targetpath)
Is there a race condition here?
The lock to avoid race conditions (if you were thinking along those lines) would usually be implemented at the higher level code which is using extractall in threads.
Checking that no one else is accessing the file before unlinking may not be suitable for the library method and of course, we cannot check if someone is waiting to act on that file.

On Thu, 28 Apr 2011 22:44:50 +0800 Senthil Kumaran orsenthil@gmail.com wrote:
On Thu, Apr 28, 2011 at 04:20:06PM +0200, Éric Araujo wrote:
if hasattr(os, "symlink") and hasattr(os, "link"): # For systems that support symbolic and hard links. if tarinfo.issym():
if os.path.exists(targetpath):
os.unlink(targetpath)
Is there a race condition here?
The lock to avoid race conditions (if you were thinking along those lines) would usually be implemented at the higher level code which is using extractall in threads.
A lock would only protect only against multi-threaded use of the tarfile module, which is probably quite rare and therefore not a real concern. The kind of race condition which can happen here is if an attacker creates "targetpath" between os.path.exists and os.unlink. Whether it is an exploitable flaw would need a detailed analysis, of course.
Regards
Antoine.

On Thu, Apr 28, 2011 at 04:20:06PM +0200, Éric Araujo wrote:
if hasattr(os, "symlink") and hasattr(os, "link"): # For systems that support symbolic and hard links. if tarinfo.issym():
- if os.path.exists(targetpath):
- os.unlink(targetpath)
Is there a race condition here?
The lock to avoid race conditions (if you were thinking along those lines) would usually be implemented at the higher level code which is using extractall in threads.
A lock would only protect only against multi-threaded use of the tarfile module, which is probably quite rare and therefore not a real concern. The kind of race condition which can happen here is if an attacker creates "targetpath" between os.path.exists and os.unlink. Whether it is an exploitable flaw would need a detailed analysis, of course.
Just out of curiosity, could you please elaborate on the potential threat of this? If the "exists" condition is true, targetpath already exists, so what use there is in overwriting it? If the condition is false, unlink isn't executed, so no harm either. What am I missing?
Eli

On Fri, Apr 29, 2011 at 4:26 PM, Eli Bendersky eliben@gmail.com wrote:
On Thu, Apr 28, 2011 at 04:20:06PM +0200, Éric Araujo wrote:
The kind of race condition which can happen here is if an attacker creates "targetpath" between os.path.exists and os.unlink. Whether it is an exploitable flaw would need a detailed analysis, of course.
Just out of curiosity, could you please elaborate on the potential threat of this? If the "exists" condition is true, targetpath already exists, so what use there is in overwriting it? If the condition is false, unlink isn't executed, so no harm either. What am I missing?
That's the "detailed analysis" part. What happens if other code deletes the path, and the unlink() call subsequently fails despite the successful exists() check? Hence why exception checking (as Nadeem posted) is typically the only right way to do things that access an external environment that supports multiple concurrent processes.
For this kind of case, denial-of-service (i.e. an externally induced program crash) is likely to be the limit of the damage, so the threat isn't severe. Still worth avoiding the risk, though.
Really tricky cases can lead to all sorts of fun and games, like manipulating programs that were granted elevated privileges into executing malicious code that was put in place using only user privileges (combining "sudo" and its ilk with "python" without passing -E and -s is an unfortunately-less-than-tricky way sysadmins can shoot themselves in the foot on that front).
Cheers, Nick.

On Fri, Apr 29, 2011 at 09:52, Nick Coghlan ncoghlan@gmail.com wrote:
On Fri, Apr 29, 2011 at 4:26 PM, Eli Bendersky eliben@gmail.com wrote:
On Thu, Apr 28, 2011 at 04:20:06PM +0200, Éric Araujo wrote:
The kind of race condition which can happen here is if an attacker creates "targetpath" between os.path.exists and os.unlink. Whether it is an exploitable flaw would need a detailed analysis, of course.
Just out of curiosity, could you please elaborate on the potential threat of this? If the "exists" condition is true, targetpath already exists, so what use there is in overwriting it? If the condition is false, unlink isn't executed, so no harm either. What am I missing?
That's the "detailed analysis" part. What happens if other code deletes the path, and the unlink() call subsequently fails despite the successful exists() check? Hence why exception checking (as Nadeem posted) is typically the only right way to do things that access an external environment that supports multiple concurrent processes.
I completely understand this "other code/thread deletes the path between exists() and unlink()" case - it indeed is a race condition waiting to happen. What I didn't understand was Antoine's example of "attacker creates targetpath between os.path.exists and os.unlink", and was asking for a more detailed example, since I'm not really experienced with security-oriented thinking.
Eli

On Fri, Apr 29, 2011 at 10:02 AM, Eli Bendersky eliben@gmail.com wrote:
I completely understand this "other code/thread deletes the path between exists() and unlink()" case - it indeed is a race condition waiting to happen. What I didn't understand was Antoine's example of "attacker creates targetpath between os.path.exists and os.unlink", and was asking for a more detailed example, since I'm not really experienced with security-oriented thinking.
If targetpath is created after the os.path.exists() check, then os.unlink() will not be called, so os.symlink() will raise an exception when it sees that targetpath already exists.
On Thu, Apr 28, 2011 at 5:44 PM, Antoine Pitrou solipsis@pitrou.net wrote:
On Thu, 28 Apr 2011 17:40:05 +0200 Nadeem Vawda nadeem.vawda@gmail.com wrote:
The deletion case could be handled like this:
if tarinfo.issym():
try:
os.unlink(targetpath)
except OSError as e:
if e.errno != errno.ENOENT:
raise os.symlink(tarinfo.linkname, targetpath)
Someone can still create "targetpath" between the calls to unlink() and symlink() though.
Like I said, that code only handles the case of targetpath being deleted. I can't think of a similarly easy fix for the creation case. You could solve that particular form of the problem with something like:
if tarinfo.issym(): while True: try: os.symlink(tarinfo.linkname, targetpath) except OSError as e: if e.errno != errno.EEXIST: raise else: break try: os.unlink(targetpath) except OSError as e: if e.errno != errno.ENOENT: raise
... but that would open up another DOS vulnerability - if an attacker manages to keep re-creating targetpath in the window between unlink() and symlink(), the loop would never terminate. Also, the code is rather ugly ;-)
Cheers, Nadeem

On Thu, Apr 28, 2011 at 4:44 PM, Senthil Kumaran orsenthil@gmail.com wrote:
On Thu, Apr 28, 2011 at 04:20:06PM +0200, Éric Araujo wrote:
if hasattr(os, "symlink") and hasattr(os, "link"): # For systems that support symbolic and hard links. if tarinfo.issym():
if os.path.exists(targetpath):
os.unlink(targetpath)
Is there a race condition here?
The lock to avoid race conditions (if you were thinking along those lines) would usually be implemented at the higher level code which is using extractall in threads.
Checking that no one else is accessing the file before unlinking may not be suitable for the library method and of course, we cannot check if someone is waiting to act on that file.
I think Éric is referring to the possibility of another process creating or deleting targetpath between the calls to os.path.exists() and os.unlink(). This would result in symlink() or unlink() raising an exception.
The deletion case could be handled like this:
if tarinfo.issym(): + try: + os.unlink(targetpath) + except OSError as e: + if e.errno != errno.ENOENT: + raise os.symlink(tarinfo.linkname, targetpath)
I'm not sure what the best way of handling the creation case is. The obvious solution would be to try the above code in a loop, repeating until we succeed (or fail for a different reason), but this would not be guaranteed to terminate.
Cheers, Nadeem
participants (6)
-
Antoine Pitrou
-
Eli Bendersky
-
Nadeem Vawda
-
Nick Coghlan
-
Senthil Kumaran
-
Éric Araujo