[OT] strange interaction between open and cwd
Charles
C.Sanders at DeleteThis.Bom.GOV.AU
Mon May 3 20:16:39 EDT 2010
"Grant Edwards" <invalid at invalid.invalid> wrote in message
news:hrn3qn$nh0$2 at reader1.panix.com...
> On 2010-05-03, Chris Rebert <clp2 at rebertia.com> wrote:
>
>>> open(path) -> "IOError: [Errno 2] No such file or directory"
>>>
>>> i think that if the first of these seemingly "impossible" requests
>>> fails, it is reasonable to expect that the second one also fails. but
>>> the second one (sometimes) doesn't.
>>>
>>> i think they should always either both succeed, or both fail.
>>
>> Well, that's Unix and Worse-is-Better[1] for ya. Inelegant theoretically,
>
> I don't see how it's inelegant at all. Perhaps it's counter-intuitive
> if you don't understand how a Unix filesystem works, but the
> underlying filesystem model is very simple, regular, and elegant.
>
>> but probably makes some bit of the OS's job slightly easier and is
>> usually good enough in practice. Pragmatism is a bitch sometimes. :-)
>
I agree that the Unix file system is quite elegant, but it can be
counter-intuitive for people who are used to the "one file, one name"
paradigm.
Simplifying, if I recall correctly from Unix courses in the early 1980s,
Unix knows a file by the triple (device major number, device minor number,
inode number). Ignoring mount points, directories are simply files that
contain a map from names (components of a path) to inode numbers, assumed
to be on the same device as the directory. There can be many references to
the same inode ("hard" links). The OS keeps track of how many references
there are to a file and deletes the file when it no longer has any
references.
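You can watch this from Python with something along these lines (Unix
only; the file names are just made up for illustration):

    import os

    with open("original.txt", "w") as f:
        f.write("hello\n")
    os.link("original.txt", "alias.txt")   # second hard link to the same inode

    st = os.stat("original.txt")
    print(st.st_ino)                                  # the inode number
    print(st.st_nlink)                                # 2: two names, one file
    print(os.stat("alias.txt").st_ino == st.st_ino)   # True: same file

    os.remove("original.txt")              # drop one name; the data remains
    print(os.stat("alias.txt").st_nlink)   # 1: still reachable via alias.txt
    os.remove("alias.txt")                 # last name gone; the inode is freed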
Opening a file, or using a directory as the current working directory,
also counts as a reference. Removing a file or directory removes the
reference in the file system, but leaves the in-memory references for open
files or current directories intact.
Thus, removing a file (or directory) while one or more processes have an
open file descriptor referring to it (or have it as their current
directory) does not result in it being deleted. The in-memory references
keep it alive, but it is no longer accessible to other processes as it no
longer has a name in the file system.
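A quick sketch of that effect (Unix only; "scratch.txt" is just an
illustrative name):

    import os

    with open("scratch.txt", "w") as f:
        f.write("still here\n")

    f = open("scratch.txt")       # open descriptor = in-memory reference
    os.remove("scratch.txt")      # the name is gone from the file system

    print(os.path.exists("scratch.txt"))   # False: unreachable by name
    print(f.read())               # "still here": the inode is kept alive
    f.close()                     # last reference dropped; space reclaimed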
It is (or used to be) a common Unix idiom to create and open a temporary
file (or create a temporary directory and cd to it), and then delete it.
This ensures that no other process can access the file/directory (it no
longer has a name in the file system), and that the file (directory) will
usually be deleted by the OS when the process exits, as the only reference
to the file/directory is the open file descriptor/current directory in the
process, which disappears when the process exits. This applies even if the
normal end-of-process cleanup does not happen because of a (Unix) kill -9
or other major problem, and usually applies even in the case of a system
crash.
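A bare-bones version of the idiom looks something like this (Unix only;
the name "mytmp" is arbitrary, and tempfile.TemporaryFile does essentially
the same thing for you on POSIX systems):

    import os

    f = open("mytmp", "w+")   # create and open the scratch file
    os.remove("mytmp")        # unlink it at once: no one else can open it

    f.write("private scratch data\n")
    f.seek(0)
    print(f.read())
    # When this process exits -- even via kill -9 -- the descriptor goes
    # away and the kernel frees the now-unreferenced inode.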
In the OP's case, references to the directory have been removed from the
file system, but his process still has the current working directory
reference to it, so it has not actually been deleted. When he opens
"../abc.txt", the OS searches the current directory for ".." and finds the
inode for /home/baz/tmp, then searches that directory (/home/baz/tmp) for
abc.txt and finds it.
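Roughly reconstructing that situation (I'm guessing at the layout, so
"tmp/xyz" and "tmp/abc.txt" below are stand-ins for the OP's actual paths;
also, removing your own cwd and resolving ".." from a deleted cwd are both
somewhat OS-dependent, which may be why the OP saw it only sometimes work):

    import os

    os.makedirs("tmp/xyz")
    with open("tmp/abc.txt", "w") as f:
        f.write("data\n")

    top = os.getcwd()
    os.chdir("tmp/xyz")                        # cwd now references xyz
    os.rmdir(os.path.join(top, "tmp", "xyz"))  # remove its only name

    # Nothing named "xyz" exists any more, so spelling the path out by
    # name fails...
    try:
        open(os.path.join(top, "tmp/xyz/../abc.txt"))
    except IOError as e:
        print(e)                      # [Errno 2] No such file or directory

    # ...but ".." resolved through the (deleted) cwd can still find tmp:
    print(open("../abc.txt").read())  # may still print "data"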
Note that, on Unix, hard links and possible duplicate NFS mounts make it
impossible to guarantee a unique "name" for a file. Even on Windows,
multiple CIFS mounts can result in it being impossible to guarantee a
unique name.
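Since names are not unique, the usual "is this the same file?" test on
Unix compares the (device, inode) pair instead, which is what
os.path.samefile does -- though duplicate NFS/CIFS mounts can defeat even
that, as the same underlying file may show up under different device
numbers. Something like:

    import os

    def same_file(a, b):
        sa, sb = os.stat(a), os.stat(b)
        return (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)

    # os.path.samefile("some.txt", "other.txt") is the library spelling of
    # the same check (the names here are, again, just illustrative).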
Charles