[OT] strange interaction between open and cwd
Charles
C.Sanders at DeleteThis.Bom.GOV.AU
Mon May 3 20:16:39 EDT 2010
"Grant Edwards" <invalid at invalid.invalid> wrote in message
news:hrn3qn$nh0$2 at reader1.panix.com...
> On 2010-05-03, Chris Rebert <clp2 at rebertia.com> wrote:
>
>>> open(path) -> "IOError: [Errno 2] No such file or directory"
>>>
>>> i think that if the first of these seemingly "impossible" requests
>>> fails, it is reasonable to expect that the second one also fails. but
>>> the second one (sometimes) doesn't.
>>>
>>> i think they should always either both succeed, or both fail.
>>
>> Well, that's Unix and Worse-is-Better[1] for ya. Inelegant theoretically,
>
> I don't see how it's inelegant at all. Perhaps it's counter-intuitive
> if you don't understand how a Unix filesystem works, but the
> underlying filesystem model is very simple, regular, and elegant.
>
>> but probably makes some bit of the OS's job slightly easier and is
>> usually good enough in practice. Pragmatism is a bitch sometimes. :-)
>
I agree that the Unix file system is quite elegant, but it can be
counter-intuitive for people who are used to the "one file, one name"
paradigm.
Simplifying, if I recall correctly from Unix courses in the early 1980s,
Unix knows a file by the triple (device major number, device minor number,
inode number). Ignoring mount points, directories are simply files that
contain a map from names (components of a path) to inode numbers, assumed
to be on the same device as the directory. There can be many references to
the same inode ("hard" links). The OS keeps track of how many references
there are to a file and deletes the file when it no longer has any
references.
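You can watch this from Python with something along these lines (Unix
only; the file names are just made up for illustration):

    import os

    with open("original.txt", "w") as f:
        f.write("hello\n")
    os.link("original.txt", "alias.txt")   # second hard link to the same inode

    st = os.stat("original.txt")
    print(st.st_ino)                                  # the inode number
    print(st.st_nlink)                                # 2: two names, one file
    print(os.stat("alias.txt").st_ino == st.st_ino)   # True: same file

    os.remove("original.txt")              # drop one name; the data remains
    print(os.stat("alias.txt").st_nlink)   # 1: still reachable via alias.txt
    os.remove("alias.txt")                 # last name gone; the inode is freed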
Opening a file, or using a directory as the current working directory,
also counts as a reference. Removing a file or directory removes the
reference in the file system, but leaves the in-memory references for open
files or current directories intact.
Thus, removing a file (or directory) while one or more processes have an
open file descriptor referring to it (or have it as their current
directory) does not result in it being deleted. The in-memory references
keep it alive, but it is no longer accessible to other processes as it no
longer has a name in the file system.
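A quick sketch of that effect (Unix only; "scratch.txt" is just an
illustrative name):

    import os

    with open("scratch.txt", "w") as f:
        f.write("still here\n")

    f = open("scratch.txt")       # open descriptor = in-memory reference
    os.remove("scratch.txt")      # the name is gone from the file system

    print(os.path.exists("scratch.txt"))   # False: unreachable by name
    print(f.read())               # "still here": the inode is kept alive
    f.close()                     # last reference dropped; space reclaimed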
It is (or used to be) a common Unix idiom to create and open a temporary
file (or create a temporary directory and cd to it), and then delete it.
This ensures that no other process can access the file/directory (it no
longer has a name in the file system), and that the file (directory) will
usually be deleted by the OS when the process exits, as the only reference
to the file/directory is the open file descriptor/current directory in the
process, which disappears when the process exits. This applies even if the
normal end-of-process cleanup does not happen because of a (Unix) kill -9
or other major problem, and usually applies even in the case of a system
crash.
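A bare-bones version of the idiom looks something like this (Unix only;
the name "mytmp" is arbitrary, and tempfile.TemporaryFile does essentially
the same thing for you on POSIX systems):

    import os

    f = open("mytmp", "w+")   # create and open the scratch file
    os.remove("mytmp")        # unlink it at once: no one else can open it

    f.write("private scratch data\n")
    f.seek(0)
    print(f.read())
    # When this process exits -- even via kill -9 -- the descriptor goes
    # away and the kernel frees the now-unreferenced inode.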
In the OP's case, references to the directory have been removed from the
file system, but his process still has the current working directory
reference to it, so it has not actually been deleted. When he opens
"../abc.txt", the OS searches the current directory for ".." and finds the
inode for /home/baz/tmp, then searches that directory (/home/baz/tmp) for
abc.txt and finds it.
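Roughly reconstructing that situation (I'm guessing at the layout, so
"tmp/xyz" and "tmp/abc.txt" below are stand-ins for the OP's actual paths;
also, removing your own cwd and resolving ".." from a deleted cwd are both
somewhat OS-dependent, which may be why the OP saw it only sometimes work):

    import os

    os.makedirs("tmp/xyz")
    with open("tmp/abc.txt", "w") as f:
        f.write("data\n")

    top = os.getcwd()
    os.chdir("tmp/xyz")                        # cwd now references xyz
    os.rmdir(os.path.join(top, "tmp", "xyz"))  # remove its only name

    # Nothing named "xyz" exists any more, so spelling the path out by
    # name fails...
    try:
        open(os.path.join(top, "tmp/xyz/../abc.txt"))
    except IOError as e:
        print(e)                      # [Errno 2] No such file or directory

    # ...but ".." resolved through the (deleted) cwd can still find tmp:
    print(open("../abc.txt").read())  # may still print "data"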
Note that, on Unix, hard links and possible duplicate NFS mounts make it
impossible to guarantee a unique "name" for a file. Even on Windows,
multiple CIFS mounts can result in it being impossible to guarantee a
unique name.
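Since names are not unique, the usual "is this the same file?" test on
Unix compares the (device, inode) pair instead, which is what
os.path.samefile does -- though duplicate NFS/CIFS mounts can defeat even
that, as the same underlying file may show up under different device
numbers. Something like:

    import os

    def same_file(a, b):
        sa, sb = os.stat(a), os.stat(b)
        return (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)

    # os.path.samefile("some.txt", "other.txt") is the library spelling of
    # the same check (the names here are, again, just illustrative).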
Charles