[Twisted-Python] Re: plus mode was Re: how winnt fileops work and what to do about it
glyph@divmod.com writes:
I'm pretty sure that the real problem we're trying to solve here is caused by a stuck process keeping a .pyd file open. Indeed, if you look at the buildslave's logs, you'll see the exception is as follows:
exceptions.OSError: [Errno 13] Permission denied: 'c:\\buildslave\\win32-win32er\\W32-full2.4-win32er\\Twisted\\twisted\\protocols\\_c_urlarg.pyd'
So changing the way Twisted or its unit tests open a file is just not going to help. What matters is the way python (or.. pyrex?) opens a file.
(for context: the buildbot is currently configured to do SVN checkout/updates into one directory, then copy the tree into a second directory, then run tests on that second directory. This mode='copy' approach uses 'svn update' to minimize network bandwidth, but at the expense of doubling the disk usage with the extra copy. At the beginning of each build, the buildslave deletes the second directory with a function named rmdirRecursive() that bear provided, which does a chmod() of any mis-permissioned files before deleting them. It was an os.remove() inside this rmdirRecursive which raised the exception).
I've run into a similar problem in the past, under Solaris, using NFS, where a test case spawned off a daemon process which then didn't die when it was supposed to, somehow held on to a file (I think solaris won't let you delete a file that is being used as the backing store for an executable), and that prevented the unlink() from succeeding. In that environment, I just renamed the top-level directory to something unique, spawned off an 'rm -rf' into the background to delete the old directory if it was possible, then continued on with the next build. If the code had to try too hard to come up with a unique name, it would flag a warning that there might be a stuck process somewhere.
Perhaps we could use something similar here? Of course, the real fix would be to find a way to let the testing code kill off any stuck processes, but that'll probably be very windows-specific.
cheers, -Brian
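
For illustration, here is a minimal Python sketch of the two pieces described in the message above: a recursive delete that chmod()s entries before removing them, and the "rename aside, delete in the background" approach. This is not the buildbot's actual rmdirRecursive(); the names `rm_dir_recursive`, `unique_aside_name`, `move_aside_and_delete` and the `warn_after` threshold are hypothetical, and a background thread stands in for the spawned 'rm -rf'. The rename step assumes the platform allows renaming a directory while files under it are open, which the replies in this thread dispute for Windows.

```python
import os
import stat
import threading
import warnings


def _force_remove(full):
    # Make the entry writable first (the chmod step), then remove it.
    if os.path.islink(full):
        os.remove(full)
        return
    os.chmod(full, stat.S_IRWXU)
    if os.path.isdir(full):
        os.rmdir(full)
    else:
        os.remove(full)


def rm_dir_recursive(path):
    """Delete a tree bottom-up, fixing permissions on each entry first."""
    for root, dirs, files in os.walk(path, topdown=False):
        for name in files + dirs:
            _force_remove(os.path.join(root, name))
    os.rmdir(path)


def unique_aside_name(build_dir, warn_after=3):
    """Pick the first unused 'build_dir.old-N' name.

    Many leftovers suggest earlier deletes failed, so warn about
    a possible stuck process (threshold is an arbitrary assumption).
    """
    n = 0
    while os.path.exists("%s.old-%d" % (build_dir, n)):
        n += 1
    if n >= warn_after:
        warnings.warn("%d old build dirs remain; stuck process?" % n)
    return "%s.old-%d" % (build_dir, n)


def move_aside_and_delete(build_dir):
    """Rename build_dir out of the way, then delete it in the background."""
    aside = unique_aside_name(build_dir)
    os.rename(build_dir, aside)

    def _delete():
        try:
            rm_dir_recursive(aside)
        except OSError:
            pass  # something still holds a file; leave it for a later run

    worker = threading.Thread(target=_delete, daemon=True)
    worker.start()
    return aside, worker
```

The next build can recreate `build_dir` immediately after the rename, without waiting for the delete to finish.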
On Sat, 31 Dec 2005 14:16:39 -0800 (PST), Brian Warner <warner@lothar.com> wrote:
To clarify, there are many, many problems we are attempting to solve >:) Confusingly, they seem to be predominantly filesystem related.
exceptions.OSError: [Errno 13] Permission denied: 'c:\\buildslave\\win32-win32er\\W32-full2.4-win32er\\Twisted\\twisted\\protocols\\_c_urlarg.pyd'
This is definitely one of them. Another is that trial's test_output and test_runner try to move a directory aside and fail for some reason. Another is that some tests assert things about the behavior of files opened in 'r+b' mode, which does not behave the same way on Win32 as on POSIX. Jean-Paul
----- Original Message ----- From: "Jean-Paul Calderone" <exarkun@divmod.com> To: "Twisted general discussion" <twisted-python@twistedmatrix.com> Sent: Saturday, December 31, 2005 5:27 PM Subject: Re: [Twisted-Python] Re: plus mode was Re: how winnt fileops work and what to do about it
This is definitely one of them. Another is that trial's test_output and test_runner try to move a directory aside and fail for some reason.
they fail for the same reason it looks like. the crux of the issue is that you can't hold files open if you're going to be performing ops on an ancestor directory.
this has been resolved (as in cause found and confirmed). also, if you will excuse me for being pedantically retentive, there is no direct equivalent to ansi c file stream apis in either win32 or nt native. in fact, neither is there one on posix. nt/posix is syscalls, win32 is a libc-like layer on top of ntapi, but completely different. this weirdness is all in the libc. the issue lies, specifically, with the libc (aka c runtime as they call it) visual studio provides and its implementation of file streams. i'm willing to bet that if cpython can be built using something other than visual studio on windows, those builds do not suffer from the same issue (unless that product's authors decided to lemming microsoft when writing their libc). -p
----- Original Message ----- From: "Brian Warner" <warner@lothar.com> To: "Twisted general discussion" <twisted-python@twistedmatrix.com> Sent: Saturday, December 31, 2005 5:16 PM Subject: [Twisted-Python] Re: plus mode was Re: how winnt fileops work and what to do about it
sysinternals.com should have a utility equivalent to lsof. this is probably the best way to figure out who's doing this.
this has to do with how execution works in unices generally. it is *not* a lock - there are no compulsory locks - so while the situation is somewhat (not very, though) similar wrt effects, it's actually completely different. posix semantics dictate that you can not open a file being executed for writing and can not execute if it's open for writing; you can, however, unlink because the inode doesn't get reaped until the refcount drops to 0. this is the case on linux systems. svr4 prohibits the unlink as well, this is an svr4 extension to posix. as an interesting piece of trivia to chuckle about, the errno for these conditions is ETXTBSY aka "Text file busy". (this is funny because executables are always binary in practice).
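
A small sketch of the unlink behaviour described above, assuming a POSIX system such as linux. It uses an ordinary open file rather than a running executable, so it only illustrates the "inode survives until the last reference goes away" part; the function name `read_after_unlink` is made up for the example. On windows the `os.unlink()` call would be expected to fail while the file is open.

```python
import os
import tempfile


def read_after_unlink():
    """Unlink a file while it is still open, then read it back via the handle."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w+b") as f:
        f.write(b"still here")
        f.flush()
        os.unlink(path)  # removes the name; the inode lives on while f is open
        name_gone = not os.path.exists(path)
        f.seek(0)
        data = f.read()
    # Closing f drops the last reference, and only then is the inode reaped.
    return name_gone, data
```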
this is a valid technique, except when you're dealing with windows ;) as i mentioned in another post, renames (regardless of how high up in the tree you go) are recursive copy + recursive delete. the delete will fail. furthermore, SHFileOperation recursive deletes bail on first error, afair.
Perhaps we could use something similar here?
no, see above.
on windows, we probably want to use os.abort() and on *nix os.kill(). however, it is probably more interesting to figure out why processes are getting stuck ;) -p
----- Original Message ----- From: "Brian Warner" <warner@lothar.com> To: "Twisted general discussion" <twisted-python@twistedmatrix.com> Sent: Saturday, December 31, 2005 5:16 PM Subject: [Twisted-Python] Re: plus mode was Re: how winnt fileops work and what to do about it
on *nix, we can use os.waitpid() and os.kill(). on windows, we can use win32 api OpenProcess+WaitFor{Single,Multiple}Object[s]() and TerminateProcess. please keep in mind that killing processes on windows is not safe when they use dlls. on *nix, this is of course protected against with proper signal handling. -p
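
A rough sketch of the *nix half of this suggestion, using only os.waitpid() and os.kill(). The helper name `kill_if_stuck`, the timeout, and the poll interval are assumptions for illustration, and the pid must belong to a child of the calling process for os.waitpid() to work. The windows half (OpenProcess, WaitForSingleObject, TerminateProcess) is not shown here.

```python
import os
import signal
import time


def kill_if_stuck(pid, timeout=5.0, poll=0.05):
    """Give child `pid` up to `timeout` seconds to exit, then SIGKILL it.

    Returns the raw wait status from os.waitpid().
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # WNOHANG makes waitpid return (0, 0) if the child is still running.
        done, status = os.waitpid(pid, os.WNOHANG)
        if done == pid:
            return status
        time.sleep(poll)
    os.kill(pid, signal.SIGKILL)    # child is stuck; force it down
    _, status = os.waitpid(pid, 0)  # reap it so no zombie is left behind
    return status
```

The returned status can be inspected with os.WIFSIGNALED() and os.WTERMSIG() to tell a forced kill from a normal exit, which a test harness could use to report the stuck process.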
participants (3)
- Brian Warner
- Jean-Paul Calderone
- Paul G