Special file "nul" in Windows and os.stat
Hi, people! I'm following issue 1311: http://bugs.python.org/issue1311
There (and always talking about Windows), the OP says that in Py2.4 os.path.exists("nul") returned True and now in 2.5 it returns False. Note that "nul" is a special file, something like /dev/null.
We made some tests, and we see inconsistent behaviour in previous Python versions. For example, in Py2.3.5 on my machine I get False, as in Py2.5. But another person in the bug gets True in 2.3.3 and False in 2.5. Even the OP gets different results for the same Python 2.4 on different machines.
Right now (but I don't know exactly since when), Python relies on kernel32.dll functions to do the stat on the file (if stat raises an error, os.path.exists says that the file does not exist). Of course, if I call this function separately, I get the same behaviour.
So, the question is what we should do?:
1. Rely on the kernel32 function and behave as it says?
2. Return a fixed response for this special file "nul"?
Personally, I prefer the first one, but it changes the semantics of os.path.exists("nul") (though these semantics are not clear, as we get different behaviour in different Python versions and Windows versions).
Thank you very much! Regards,
-- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/
On Oct 24, 2007, at 4:23 PM, Facundo Batista wrote:
There (and always talking about Windows), the OP says that in Py2.4 os.path.exists("nul") returned True and now in 2.5 it returns False. Note that "nul" is a special file, something like /dev/null.
It's special, but in a different way. /dev/null really exists in the Unix filesystem; "nul" is more magical than that. What's more, it has peers: "prn", "com1" and others like that. I don't know what the right way to handle these is (I'm no Windows guru, or even regular user), but it's important to realize that the pain of the specialness isn't limited. :-) -Fred -- Fred Drake <fdrake at acm.org>
Fred Drake wrote:
On Oct 24, 2007, at 4:23 PM, Facundo Batista wrote:
There (and always talking about Windows), the OP says that in Py2.4 os.path.exists("nul") returned True and now in 2.5 it returns False. Note that "nul" is a special file, something like /dev/null.
It's special, but in a different way. /dev/null really exists in the Unix filesystem; "nul" is more magical than that.
What's more, it has peers: "prn", "com1" and others like that.
It's even worse than that, because file extensions are ignored in this magical-ness:
C:\Documents and Settings\User>type nul
C:\Documents and Settings\User>type nul.lst
C:\Documents and Settings\User>type foo.lst
The system cannot find the file specified.
I don't know what the right way to handle these is (I'm no Windows guru, or even regular user), but it's important to realize that the pain of the specialness isn't limited. :-)
http://www.microsoft.com/technet/prodtechnol/Windows2000Pro/reskit/part3/pro... gives the list as CON, AUX, COM1, COM2, COM3, COM4, LPT1, LPT2, LPT3, PRN, NUL; but I can't imagine testing against that list would be the best idea. For example, http://www.microsoft.com/technet/solutionaccelerators/cits/interopmigration/... adds CLOCK$, among others (although I don't find CLOCK$ to be special, it's rumored to be an NT only thing, and I'm running XP). So I think implementing Facundo's option 2 (test for "nul") will not work in the general case for finding "special files" (don't forget to throw in mixed case names). I hate to think of trying to match Windows' behavior if there are multiple dots in the name. I think I'd leave the current behavior of calling the kernel function, even though it varies based on Windows version (if I'm reading the issue correctly). Eric.
Fred Drake wrote:
It's special, but in a different way. /dev/null really exists in the Unix filesystem; "nul" is more magical than that.
What's more, it has peers: "prn", "com1" and others like that.
For the record, the fixed names 'aux', 'con', 'nul', and 'prn', along with the sets of 'com[0-9]' and 'lpt[0-9]' names, are reserved. And for that matter, any of those with an extension is reserved as well. These files always exist as far as I am concerned (where existence is defined by your ability to open() them).
    import os.path
    import re

    def is_special_on_win32(name):
        # Only the last path component matters. Match case-insensitively
        # and allow an optional extension (e.g. "NUL.txt").
        name = os.path.basename(name)
        return re.match(r'(nul|prn|aux|con|com[0-9]|lpt[0-9])(\..*)?$',
                        name, re.IGNORECASE) is not None
-- Scott Dial scott@scottdial.com scodial@cs.indiana.edu
So, the question is what we should do?:
1. Rely on the kernel32 function and behave as it says?
2. Return a fixed response for this special file "nul"?
Personally, I prefer the first one, but it changes the semantics of os.path.exists("nul") (though these semantics are not clear, as we get different behaviour in different Python versions and Windows versions).
Note that the same issue would exist for 'aux', 'con' and 'prn' too - 'comXX' 'lptXX' 'clock$' also seem to get special treatment. I agree it is unfortunate that the behaviour has changed, but these special names are broken enough on Windows that (1) seems the sanest thing to do. Mark
So, the question is what we should do?:
Before this question can be answered, I think we need to fully understand what precisely is happening in 2.4, and what precisely is happening in 2.5. AFAICT, it is *not* the case that Python 2.4 (indirectly) has hard-coded the names CON, PRN, NUL etc. in the C library. Instead, Python 2.4 *also* relies on kernel32 functions to determine that these files "exist". My question now is what specific kernel32 functions Python 2.4 calls to determine that NUL is a file; before that question is sufficiently answered, I don't think any action should be taken. Regards, Martin
On Oct 24, 2007 11:05 PM, "Martin v. Löwis" <martin@v.loewis.de> wrote:
So, the question is what we should do?:
Before this question can be answered, I think we need to fully understand what precisely is happening in 2.4, and what precisely is happening in 2.5.
AFAICT, it is *not* the case that Python 2.4 (indirectly) has hard-coded the names CON, PRN, NUL etc. in the C library. Instead, Python 2.4 *also* relies on kernel32 functions to determine that these files "exist".
My question now is what specific kernel32 functions Python 2.4 calls to determine that NUL is a file; before that question is sufficiently answered, I don't think any action should be taken.
os.path.exists() on win32 just calls os.stat() and decides the file doesn't exist if an error is raised. os.stat() uses the MSVCRT stat() in 2.4, but 2.5 implements it directly in terms of the Win32 API to deal with limitations in the MSVCRT implementation. The hand-rolled stat uses GetFileAttributesEx, which returns 0 for the special filenames, with an error code of "The parameter is incorrect" (87), which is why os.path.exists() claims they don't exist. Interestingly, plain old GetFileAttributes() works, and returns FILE_ATTRIBUTE_ARCHIVE for them.
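The relationship described above between exists() and stat() can be sketched in a few lines of Python. This is a simplified illustration only, not the standard library source; the name exists_via_stat is made up for the example.

```python
import os

def exists_via_stat(path):
    """Report existence purely from whether os.stat() succeeds."""
    try:
        os.stat(path)
    except OSError:
        # Any stat failure (missing file, invalid parameter, ...) is
        # reported as "does not exist", whatever the underlying cause.
        return False
    return True
```

With this shape, any change in which paths os.stat() accepts shows up directly as a change in what "exists" means.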
Chris Mellon schrieb:
On Oct 24, 2007 11:05 PM, "Martin v. Löwis" <martin@v.loewis.de> wrote:
So, the question is what we should do?:
Before this question can be answered, I think we need to fully understand what precisely is happening in 2.4, and what precisely is happening in 2.5.
AFAICT, it is *not* the case that Python 2.4 (indirectly) has hard-coded the names CON, PRN, NUL etc. in the C library. Instead, Python 2.4 *also* relies on kernel32 functions to determine that these files "exist".
My question now is what specific kernel32 functions Python 2.4 calls to determine that NUL is a file; before that question is sufficiently answered, I don't think any action should be taken.
os.path.exists() on win32 just calls os.stat() and decides the file doesn't exist if an error is raised. os.stat() uses the MSVCRT stat() in 2.4, but 2.5 implements it directly in terms of the Win32 API to deal with limitations in the MSVCRT implementation.
The hand-rolled stat uses GetFileAttributesEx, which returns 0 for the special filenames, with an error code of "The parameter is incorrect" (87), which is why os.path.exists() claims it doesn't exist.
Interestingly, plain old GetFileAttributes() works, and returns FILE_ATTRIBUTE_ARCHIVE for them.
See also a recent blog entry of Raymond Chen at http://blogs.msdn.com/oldnewthing/archive/2007/10/23/5612082.aspx Thomas
My question now is what specific kernel32 functions Python 2.4 calls to determine that NUL is a file; before that question is sufficiently answered, I don't think any action should be taken.
os.path.exists() on win32 just calls os.stat() and decides the file doesn't exist if an error is raised. os.stat() uses the MSVCRT stat() in 2.4, but 2.5 implements it directly in terms of the Win32 API to deal with limitations in the MSVCRT implementation.
That doesn't really answer the question, though - you merely state that Python 2.4 calls the CRT, but then my question is still what kernel32 functions are called to have stat on NUL succeed.
Interestingly, plain old GetFileAttributes() works, and returns FILE_ATTRIBUTE_ARCHIVE for them.
What about the other attributes (like modification time, size, etc)? Regards, Martin
On Oct 30, 2007 4:10 PM, "Martin v. Löwis" <martin@v.loewis.de> wrote:
My question now is what specific kernel32 functions Python 2.4 calls to determine that NUL is a file; before that question is sufficiently answered, I don't think any action should be taken.
os.path.exists() on win32 just calls os.stat() and decides the file doesn't exist if an error is raised. os.stat() uses the MSVCRT stat() in 2.4, but 2.5 implements it directly in terms of the Win32 API to deal with limitations in the MSVCRT implementation.
That doesn't really answer the question, though - you merely state that Python 2.4 calls the CRT, but then my question is still what kernel32 functions are called to have stat on NUL succeed.
I'm not 100% sure (it calls it through a function pointer and I'm not sure I tracked it down correctly), but I think it calls it through the C stat() function. In other words, it doesn't use any kernel32 functions directly; it calls the stat() that's exported from the MSVCRT.
Interestingly, plain old GetFileAttributes() works, and returns FILE_ATTRIBUTE_ARCHIVE for them.
What about the other attributes (like modification time, size, etc)?
GetFileAttributes() doesn't return those, just the FAT filesystem attributes. GetFileSize and GetFileTime fail.
Regards, Martin
That doesn't really answer the question, though - you merely state that Python 2.4 calls the CRT, but then my question is still what kernel32 functions are called to have stat on NUL succeed.
I'm not 100% sure (it calls it through a function pointer and I'm not sure I tracked it down correctly), but I think it calls it through the C stat() function. In other words, it doesn't use any kernel32 functions directly; it calls the stat() that's exported from the MSVCRT.
Sure - but what does stat then do when passed NUL?
GetFileAttributes() doesn't return those, just the FAT filesystem attributes. GetFileSize and GetFileTime fail.
Ok, so how does msvcrt stat() manage to fill these fields if those functions fail? Regards, Martin
2007/11/3, "Martin v. Löwis" <martin@v.loewis.de>:
GetFileAttributes() doesn't return those, just the FAT filesystem attributes. GetFileSize and GetFileTime fail.
Ok, so how does msvcrt stat() manage to fill these fields if those functions fail?
Beyond this specific question, I do not like the inconsistency of Windows with itself across time and versions. As Mark Hammond said, I think that we should rely on what Windows tells us as strictly as possible. If Windows changes its behaviour, ok, I do not think that we need to "patch" these behaviour holes. What do you think? Is it a mistake to adhere to Windows behaviour? Regards, -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/
As Mark Hammond said, I think that we should rely on what Windows tells us as strictly as possible. If Windows changes its behaviour, ok, I do not think that we need to "patch" these behaviour holes.
What do you think? Is it a mistake to adhere to Windows behaviour?
We certainly should rely on the Windows behavior. The next question then is: what exactly *is* "the Windows behavior"? Windows is not just inconsistent across versions, but apparently so even within a single version. IIUC, GetFileAttributes and FindFirstFile both claim that NUL exists, whereas GetFileAttributesEx claims that it doesn't exist, all in a single version, and all of it in the Windows API. Please understand that Python 2.4 *also* adheres to Windows behavior. Regards, Martin
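To check this comparison on a given machine, something along the following lines could be used. It is a rough ctypes sketch, not code from the bug report: the function name probe_name and the returned dictionary are invented for the illustration, it only does anything on Windows, and the answers it gives may well differ between Windows versions.

```python
import sys

def probe_name(name="nul"):
    """Return which of three Win32 calls accept `name` (None off Windows)."""
    if sys.platform != "win32":
        return None

    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)

    class WIN32_FILE_ATTRIBUTE_DATA(ctypes.Structure):
        _fields_ = [
            ("dwFileAttributes", wintypes.DWORD),
            ("ftCreationTime", wintypes.FILETIME),
            ("ftLastAccessTime", wintypes.FILETIME),
            ("ftLastWriteTime", wintypes.FILETIME),
            ("nFileSizeHigh", wintypes.DWORD),
            ("nFileSizeLow", wintypes.DWORD),
        ]

    # Declare signatures so ctypes converts arguments and results correctly.
    kernel32.GetFileAttributesW.argtypes = [wintypes.LPCWSTR]
    kernel32.GetFileAttributesW.restype = wintypes.DWORD
    kernel32.GetFileAttributesExW.argtypes = [
        wintypes.LPCWSTR, ctypes.c_int,
        ctypes.POINTER(WIN32_FILE_ATTRIBUTE_DATA)]
    kernel32.GetFileAttributesExW.restype = wintypes.BOOL
    kernel32.FindFirstFileW.argtypes = [
        wintypes.LPCWSTR, ctypes.POINTER(wintypes.WIN32_FIND_DATAW)]
    kernel32.FindFirstFileW.restype = wintypes.HANDLE
    kernel32.FindClose.argtypes = [wintypes.HANDLE]
    kernel32.FindClose.restype = wintypes.BOOL

    INVALID_FILE_ATTRIBUTES = 0xFFFFFFFF
    INVALID_HANDLE_VALUE = wintypes.HANDLE(-1).value
    GetFileExInfoStandard = 0

    results = {}
    results["GetFileAttributes"] = (
        kernel32.GetFileAttributesW(name) != INVALID_FILE_ATTRIBUTES)

    info = WIN32_FILE_ATTRIBUTE_DATA()
    results["GetFileAttributesEx"] = bool(kernel32.GetFileAttributesExW(
        name, GetFileExInfoStandard, ctypes.byref(info)))

    find_data = wintypes.WIN32_FIND_DATAW()
    handle = kernel32.FindFirstFileW(name, ctypes.byref(find_data))
    found = handle is not None and handle != INVALID_HANDLE_VALUE
    if found:
        kernel32.FindClose(handle)  # release the search handle
    results["FindFirstFile"] = found
    return results
```

Calling probe_name("nul") and probe_name("*.txt") side by side should make it easier to see where the three functions disagree.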
2007/11/6, "Martin v. Löwis" <martin@v.loewis.de>:
We certainly should rely on the Windows behavior. The next question then is: What exactly *is* "the Windows behavior". Windows is not just inconsistent across versions, but apparently so even within a single version.
+1 for QOTW
IIUC, GetFileAttributes and FindFirstFile both claim that NUL exists, whereas GetFileAttributesEx claims that it doesn't exist, all in a single version, and all of it in the Windows API.
Please understand that Python 2.4 *also* adheres to Windows behavior.
So, in Py2.4 we adhered to Windows behaviour in one way, and in 2.5 we adhere to Windows behaviour in another way. As Windows is inconsistent with itself, we got a behaviour change... right? If yes, we have three paths to follow: leave 2.5 as is and say that the behaviour change is ok (Windows' fault), change 2.5 to use the same API as 2.4 and get the same behaviour, or hardwire the behaviour for this set of special files. What do you think we should do? Thanks! -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/
If yes, we have three paths to follow... leave 2.5 as is and say that the behaviour change is ok (Windows' fault), change 2.5 to use the same API as 2.4 and get the same behaviour, or hardwire the behaviour for this set of special files...
As you said before, we should avoid hardwiring things.
What do you think we should do?
I think we should try to follow the behavior of 2.4. To do that, we still have to find out what precisely the behavior of 2.4 is (and then perhaps we might decide to not follow it when we know what it is). Unfortunately, it seems that none of us is both capable and has sufficient time to research what the 2.4 behavior actually is; I'd like to emphasize that I think no changes should be made until the behavior is fully understood, which it currently isn't. So I suggest to take no action until somebody comes along who has both the time and the knowledge to resolve the issue. Regards, Martin
2007/11/6, "Martin v. Löwis" <martin@v.loewis.de>:
Unfortunately, it seems that none of us is both capable and has sufficient time to research what the 2.4 behavior actually is; I'd like to emphasize that I think no changes should be made until the behavior is fully understood, which it currently isn't.
So I suggest to take no action until somebody comes along who has both the time and the knowledge to resolve the issue.
Yes, I can try things on Windows, but I don't have a development environment there, so I'll leave the bug open. Thanks everybody! -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/
"Martin v. Löwis" <martin@v.loewis.de> wrote in message news:4730C271.8010908@v.loewis.de...
| > If yes, we have three paths to follow... leave 2.5 as is and say that
| > the behaviour change is ok (windows fault), change 2.5 to use the same
| > API as 2.4 and get the same behaviour, or hardwire the behaviour for
| > this set of special files...
|
| As you said before, we should avoid hardwiring things.
|
| > What do you think we should do?
|
| I think we should try to follow the behavior of 2.4. To do that, we
| still have to find out what precisely the behavior of 2.4 is (and
| then perhaps we might decide to not follow it when we know what it
| is).
|
| Unfortunately, it seems that none of us is both capable and has
| sufficient time to research what the 2.4 behavior actually is;
| I'd like to emphasize that I think no changes should be made until
| the behavior is fully understood, which it currently isn't.
|
| So I suggest to take no action until somebody comes along who
| has both the time and the knowledge to resolve the issue.
Perhaps a note should be added to the docs that the 'existence' of 'nul', etc, is inconsistent in Windows and hence, at present, in Python.
In part, it seems to me that anyone doing Windows-specific stuff should decide for themselves whether 'nul' exists for their purposes or not. Hence the problem should only arise when receiving a filename from an external source. Maybe the special Windows module should have an 'isdevice' function, if it does not have one already, to test for such names.
I agree that there are more generally useful things for you to do. Windows is sometimes a maddening platform to work on.
tjr
Perhaps a note should be added to the docs that the 'existence' of 'nul', etc, is inconsistent in Windows and hence, at present, in Python.
That is a statement that I want to get better confirmation on also. What is the precise condition where Windows (or perhaps just Python?) would claim that nul exists?
In part, it seems to me, that anyone doing Windows-specific stuff should decide for themselves whether 'nul' exists for their purposes or not.
No. The intention of os.path.exists clearly is that if it says that it exists, there is a high chance that you can subsequently open it, and vice versa. Of course, there are other issues, such as timing and permissions, but in general, Python applications should not have to "bypass" the standard library with platform-specific knowledge. It's somewhat unfortunate that os.path.exists() is implemented on top of stat(2), which does much more than finding out whether the file exists. Alas, it is the POSIX tradition that this "ought" to work, so this strategy is just a fact of life. I don't know what the actual use case is for testing "existence" of "nul", but I guess a "natural" problem is that the user says she wants to create a file named "nul", and the application checks whether this is a new file name, which it isn't (it exists in every directory, if I understand correctly). Regards, Martin
Martin v. Löwis schreef:
That doesn't really answer the question, though - you merely state that Python 2.4 calls the CRT, but then my question is still what kernel32 functions are called to have stat on NUL succeed.
Sure - but what does stat then do when passed NUL?
AFAIK then it doesn't fill in the size and time fields of the structure (or sets them to a useless/invalid value). (See http://msdn2.microsoft.com/en-us/library/14h5k7ff(vs.71).aspx)
GetFileAttributes() doesn't return those, just the FAT filesystem attributes. GetFileSize and GetFileTime fail.
Ok, so how does msvcrt stat() manage to fill these fields if those functions fail?
See above: if stat() (_stat() actually) is called on NUL (or another device), I don't think it does anything useful with these fields. -- The saddest aspect of life right now is that science gathers knowledge faster than society gathers wisdom. -- Isaac Asimov Roel Schroeven
Sure - but what does stat then do when passed NUL?
AFAIK then it doesn't fill in the size and time fields of the structure (or sets them to a useless/invalid value).
(See http://msdn2.microsoft.com/en-us/library/14h5k7ff(vs.71).aspx)
What specifically on that page tells me how the fields get filled for NUL? If you are referring to the "if path refers to a device..." sentence: how does it determine that NUL is a device?
See above: if stat() (_stat() actually) is called on NUL (or another device), I don't think it does anything useful with these fields.
You mean, it does nothing documented... AFAICT from the code, it always fills in something. Regards, Martin
On 06/11/2007, "Martin v. Löwis" <martin@v.loewis.de> wrote:
See above: if stat() (_stat() actually) is called on NUL (or another device), I don't think it does anything useful with these fields.
You mean, it does nothing documented... AFAICT from the code, it always fills in something.
From my reading of the CRT source code, _stat() uses FindFirstFile(). This in turn appears to return a valid result on "nul" - win32api.FindFiles, which is a thin wrapper round FindFirstFile etc, returns
win32api.FindFiles("nul") [(32, <PyTime:01/01/1601 00:00:00>, <PyTime:01/01/1601 00:00:00>, <PyTime:01/01/1601 00:00:00>, 0L, 0L, 0L, 0L, 'nul ', '')]
32 is FILE_ATTRIBUTE_ARCHIVE, the times are the epoch, and everything else is null. This is on my machine, using the Windows Server 2003 SP1 CRT source code. How consistent it is across versions, or anything else, I can't say :-( Paul.
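As background for why an all-zero timestamp prints as 01/01/1601: a Win32 FILETIME counts 100-nanosecond intervals since 1601-01-01 (UTC), split into a low and a high 32-bit word. A small sketch of the conversion follows; the helper name filetime_to_datetime is made up for this example and is not part of pywin32.

```python
from datetime import datetime, timedelta

# FILETIME counts 100-nanosecond ticks from this date (UTC).
FILETIME_EPOCH = datetime(1601, 1, 1)

def filetime_to_datetime(low, high):
    """Convert FILETIME words (dwLowDateTime, dwHighDateTime) to a naive UTC datetime."""
    ticks = (high << 32) | low
    # 10 ticks per microsecond; sub-microsecond precision is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)
```

So filetime_to_datetime(0, 0) is simply the FILETIME epoch itself, which matches the zeroed time fields reported for "nul".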
From my reading of the CRT source code, _stat() uses FindFirstFile(). This in turn appears to return a valid result on "nul" - win32api.FindFiles, which is a thin wrapper round FindFirstFile etc, returns
win32api.FindFiles("nul") [(32, <PyTime:01/01/1601 00:00:00>, <PyTime:01/01/1601 00:00:00>, <PyTime:01/01/1601 00:00:00>, 0L, 0L, 0L, 0L, 'nul ', '')]
Ok. I would still like to avoid calling FindFirstFile *first*, i.e. "normally" use GetFileAttributesEx first, and only fall back to FindFirstFile if that gives an error. Such fallback already occurs if the GetFileAttributesEx error was ERROR_SHARING_VIOLATION. So is there any good way to determine that the GetFileAttributesEx error was caused by using a "reserved" file name? It seems that the error is ERROR_INVALID_PARAMETER, but that would also be issued if you have an otherwise-invalid file name (e.g. one including wild cards), right?
This is on my machine, using the Windows Server 2003 SP1 CRT source code. How consistent it is across versions, or anything else, I can't say :-(
Thanks, that helps already. Regards, Martin
Martin v. Löwis schreef:
Sure - but what does stat then do when passed NUL? AFAIK then it doesn't fill in the size and time fields of the structure (or sets them to a useless/invalid value).
(See http://msdn2.microsoft.com/en-us/library/14h5k7ff(vs.71).aspx)
What specifically on that page tells me how the fields get filled for NUL? If you are referring to the "if path refers to a device..." sentence:
Yes, I was
how does it determine that NUL is a device?
I'm not sure. I suppose it just calls GetFileSize() etc. The docs for GetFileSize() say "You cannot use the GetFileSize function with a handle of a nonseeking device such as a pipe or a communications device. To determine the file type for hFile, use the GetFileType function." GetFileType() in turn returns one of:
FILE_TYPE_UNKNOWN: The type of the specified file is unknown.
FILE_TYPE_DISK: The specified file is a disk file.
FILE_TYPE_CHAR: The specified file is a character file, typically an LPT device or a console.
FILE_TYPE_PIPE: The specified file is either a named or anonymous pipe.
But I don't know where it is specified which names refer to devices. I tried to query all device names with QueryDosDevice() (with a very simple C program; I can give you the code though I doubt it's useful), but that returns 208 names (on my system), many more than LPT1, CON, NUL etc. It also includes stuff like, on my system, "Dritek_KB_Filter", "Sony Ericsson T610 Series Bluetooth (TM) Modem #2" etc. I've tried calling _stat() on those names and it returns -1, meaning "File not found". That behavior is clearly different from CON, NUL etc. I thought I had seen the complete list on MSDN some time ago, but I can't find it now.
See above: if stat() (_stat() actually) is called on NUL (or another device), I don't think it does anything useful with these fields.
You mean, it does nothing documented... AFAICT from the code, it always fills in something.
Yes, it returns 0xffffffff in the time fields and 0 in the size field (at least on my system, Windows XP SP1). I made another small C++ program to see what _stat() does (again, I can give you the code if you want).
With a normal file:
$ ./stat stat.cpp
gid 0
atime 1194169674 = 2007-11-04T10:47:54
ctime 1194167463 = 2007-11-04T10:11:03
dev 2
ino 0
mode 0x81b6
mtime 1194381734 = 2007-11-06T21:42:14
nlink 1
rdev 2
size 1342
uid 0
With a device:
$ ./stat NUL
gid 0
atime 4294967295 = invalid time
ctime 4294967295 = invalid time
dev 2
ino 0
mode 0x81b6
mtime 4294967295 = invalid time
nlink 1
rdev 2
size 0
uid 0
(The $ and ./ are because I ran the program from an msys shell.) (It says "invalid time" because localtime() returns NULL.)
In summary, I'm afraid all of this doesn't really help very much...
-- The saddest aspect of life right now is that science gathers knowledge faster than society gathers wisdom. -- Isaac Asimov
Roel Schroeven
Martin v. Löwis wrote:
So, the question is what we should do?:
Before this question can be answered, I think we need to fully understand what precisely is happening in 2.4, and what precisely is happening in 2.5.
Seeing this thread drag on was enough to get me to find out what changed. The implementation of "os.stat" has been changed. In 2.4.4, _stati64/_wstati64 were called directly, but in 2.5.1 a new pair of functions was written: win32_stat/win32_wstat. _stati64/_wstati64 (as others have noted) fall back on the use of FindFirstFile. win32_stat/win32_wstat use functions called Py_GetFileAttributesExA/Py_GetFileAttributesExW, which ultimately use GetFileAttributesExA/GetFileAttributesExW. The change to this implementation is r42230, with the Misc/NEWS comment saying:
- Use Win32 API to implement os.stat/fstat. As a result, subsecond timestamps are reported, the limit on path name lengths is removed, and stat reports WindowsError now (instead of OSError).
As to the specifics of what FindFirstFile* does with the values, I tested this quickly with ctypes on 'nul' (or any of the other special files):
cAlternateFileName:
cFileName: nul
dwFileAttributes: 32
dwReserved0: 0
dwReserved1: 0
ftCreationTime: (dwLowDateTime: 0, dwHighDateTime: 0)
ftLastAccessTime: (dwLowDateTime: 0, dwHighDateTime: 0)
ftLastWriteTime: (dwLowDateTime: 0, dwHighDateTime: 0)
nFileSizeHigh: 0
nFileSizeLow: 0
In order to keep the higher accuracy timestamps for normal files and to maintain the old behavior, my recommendation would be that the existing implementation of win32_stat/win32_wstat be extended to use FindFirstFile if GetFileAttributes* fails. I would be willing to do the legwork for such a patch if everyone agrees this is the appropriate solution.
* As an aside, Martin, I find the argument that "hard-wiring is bad" to be against what is actually occurring in the posixmodule. For that matter, the S_IFEXEC flag is hardwired to path in (*.bat, *.cmd, *.exe, *.com) despite the fact that the platform says it is really anything in the list of os.getenv('PATHEXT'), but I suppose that is a bug for another day.
-Scott
-- Scott Dial scott@scottdial.com scodial@cs.indiana.edu
In order to keep the higher accuracy timestamps for normal files and to maintain the old behavior, my recommendation would be that the existing implementation of win32_stat/win32_wstat be extended to use FindFirstFile if GetFileAttributes* fails. I would be willing to do the legwork for such a patch if everyone agrees this is the appropriate solution.
That, in general, might be wrong. os.stat("*.txt") should fail, even though FindFirstFile will succeed when passed that string. That was my primary motivation for not using FindFirstFile by default: it may succeed even if the string being passed does not denote a file name.
* As an aside, Martin, I find the argument that "hard-wiring is bad" to be against what is actually occurring in the posixmodule. For that matter, the S_IFEXEC flag is hardwired to path in (*.bat, *.cmd, *.exe, *.com) despite the fact that the platform says it is really anything in the list of os.getenv('PATHEXT'), but I suppose that is a bug for another day.
No. See the source of the C library - the algorithm Python uses now is (or should be) the same as the one of the C library. Of course, you may argue that then msvcrt has the same bug. Regards, Martin
Martin v. Löwis wrote:
In order to keep the higher accuracy timestamps for normal files and to maintain the old behavior, my recommendation would be that the existing implementation of win32_stat/win32_wstat be extended to use FindFirstFile if GetFileAttributes* fails. I would be willing to do the legwork for such a patch if everyone agrees this is the appropriate solution.
That, in general, might be wrong. os.stat("*.txt") should fail, even though FindFirstFile will succeed when passed that string.
Sorry, I meant to imply that we would guard FindFirstFile in the same manner that _stati64 and friends do already (using strpbrk/wcspbrk to search for "?*" in the path). At that point, you have essentially duplicated the CRT code with the added improvement of using GetFileAttributes* to retrieve the high-precision timestamps. So, I think my opinion has changed now to say: first, use GetFileAttributes*, and if that fails use _stati64/_wstati64.
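The shape of that proposal (try the precise call first, fall back only for names without wildcards) can be sketched as follows. This is an illustration of the control flow only, not a patch: stat_with_fallback, primary and fallback are hypothetical names, with primary standing in for a GetFileAttributes*-based stat and fallback for a FindFirstFile-based one.

```python
def stat_with_fallback(path, primary, fallback):
    """Try `primary`; on OSError use `fallback` unless `path` has wildcards."""
    try:
        return primary(path)
    except OSError:
        # Guard: a FindFirstFile-style fallback would happily match
        # patterns such as "*.txt", so refuse to fall back for them
        # and let the original error propagate.
        if any(ch in path for ch in "?*"):
            raise
        return fallback(path)
```

Passing the two implementations in as callables keeps the sketch independent of any particular Win32 binding.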
That was my primary motivation for not using FindFirstFile by default: it may succeed even if the string being passed does not denote a file name.
While I understand your reasoning, I thought we were letting the platform decide what are and are not files. This bug appeared because we are imposing our own notion of what a file is or is not, probably only out of ignorance of the differences between GetFileAttributes* and FindFirstFile. So, my suggestion is basically a compromise of keeping higher precision timestamps for paths where GetFileAttributes* works and retaining the old behavior for all others.
* As an aside, Martin, I find the argument that "hard-wiring is bad" to be against what is actually occurring in the posixmodule. For that matter, the S_IFEXEC flag is hardwired to path in (*.bat, *.cmd, *.exe, *.com) despite the fact that the platform says it is really anything in the list of os.getenv('PATHEXT'), but I suppose that is a bug for another day.
No. See the source of the C library - the algorithm Python uses now is (or should be) the same as the one of the C library. Of course, you may argue that then msvcrt has the same bug.
I concede and apologize; I didn't read the code for converting attributes to mode flags. I don't imagine (m)any people care about this flag on the win32 platform anyway. -Scott -- Scott Dial scott@scottdial.com scodial@cs.indiana.edu
Sorry, I meant to imply that we would guard FindFirstFile in the same manner that _stati64 and friends do already (using strpbrk/wcspbrk to search for "?*" in the path). At that point, you have essentially duplicated the CRT code with the added improvement of using GetFileAttributes* to retrieve the high-precision timestamps. So, I think my opinion has changed now to say: first, use GetFileAttributes*, and if that fails use _stati64/_wstati64.
No. We want to phase out usage of the standard C library wherever we can. Duplicating its logic is fine. Also, I don't think Python should implement its own logic of what a valid file name is - the approach of using strpbrk is flawed. IIRC, some version of the CRT (or some other C library) used GetFileAttributes to determine whether a file name is valid.
While I understand your reasoning, I thought we were letting the platform decide what are and are not files. This bug appeared because we are imposing our own notion of what a file is or is not, probably only out of ignorance of the differences between GetFileAttributes* and FindFirstFile. So, my suggestion is basically a compromise of keeping higher precision timestamps for paths where GetFileAttributes* works and retaining the old behavior for all others.
Sure, but I really dislike the string parsing that the CRT does (and I don't want to go back to using the CRT for stat-like calls). Regards, Martin
participants (11): "Martin v. Löwis", Chris Mellon, Eric Smith, Facundo Batista, Fred Drake, Mark Hammond, Paul Moore, Roel Schroeven, Scott Dial, Terry Reedy, Thomas Heller