
On Sun, Jan 4, 2015, at 16:02, Ethan Furman wrote:
> On 01/04/2015 11:24 AM, random832@fastmail.us wrote:
>> If being cross-platform isn't easy, it won't happen. You see it now with the lack of any support for "call glob on arguments on windows and not on unix" [because the shell handles it on unix] whether directly, in argparse, or in fileinput
>
> Could you elaborate on this point?

On Unix, as I assume you know, the shell is responsible for interpreting the kind of wildcard patterns that glob uses (plus shell-specific extensions), and passing a list of proper filenames as the child process's argv. On Windows, this does not happen - the child process is simply passed a single string with the whole command line. Python (or the C Runtime Library that the Python interpreter is linked against) converts this to a list of strings for individual arguments based on spaces and quotes, but does not interpret wildcard patterns in any of the arguments.

The C Runtime Library _can_ do this automatically - this is done on MSVC by linking the "setargv.obj" library routine, which replaces a standard internal routine that does not expand wildcards. But it does a poor job, because it does not know which arguments are intended to be filenames vs other strings, and there is traditionally no way to escape them [since * and ? aren't allowed in filenames, there's no reason not to allow "some directory\*.txt", all in quotes, as an argument that will be handled as a wildcard].

The appropriate place to expand them would be after you know you intend to treat a list of arguments as a list of filenames, rather than at program start - after options are parsed, for example (so an option with an argument with an asterisk in it doesn't get turned into multiple arguments), or if a list is being passed in to the fileinput module. This should also only be done on Windows, and not on other platforms (since on other platforms this is supposed to be handled by the shell rather than the child process).

Right now, none of this is done. If you pass *.txt on the command line to a Python script on Windows, the script will attempt to open a file called "*.txt".

----

Another separate but related issue is the fact that Windows wildcards do not behave in the same way as Python glob patterns. Bracketed character classes are not supported, and left bracket is a valid character in filenames, unlike ? and *.
There are some subtleties around dots ("*.*" will match all filenames, even those with no dot, and "*." matches filenames without any dot). Windows wildcards are case-insensitive (I think glob does handle this part, but not in the same way as the platform in some cases), and they can match the short-form alternate filenames [8 characters, dot, 3 characters], so "*.htm" will typically match most files ending in ".html" as well as those ending in ".htm". It might be useful to provide a way to make glob behave in the Windows-specific way (using the platform-specific functions FindFirstFileEx and RtlIsNameInExpression on Windows).
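
To make the "expand after option parsing, and only on Windows" idea concrete, here is a rough sketch of what a script has to do by hand today. The helper name expand_filenames and its force parameter are made up for illustration (force only exists so the behavior can be tried on other platforms), and nothing like this is built into argparse or fileinput. It uses Python's own glob rules, not the Windows ones, so it inherits the bracket and short-name differences described above.

```python
import argparse
import glob
import os


def expand_filenames(patterns, force=False):
    """Expand wildcard patterns that a Unix shell would already have expanded.

    Hypothetical helper: it only acts on Windows (os.name == "nt") unless
    force=True is passed.
    """
    if os.name != "nt" and not force:
        # On other platforms the shell is expected to have done this already.
        return list(patterns)

    expanded = []
    for pattern in patterns:
        matches = sorted(glob.glob(pattern))
        # Assumption: a pattern that matches nothing is kept as a literal
        # name, so a missing file still produces the usual "not found" error.
        expanded.extend(matches if matches else [pattern])
    return expanded


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--title")           # may legitimately contain "*"
    parser.add_argument("files", nargs="*")  # the only arguments to expand
    args = parser.parse_args(argv)

    # Expansion happens after parsing, so the value of --title is untouched.
    args.files = expand_filenames(args.files)
    return args
```

With this arrangement, something like `script.py --title "*" *.txt` would leave the title alone and expand only the positional filenames, and the resulting list could then be handed to fileinput.input(files=...).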