python 2.7.12 on Linux behaving differently than on Windows

Steve D'Aprano steve+python at
Wed Dec 7 09:34:07 EST 2016

On Wed, 7 Dec 2016 11:23 pm, BartC wrote:

> On 07/12/2016 05:15, Steven D'Aprano wrote:
>> On Wednesday 07 December 2016 12:55, BartC wrote:
>>> But even Linux's 128KB will fill if someone wanted a command line that
>>> listed 20,000 files individually. But it would be spectacularly bad use
>>> of a command line interface which was designed for humans.
>> That's EXACTLY the point of having the shell do globbing.
> Sorry, it's got NOTHING to do with it!
> Here's a command that works on all one million files in the current path:
>   program *.*

No, sorry, you're wrong. That specific program is buggy, and its
implementation of globbing is unreliable. I think you meant this one:

    prog *.*

except, no, that one uses regular expressions, because the author is an
opinionated PITA who dislikes globs. So you would have to write:

    prog .*?\..*

but that's okay, you remembered that, didn't you? I hope so, otherwise it
will BSOD if you call it with an invalid regex.

Perhaps you meant to say

   appl *.*

but only version 4.7 or higher, it didn't get support for globbing before
then.

(If you're going to make up imaginary programs that behave the way you
expect, I'm allowed to do the same.)

> 11 characters, so Linux still has 131061 left (assuming 'program ' is
> part of it).

I don't know why you are so hung up over the number of characters here, or
this bogeyman of "one million files" in a directory. Who does such a thing?
Most sys admins will go their entire career without ever needing to process
a million files in a single directory, or even split over a dozen
directories.

But for those who do need to do something like that, there are ways to deal
with it. Nobody (except in your imagination) says that the Unix way of
doing things is optimal for all possible scenarios. As I keep saying, and
you keep ignoring, the Unix way is the product of forty years of directed
evolution by people who have chosen this way because it works best for 90 to
95% of what they do. Of course there are exceptions.

I understand that *your* needs are different. I've acknowledged that
repeatedly. And I've repeatedly told you how, for the sake of just a few
extra keystrokes, you can get the results you want on Linux:

- use backslashes to escape individual metacharacters;
- quote the arguments;

(but beware that "x" and 'x' are different in bash, and I'm afraid I'm not
expert enough to remember what the differences are. I just always use "x"
and hope :-)

- disable globbing, using "set -o noglob" in bash.

We get it that your needs are not those of the typical Unix sys admin. Unix
command line tools are designed for Unix sys admins, not you. But they're
not idiots, despite what you apparently think, and provide a way to bail
out of the normal pre-processing when needed.
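
To make the escape hatches above concrete, here is a minimal Python sketch.
It uses the standard library's shlex.split, which only approximates POSIX
shell tokenising and does no globbing at all, so it illustrates how quotes
and backslashes are stripped before a program sees its arguments, not how
bash itself is implemented. The command name "prog" is made up.

```python
import shlex

# Three ways of writing the same literal argument on a POSIX command line.
# "prog" is a hypothetical command name used only for illustration.
command_lines = [
    r'prog \*.\*',   # backslash-escape each metacharacter
    'prog "*.*"',    # double quotes
    "prog '*.*'",    # single quotes
]

def literal_args(line):
    """Tokenise a command line roughly the way a POSIX shell would and
    return the arguments that follow the command name."""
    return shlex.split(line)[1:]

for line in command_lines:
    print(line, "->", literal_args(line))
```

In each case the program would receive the plain string *.* as its only
argument. As far as globbing goes, the two kinds of quotes behave the same;
the difference is that double quotes still allow $variable expansion and
single quotes do not.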

> In Windows, argc is 2 (name of program, and the "*.*" parameter). If the
> program wants to process all one million files, it will have to scan
> them one at a time, but it does not need to build a list first.

If you want to pass a file spec to a Unix program, all you need do is quote
the file spec:

    ls "*.*"

But it won't do you any good. I don't know any Unix programs that provide
file spec processing. (That's not to say that there are absolutely none,
only that I don't know of any.) That's simply not considered a necessary or
useful feature. But if you want to provide that feature in your own
programs, go right ahead.
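
For what it's worth, a program that wants to do its own file spec
processing could look something like the sketch below. This is only one
possible approach, assuming Python's glob module is acceptable; the function
name expand_specs and the "pass an unmatched spec through unchanged" rule
are my own choices for illustration.

```python
import glob

def expand_specs(specs):
    """Yield the paths matched by each file spec, one at a time.

    glob.iglob returns an iterator, so the caller receives matches one
    by one instead of a complete list (the library may still buffer
    individual directory listings internally).
    """
    for spec in specs:
        matched = False
        for path in glob.iglob(spec):
            matched = True
            yield path
        if not matched:
            # Assumption: an unmatched spec is passed through unchanged,
            # which mirrors bash's default behaviour.
            yield spec
```

A program invoked as prog "*.*" could then loop over
expand_specs(sys.argv[1:]) and handle each path as it arrives.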

But... how does your program distinguish between the file spec *.* which
should be expanded, and the literal file *.* which shouldn't be expanded?

You need an escape mechanism too. You've done that, of course. Right?
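
One way such an escape mechanism could work is sketched below, assuming
Python 3.4 or later, where glob.escape is available. It wraps each glob
metacharacter in brackets so that it matches only itself. The file name
notes.txt is just an example.

```python
import fnmatch
import glob

spec = "*.*"                    # a file spec that should be expanded
literal = glob.escape("*.*")    # the same text, marked as a literal name

def matches(name, pattern):
    """Case-sensitive glob-style match of a single file name."""
    return fnmatch.fnmatchcase(name, pattern)

print(literal)                          # [*].[*]
print(matches("notes.txt", spec))       # True
print(matches("notes.txt", literal))    # False
print(matches("*.*", literal))          # True
```

The user (or the program's own option parsing) still has to say which of
the two is meant; the escaping only makes the distinction expressible.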

>>     command fooba?.jpg {here,there}/*.jpg  another/place/*.{png,jpg}
>>     [a-z]/IMG*
>    list fooba?.jpg {here,there}/*.jpg  another/place/*.{png,jpg} \
>      [a-z]/IMG* > files
>    command @files

*Requiring* this would be completely unacceptable. Forcing the user to write
two lines (one to create a variable, another to use it) every time they
needed a glob expansion would go down among sys admins like a lead balloon.
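
As an optional extra, though, the "command @files" convention is easy
enough to support. Here is a minimal sketch for a Python program, using
argparse's fromfile_prefix_chars parameter. The program name "command"
comes from the example above; the argument name "paths" is my own.

```python
import argparse

def build_parser():
    # fromfile_prefix_chars="@" makes argparse replace an argument such
    # as "@files" with the arguments read from the file "files", one
    # argument per line.
    parser = argparse.ArgumentParser(prog="command",
                                     fromfile_prefix_chars="@")
    parser.add_argument("paths", nargs="*", help="files to process")
    return parser
```

With this, build_parser().parse_args(["@files"]).paths would hold one entry
per line of the file "files", and ordinary arguments keep working as before.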

But for those times when you really need to repeat a complex set of
arguments, bash has you covered:

[steve@ando ~]$ export spec="a* z*"
[steve@ando ~]$ echo "$spec"
a* z*
[steve@ando ~]$ ls $spec
aaa-bbb.odt  addresses        ararat        arpaio.txt      aß         zip

“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.
