[Python-ideas] BetterWalk, a better and faster os.walk() for Python

Ben Hoyt benhoyt at gmail.com
Mon Nov 26 09:01:00 CET 2012


> I'm suspicious of your use of Windows' built-in pattern matching. There are
> a number of quirks to it you haven't accounted for... for example: it
> matches short filenames, the behavior you noted of "?" at the end of
> patterns also applies to the end of the 'filename portion' (e.g. foo?.txt
> can match foo.txt), and the behavior of patterns ending in ".*" or "." isn't
> like fnmatch.

Oh, you're right. What a pain. The FindFirstFile docs are terrible in
this regard, and simply say "the file name, which can include wildcard
characters, for example, an asterisk (*) or a question mark (?)."
Microsoft documents * and ? at [1], but it's very incomplete and
doesn't mention those quirks. Any idea where there's thorough
documentation of it?

Oh, looks like someone's had a go here:
http://digital.ni.com/public.nsf/allkb/0DBE16907A17717B86256F7800169797

And this article by Raymond Chen looks related and interesting:
http://blogs.msdn.com/b/oldnewthing/archive/2007/12/17/6785519.aspx

Still, I think "pattern" is useful enough to get right (either that,
or drop it). It should be fairly straightforward to find the patterns
that don't work and use wildcard='*' with fnmatch in those cases
instead.
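
A minimal sketch of that fallback, not BetterWalk's actual code. The
helper names (os_wildcard_for, filter_names) are hypothetical, and the
list of "risky" patterns is an assumption based only on the quirks
mentioned in this thread plus the fact that FindFirstFile has no
[seq] syntax. Since short-filename matching can produce false
positives for any pattern, the sketch runs fnmatch over the results
in every case and only uses the OS wildcard as a pre-filter.

```python
import fnmatch


def os_wildcard_for(pattern):
    """Return the wildcard to pass to the OS for an fnmatch-style pattern.

    Hypothetical helper: falls back to '*' whenever the OS might skip a
    name that fnmatch would accept (assumed list, not exhaustive).
    """
    risky = (
        '?' in pattern               # '?' can match zero characters in places
        or '[' in pattern            # [seq] is fnmatch-only syntax
        or pattern.endswith('.')     # trailing '.' is special-cased
        or pattern.endswith('.*')    # trailing '.*' is special-cased
    )
    return '*' if risky else pattern


def filter_names(names, pattern):
    """Apply fnmatch to names the OS returned, dropping false positives."""
    return [name for name in names if fnmatch.fnmatch(name, pattern)]
```

With this, 'foo?.txt' would be sent to the OS as '*', and the fnmatch
pass would then drop foo.txt while keeping foo1.txt.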

> your find_data_to_stat function ignores the symlink flag

Yes, you're right. I haven't tested symlink handling in my code so
far. I intend to once I've got the speed issues ironed out though.
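
For reference, a rough sketch of what honouring the symlink flag could
look like. This is not the current find_data_to_stat; the function
name attributes_to_mode and the permission bits it picks are
assumptions for illustration. The constants are the documented Win32
values, and it relies on WIN32_FIND_DATA putting the reparse tag in
dwReserved0 when FILE_ATTRIBUTE_REPARSE_POINT is set.

```python
import stat

# Documented Win32 constants.
FILE_ATTRIBUTE_READONLY = 0x0001
FILE_ATTRIBUTE_DIRECTORY = 0x0010
FILE_ATTRIBUTE_REPARSE_POINT = 0x0400
IO_REPARSE_TAG_SYMLINK = 0xA000000C


def attributes_to_mode(attributes, reparse_tag=0):
    """Build an st_mode value from dwFileAttributes and dwReserved0.

    Hypothetical helper; the permission bits are an illustrative choice.
    """
    is_symlink = (attributes & FILE_ATTRIBUTE_REPARSE_POINT
                  and reparse_tag == IO_REPARSE_TAG_SYMLINK)
    if is_symlink:
        # Check the symlink case first: a directory symlink also has
        # FILE_ATTRIBUTE_DIRECTORY set.
        mode = stat.S_IFLNK
    elif attributes & FILE_ATTRIBUTE_DIRECTORY:
        mode = stat.S_IFDIR | 0o111
    else:
        mode = stat.S_IFREG
    if attributes & FILE_ATTRIBUTE_READONLY:
        mode |= 0o444
    else:
        mode |= 0o666
    return mode
```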

-Ben

[1] http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/find_c_search_wildcard.mspx?mfr=true


