* and ? in fnmatch

Hi all --

I have recently been playing with the fnmatch module, and learned that * and ? as considered by 'fnmatch.translate()' match *all* characters, including slashes, colons, backslashes -- in short, whatever happens to be "special" characters for pathnames on the current platform. In other words, "foo?bar.py" matches both "foo_bar.py" and "foo/bar.py".

This is not the way any Unix shells that I know of work, nor is it how the wildcard-expanding MS-DOS system calls that I dimly remember from a decade or so back worked. I dunno how wildcard expansion is done under Windows nowadays, but I wouldn't expect * and ? to match colons or backslashes there any more than I expect them to match slash under a Unix shell.

So is this a bug or a feature?

Seems to me that a good fix would be to extend 'fnmatch.translate()' to have some (maybe all?) of the flags that the standard Unix library 'fnmatch()' supports. The flag in question here is FNM_PATHNAME, which is described in the Solaris manual as

    FNM_PATHNAME
        If set, a slash (/) character in string will be explicitly
        matched by a slash in pattern; it will not be matched by
        either the asterisk (*) or question-mark (?) special
        characters, nor by a bracket ([]) expression. If not set,
        the slash character is treated as an ordinary character.

and in the GNU/Linux manual as

    FNM_PATHNAME
        If this flag is set, match a slash in string only with a
        slash in pattern and not, for example, with a [] - sequence
        containing a slash.

To adapt this to Python's 'fnmatch.translate()', I think "slash" would have to be generalized to "special character", which is platform dependent:

    Unix          /
    DOS/Windows   : \  (and maybe / too?)
    Mac           :

I propose changing the signature of 'fnmatch.translate()' from

    def translate(pat)

to at least

    def translate(pat, pathname=0)

and possibly to

    def translate(pat, pathname=0, noescape=0, period=0, leading_dir=0, casefold=0)

which follows the lead of GNU 'fnmatch()'. (Solaris 'fnmatch()' only supports the PATHNAME, NOESCAPE, and PERIOD flags; the GNU man page says LEADING_DIR and CASEFOLD are GNU extensions. I like GNU extensions.)

Similar optional parameters would be added to 'fnmatch()' and 'fnmatchcase()', possibly dropping the 'casefold' argument since it's covered by which function you're calling.

I have yet to fully grok the meaning of those other four flags, though, so I'm not sure how easy it would be to hack them into 'fnmatch.translate()'.

Opinions?

        Greg
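
To make the behaviour under discussion concrete, here is a minimal sketch. The first two calls use the real 'fnmatch.fnmatchcase()' and show a wildcard matching a slash. The function 'fnmatch_pathname()' and its 'sep' parameter are hypothetical names, not part of the fnmatch module; the function only approximates FNM_PATHNAME by matching one path segment at a time, and it makes no attempt at the other flags or at platform-specific separators.

```python
import fnmatch

# Existing behaviour: '?' and '*' match any character, '/' included.
print(fnmatch.fnmatchcase("foo_bar.py", "foo?bar.py"))  # True
print(fnmatch.fnmatchcase("foo/bar.py", "foo?bar.py"))  # True


def fnmatch_pathname(name, pat, sep="/"):
    """Hypothetical helper (not in the stdlib): approximate FNM_PATHNAME.

    Pattern and name are split on `sep` and compared segment by segment,
    so a wildcard can never consume a separator.
    """
    name_parts = name.split(sep)
    pat_parts = pat.split(sep)
    if len(name_parts) != len(pat_parts):
        return False
    return all(
        fnmatch.fnmatchcase(n, p) for n, p in zip(name_parts, pat_parts)
    )


print(fnmatch_pathname("foo_bar.py", "foo?bar.py"))  # True
print(fnmatch_pathname("foo/bar.py", "foo?bar.py"))  # False
print(fnmatch_pathname("foo/bar.py", "foo/*.py"))    # True
```

Splitting first keeps the sketch independent of the regular expression that 'fnmatch.translate()' generates, which may differ between Python versions.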

It's a feature. As I recall, I carefully implemented the standard fnmatch() as it existed 10 years ago. Use the glob module for matching Unix pathname syntax -- or use os.path.split().

I'm not wildly fond of the GNU way of adding 10 more options to each function... KISS.

(And yes, I'm in a bad mood today. :-( )

--Guido van Rossum (home page: http://www.python.org/~guido/)
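
One possible reading of the os.path.split() suggestion, as a rough sketch: split off the final path component and apply the pattern to that alone. The helper name 'basename_matches()' and the sample paths are made up for illustration.

```python
import fnmatch
import os.path


def basename_matches(path, pat):
    """Hypothetical helper: apply the pattern to the last path component only."""
    _head, tail = os.path.split(path)
    return fnmatch.fnmatchcase(tail, pat)


paths = ["src/foo_bar.py", "src/foo/bar.py", "README"]
matches = [p for p in paths if basename_matches(p, "foo?bar.py")]
print(matches)  # ['src/foo_bar.py']
```

Because the directory part is stripped before matching, the wildcard never sees a separator, and no new flag is needed.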

this is documented behaviour:

    Note that the filename separator ('/' on Unix) is not special to
    this module. See module glob for pathname expansion (glob uses
    fnmatch() to match filename segments).

sure looks like feature creep to me. is anyone actually using this module directly?

</F>

participants (3)
- Fredrik Lundh
- Greg Ward
- Guido van Rossum