[Python-ideas] Adding '**' recursive search to glob.glob
Steven D'Aprano
steve at pearwood.info
Mon Jan 14 17:24:20 CET 2013
On 15/01/13 02:46, Vinay Sajip wrote:
> Paul Moore<p.f.moore at ...> writes:
>
>> I'd like it if the glob module supported the (relatively common)
>> facility to use ** to mean recursively search subdirectories.
+1
>> One obvious downside is that if used carelessly, it can make globbing
>> pretty slow. So I'd propose that it be added as an optional extension
>> enabled using a flag argument (glob(pat, allow_recursive=True)) which
>> is false by default. That would also mean that backward compatibility
>> should not be an issue.
>
> Isn't the requirement to recurse implied by the presence of '**' in the
> pattern? What's to be gained by specifying it using allow_recursive as well?
Not necessarily. At the moment, a glob like "/**/spam" is equivalent to
"/*/spam":
[steve at ando /]$ touch /tmp/spam
[steve at ando /]$ mkdir /tmp/ham
[steve at ando /]$ touch /tmp/ham/spam
[steve at ando /]$ python3.3 -c "import glob; print(glob.glob('/**/spam'))"
['/tmp/spam']
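The same check can be repeated without touching /tmp. This is only a sketch: the throwaway tree and the names root, double_star and single_star are made up for illustration. It relies on nothing beyond glob.glob being called with its default arguments.

```python
import glob
import os
import tempfile

# Build a throwaway tree: root/spam, root/ham/spam, root/ham/eggs/spam
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "ham", "eggs"))
for parts in (("spam",), ("ham", "spam"), ("ham", "eggs", "spam")):
    open(os.path.join(root, *parts), "w").close()

# Called with default arguments, '**' is treated like a single '*':
# it stands for exactly one directory level.
double_star = sorted(glob.glob(os.path.join(root, "**", "spam")))
single_star = sorted(glob.glob(os.path.join(root, "*", "spam")))

print(double_star == single_star)
print([os.path.relpath(p, root) for p in double_star])
```

Only the copy one level down (ham/spam) is matched; neither the copy directly under root nor the one two levels down is.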
With the suggested new functionality, the meaning of the glob will change.
From a backwards-compatibility point of view, one might not want to enable
the new semantics by default. But, from a *future*-compatibility point of
view, I don't know that it is a good idea to require a flag every time a
new globbing feature is added.
glob.glob(pattern, allow_recurse=True, allow_spam=True, allow_ham=True, allow_eggs=True, ...)
Rather than a flag, I suggest a version number:
glob.glob(pattern, version=1) # current behaviour, as of 3.3
glob.glob(pattern, version=2) # adds ** recursion in Python 3.4
Then in Python 3.5 or 3.6 support for version 1 globs could be dropped.
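To make the idea concrete, here is a rough sketch of how such a version argument might dispatch. Everything in it is an assumption made for illustration: versioned_glob is a hypothetical wrapper, not an existing API, and the version 2 branch only understands the simple form head/**/name (a literal head directory and a single trailing component), walking the tree with os.walk and fnmatch rather than anything glob itself provides.

```python
import fnmatch
import glob
import os
import tempfile


def versioned_glob(pattern, version=1):
    """Hypothetical wrapper: 'version' selects the glob dialect."""
    if version == 1:
        # Version 1: whatever glob.glob does by default.
        return glob.glob(pattern)
    if version != 2:
        raise ValueError("unknown glob version: %r" % (version,))
    # Version 2 (illustrative only): handle 'head/**/tail', where '**'
    # stands for zero or more directory levels below head.
    head, sep, tail = pattern.partition(os.sep + "**" + os.sep)
    if not sep:
        return glob.glob(pattern)
    matches = []
    for dirpath, dirnames, filenames in os.walk(head or os.sep):
        for name in dirnames + filenames:
            if fnmatch.fnmatch(name, tail):
                matches.append(os.path.join(dirpath, name))
    return matches


# Throwaway tree: root/spam, root/ham/spam, root/ham/eggs/spam
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "ham", "eggs"))
for parts in (("spam",), ("ham", "spam"), ("ham", "eggs", "spam")):
    open(os.path.join(root, *parts), "w").close()

pattern = os.path.join(root, "**", "spam")
v1 = sorted(versioned_glob(pattern, version=1))
v2 = sorted(versioned_glob(pattern, version=2))
print(len(v1), len(v2))
```

The point of the design is that the caller states which dialect the pattern was written for, so each new feature does not need its own keyword.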
> Will having allow_recursive=True have any effect if '**' is not in the
> pattern?
I would expect that it will not have any effect unless ** is present.
After all, it simply allows ** to recurse, and no other glob
metacharacter can recurse.
> If you specify a pattern with '**' and allow_recursive=False, does
> that mean that '**' effectively acts as '*' would (i.e. one directory level
> only)?
I expect that without allow_recursive=True, ** would behave identically to
a single *.
--
Steven