[Python-ideas] find-like functionality in pathlib

Guido van Rossum guido at python.org
Tue Jan 5 00:25:29 EST 2016


Following up on this: in theory the right way to walk a tree using pathlib
already exists. It's the rglob() method. E.g. all paths under /foo/bar
should be found as follows:

  for path in pathlib.Path('/foo/bar').rglob('*'):
      print(path)

The PermissionError bug you found is already reported:
http://bugs.python.org/issue24120 -- it even has a patch but it's stuck in
review.
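
Until that fix lands, one possible workaround is a small generator that
lists each directory itself and skips the ones it cannot read. The sketch
below is only an illustration: iter_paths is a hypothetical helper name,
not part of pathlib, and it assumes that silently ignoring unreadable
directories (roughly what os.walk() does by default) is acceptable.

```python
import pathlib


def iter_paths(root):
    """Yield every path below root, skipping directories that cannot be listed."""
    try:
        # sorted() consumes the iterdir() generator here, so a PermissionError
        # raised while listing the directory is caught by this try block.
        children = sorted(pathlib.Path(root).iterdir())
    except PermissionError:
        return  # ignore the unreadable directory and move on
    for child in children:
        yield child
        try:
            # Do not descend through symlinks; they can point back up the tree.
            descend = child.is_dir() and not child.is_symlink()
        except PermissionError:
            descend = False
        if descend:
            yield from iter_paths(child)
```

It would be used like the rglob() loop above, e.g.
"for path in iter_paths('/foo/bar'): print(path)".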

Sadly there's another error: loops introduced by symlinks cause infinite
recursion. I filed that here: http://bugs.python.org/issue26012. (The fix
should be judicious use of is_symlink(), but the code is a little
convoluted.)
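
To illustrate the is_symlink() idea outside of pathlib's internals, here
is a rough sketch of a tree walk that reports symlinks but never descends
through them, so a link pointing back up the tree cannot cause unbounded
recursion. walk_no_follow is a hypothetical name chosen for this example,
this is not the patch for the issue above, and error handling (such as
PermissionError) is left out to keep it short.

```python
import pathlib


def walk_no_follow(root):
    """Yield all paths below root without descending into symlinked directories."""
    stack = [pathlib.Path(root)]
    while stack:
        current = stack.pop()
        for child in current.iterdir():
            yield child
            if child.is_symlink():
                # Report the link itself, but never walk through it.
                continue
            if child.is_dir():
                stack.append(child)
```

The explicit stack also avoids hitting Python's recursion limit on very
deep trees; the order in which directories are visited is not specified.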

On Mon, Dec 28, 2015 at 11:25 AM, Chris Barker <chris.barker at noaa.gov>
wrote:

> On Tue, Dec 22, 2015 at 4:23 PM, Guido van Rossum <guido at python.org>
> wrote:
>
>> The two-level iteration forced upon you by os.walk() is indeed often
>> unnecessary -- but handling dirs and files separately usually makes sense,
>>
>
> indeed, but not always, so a simple API that allows you to get a flat walk
> would be nice....
>
>>> Of course for that basic use case, you could just write your own wrapper
>>> around os.walk:
>>>
>>
> sure, but having to write "little" wrappers for common needs is
> unfortunate...
>
>>> The problem isn't designing a nice walk API; it's integrating it with
>>> pathlib.*
>>
>>
> indeed -- I'd really like to see a *walk in pathlib itself. I've been
> trying to use pathlib whenever I need, well, a path, but then I find I
> almost immediately need to step out and use an os.path function, and have
> to string-fy it anyway -- makes me wonder what the point is..
>
>>> And honestly, if open, os.walk, etc. aren't going to work with Path
>>> objects,
>>
>>
> but they should -- of course they should.....
>
>> Truly pushing for adoption of a new abstraction like this takes many years
>> -- pathlib was new (and provisional) in 3.4 so it really hasn't been long
>> enough to give up on it. The OP hasn't!
>>
>
> it will take many years for sure -- but the standard library could at least
> adopt it as much as possible.
>
> Path.walk would be a nice start :-)
>
> My example: one of our sysadmins wanted a little script to go through an
> entire drive (Windows), and check if any paths were longer than 256
> characters (Windows, remember..)
>
> I came up with this:
>
> import os
>
> def get_all_paths(start_dir='/'):
>     # flatten os.walk() into a single stream of file paths
>     for dirpath, dirnames, filenames in os.walk(start_dir):
>         for filename in filenames:
>             yield os.path.join(dirpath, filename)
>
> too_long = []
> for p in get_all_paths('/'):
>     print("checking:", p)
>     if len(p) > 255:
>         too_long.append(p)
>         print("Path too long!")
>
> way too wordy!
>
> I started with pathlib, but that just made it worse.
>
> now that I think about it, maybe I could have simply used
> pathlib.Path.rglob....
>
> However, when I try that, I get a permission error:
>
> /Users/chris.barker/miniconda2/envs/py3/lib/python3.5/pathlib.py in
> wrapped(pathobj, *args)
>
>     369         @functools.wraps(strfunc)
>     370         def wrapped(pathobj, *args):
> --> 371             return strfunc(str(pathobj), *args)
>     372         return staticmethod(wrapped)
>     373
>
> PermissionError: [Errno 13] Permission denied:
> '/Users/.chris.barker.xahome/caches/opendirectory'
>
> as the error comes from inside the rglob() generator, I'm not sure how to tell
> it to ignore and move on....
>
> os.walk is somehow able to deal with this.
>
> -CHB
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov
>



-- 
--Guido van Rossum (python.org/~guido)