Following up on this: in theory, the right way to walk a tree using pathlib already exists; it's the rglob() method. E.g., all paths under /foo/bar should be found as follows:
    for path in pathlib.Path('/foo/bar').rglob('**/*'):
        print(path)
The PermissionError bug you found is already reported: http://bugs.python.org/issue24120 -- it even has a patch but it's stuck in review.
Sadly there's another error: loops introduced by symlinks cause infinite recursion. I filed that here: http://bugs.python.org/issue26012. (The fix should be judicious use of is_symlink(), but the code is a little convoluted.)
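To illustrate the is_symlink() idea, here is a minimal sketch of a walk that reports symlinks but never descends into them. This is not the patch for the issue above and not how rglob() is implemented; the helper name walk_no_symlink_dirs is made up for this example, and it makes no attempt to handle PermissionError.

```python
import pathlib


def walk_no_symlink_dirs(root):
    """Yield every path under root without entering symlinked directories.

    Hypothetical helper for illustration only; not part of pathlib.
    """
    stack = [pathlib.Path(root)]
    while stack:
        current = stack.pop()
        for child in current.iterdir():
            yield child
            # A symlink to a directory is yielded but never entered, so a
            # link that points back up the tree cannot cause a loop.
            if child.is_dir() and not child.is_symlink():
                stack.append(child)
```

The trade-off in this sketch is that legitimate symlinked directories are skipped too; tracking already-visited real paths would be another way to break loops.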
On Mon, Dec 28, 2015 at 11:25 AM, Chris Barker email@example.com wrote:
On Tue, Dec 22, 2015 at 4:23 PM, Guido van Rossum firstname.lastname@example.org wrote:
The two-level iteration forced upon you by os.walk() is indeed often unnecessary -- but handling dirs and files separately usually makes sense,
indeed, but not always, so a simple API that allows you to get a flat walk would be nice....
Of course for that basic use case, you could just write your own wrapper
sure, but having to write "little" wrappers for common needs is unfortunate...
The problem isn't designing a nice walk API; it's integrating it with
indeed -- I'd really like to see a *walk in pathlib itself. I've been trying to use pathlib whenever I need, well, a path, but then I find I almost immediately need to step out and use an os.path function, and have to string-fy it anyway -- makes me wonder what the point is..
And honestly, if open, os.walk, etc. aren't going to work with Path
but they should -- of course they should.....
Truly pushing for adoption of a new abstraction like this takes many years
-- pathlib was new (and provisional) in 3.4 so it really hasn't been long enough to give up on it. The OP hasn't!
it will take many years for sure -- but the standard library could at least adopt it as much as possible.
Path.walk would be a nice start :-)
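For what it's worth, a flat walk that hands back Path objects can be sketched as a thin wrapper around os.walk(). The name walk_paths is hypothetical, this is not a pathlib API, and it yields files only (directories are skipped):

```python
import os
import pathlib


def walk_paths(top):
    """Yield a pathlib.Path for every file under top, as one flat stream.

    Hypothetical wrapper for illustration; os.walk() does the real work.
    """
    for dirpath, dirnames, filenames in os.walk(top):
        base = pathlib.Path(dirpath)
        for name in filenames:
            yield base / name
```

With that in place, the caller gets a single loop, e.g. `for p in walk_paths('/foo/bar'): print(p)`.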
My example: one of our sysadmins wanted a little script to go through an entire drive (Windows), and check if any paths were longer than 256 characters (Windows, remember..)
I came up with this:
    import os

    def get_all_paths(start_dir='/'):
        for dirpath, dirnames, filenames in os.walk(start_dir):
            for filename in filenames:
                yield os.path.join(dirpath, filename)

    too_long = []
    for p in get_all_paths('/'):
        print("checking:", p)
        if len(p) > 255:
            too_long.append(p)
            print("Path too long!")
way too wordy!
I started with pathlib, but that just made it worse.
now that I think about it, maybe I could have simply used pathlib.Path.rglob....
However, when I try that, I get a permission error:
/Users/chris.barker/miniconda2/envs/py3/lib/python3.5/pathlib.py in wrapped(pathobj, *args)
    369     @functools.wraps(strfunc)
    370     def wrapped(pathobj, *args):
--> 371         return strfunc(str(pathobj), *args)
    372     return staticmethod(wrapped)
    373
PermissionError: [Errno 13] Permission denied: '/Users/.chris.barker.xahome/caches/opendirectory'
as the error comes from inside the rglob() generator, I'm not sure how to tell it to ignore and move on....
os.walk is somehow able to deal with this.
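As far as the os.walk() documentation goes, the reason is that errors raised while listing a directory are ignored by default, and the optional onerror argument is called with the OSError instance if you want to see them. A minimal sketch, where the function name walk_collecting_errors is made up for this example:

```python
import os


def walk_collecting_errors(top):
    """Walk top with os.walk() and return (file_paths, errors).

    Hypothetical helper for illustration only.
    """
    errors = []
    files = []
    # os.walk() passes the OSError raised while listing a directory to
    # onerror, then carries on with the rest of the tree.
    for dirpath, dirnames, filenames in os.walk(top, onerror=errors.append):
        for name in filenames:
            files.append(os.path.join(dirpath, name))
    return files, errors
```

An unreadable directory then shows up as an entry in errors instead of aborting the whole walk.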
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception