[Python-Dev] Updates to PEP 471, the os.scandir() proposal

Ben Hoyt benhoyt at gmail.com
Wed Jul 9 15:22:41 CEST 2014

> Option 2:
> def log_err(exc):
>     logger.warn("Cannot stat {}".format(exc.filename))
> def get_tree_size(path):
>     total = 0
>     for entry in os.scandir(path, info='lstat', onerror=log_err):
>         if entry.is_dir:
>             total += get_tree_size(entry.full_name)
>         else:
>             total += entry.lstat.st_size
>     return total
> On this basis, #2 wins.

That's a pretty nice comparison, and you're right, onerror handling is
nicer here.

> However, I'm slightly uncomfortable using the
> filename attribute of the exception in the logging, as there is
> nothing in the docs saying that this will give a full pathname. I'd
> hate to see "Unable to stat __init__.py"!!!

Huh, you're right. I think this should be documented in os.walk() too.
I think it should be the full filename (is it currently?).

> So maybe the onerror function should also receive the DirEntry object
> - which will only have the name and full_name attributes, but that's
> all that is needed.

That's an interesting idea -- though enough of a deviation from
os.walk()'s onerror that I'm uncomfortable with it -- I'd rather just
document that the onerror exception .filename is the full path name.
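
As a quick check of what os.walk() does with onerror, here is a minimal
sketch (the helper name collect_walk_errors is made up for illustration).
It walks a path and records the .filename of each exception passed to
onerror. As far as I can tell, on CPython that value is the path os.walk()
tried to list, built by joining onto the top argument, so it is only an
absolute path if top was one:

```python
import os

def collect_walk_errors(top):
    """Walk `top` and return the .filename of every error passed to onerror."""
    failed = []

    def on_error(exc):
        # os.walk() passes the OSError raised while listing a directory.
        failed.append(exc.filename)

    for _dirpath, _dirnames, _filenames in os.walk(top, onerror=on_error):
        pass
    return failed
```

Note this only covers directory-listing failures, which is the only place
os.walk() calls onerror; it says nothing about per-file stat errors.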

One issue with option #2 that I just realized -- does scandir yield
the entry at all if there's a stat error? It can't really, because the
caller will expect the .lstat attribute to be set (assuming he asked
for info='lstat') but it won't be. Is effectively removing these
entries just because the stat failed a problem? I kind of think it is.
If so, is there a way to solve it with option #2?
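
One possible answer, sketched with plain os.listdir() and os.lstat() rather
than the proposed scandir() signature: yield every entry, and hand back None
for the stat result when the lstat call fails, so the caller decides what to
do. The name iter_entries_with_lstat and the tuple layout are hypothetical,
not part of the PEP:

```python
import os

def iter_entries_with_lstat(path):
    """Yield (name, full_path, lstat_result_or_None) for each entry in path."""
    for name in os.listdir(path):
        full_path = os.path.join(path, name)
        try:
            st = os.lstat(full_path)
        except OSError:
            # Keep the entry; the caller sees None instead of losing the name.
            st = None
        yield name, full_path, st
```

The cost is that callers have to check for None before touching st_size and
friends, which is exactly the kind of burden option #2 was trying to avoid.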

> OK, looks like option #2 is now my preferred option. My gut instinct
> still rebels over an API that deliberately throws information away in
> the default case, even though there is now an option to ask it to keep
> that information, but I see the logic and can learn to live with it.

In terms of throwing away info "in the default case" -- it's simply a
case of getting what you ask for. :-) Worst case, you'll write your
code and test it, it'll fail hard on any system, you'll fix it
immediately, and then it'll work on any system.

