[Python-Dev] Updates to PEP 471, the os.scandir() proposal

Ethan Furman ethan at stoneleaf.us
Wed Jul 9 18:35:04 CEST 2014

On 07/09/2014 08:35 AM, Ben Hoyt wrote:
>>> One issue with option #2 that I just realized -- does scandir yield the
>>> entry at all if there's a stat error? It can't really, because the caller
>>> will expect the .lstat attribute to be set (assuming he asked for
>>> type='lstat') but it won't be. Is effectively removing these entries just
>>> because the stat failed a problem? I kind of think it is. If so, is there
>>> a way to solve it with option #2?
>> Leave it up to the onerror handler.  If it returns None, skip yielding the
>> entry, otherwise yield whatever it returned -- which also means the error
>> handler should be able to set fields on the DirEntry:
>>    def log_err(exc, entry):
>>        logger.warn("Cannot stat {}".format(exc.filename))
>>        entry.lstat.st_size = 0
>>        return True
> This is an interesting idea, but it's just getting more and more
> complex, and I'm guessing that being able to change the attributes of
> DirEntry will make the C implementation more complex.
> Also, I'm not sure it's very workable. For log_err above, you'd
> actually have to do something like this, right?
> def log_err(exc, entry):
>      logger.warn("Cannot stat {}".format(exc.filename))
>      entry.lstat = os.stat_result((0, 0, 0, 0, 0, 0, 0, 0, 0, 0))
>      return entry

I would imagine we would provide a helper function:

   def stat_result(st_size=0, st_atime=0, st_mtime=0, ...):
       return os.stat_result((st_size, st_atime, st_mtime, ...))

and then in the onerror handler:

       entry.lstat = stat_result()
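
For illustration, a minimal sketch of what such a helper might look like.
The keyword-argument interface and the _STAT_FIELDS name are assumptions made
for this example, not part of the proposal; the field order is the order of
the 10-item tuple that os.stat_result accepts.

```python
import os

# Order of the 10 required items in the tuple accepted by os.stat_result.
_STAT_FIELDS = (
    "st_mode", "st_ino", "st_dev", "st_nlink", "st_uid",
    "st_gid", "st_size", "st_atime", "st_mtime", "st_ctime",
)


def stat_result(**fields):
    """Build an os.stat_result, defaulting every unspecified field to 0.

    Hypothetical helper: only the 10 basic fields are supported.
    """
    unknown = set(fields) - set(_STAT_FIELDS)
    if unknown:
        raise TypeError("unknown stat field(s): " + ", ".join(sorted(unknown)))
    # Place each value at its required position in the tuple.
    return os.stat_result(tuple(fields.get(name, 0) for name in _STAT_FIELDS))
```

With something like this, an error handler could write
entry.lstat = stat_result() or stat_result(st_size=0) without having to
remember the positional layout.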

> Unless there's another simple way around this issue, I'm back to
> loving the simplicity of option #1, which avoids this whole question.

Too simple is just as bad as too complex, and properly handling errors is rarely a simple task.  Either we provide a 
clean way to deal with errors in the API, or we force every user everywhere to come up with their own system.

Also, providing it doesn't force anyone to use it, but if we don't provide it then nobody can use it.

To summarize the choice I think we are looking at:

   1) We provide a very basic tool that many will have to write wrappers
      around to get the desired behavior (choice 1)

   2) We provide a more advanced tool that, in many cases, can be used
      as-is, and is also fairly easy to extend to handle odd situations
      (choice 2)

More specifically, if we go with choice 1 (no built-in error handling, no mutable DirEntry), how would I implement 
choice 2?  Would I have to write my own CustomDirEntry object?
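
To make the question concrete, here is a rough sketch of the kind of wrapper
a user might end up writing. It assumes the os.scandir() API as it later
shipped in Python 3.5+ (read-only entries whose stat() method can raise
OSError; the context-manager form needs 3.6+). CustomDirEntry and
scandir_with_onerror are hypothetical names invented for this example, and
the handler semantics follow the "return None to skip" idea quoted above.

```python
import os


class CustomDirEntry:
    """Hypothetical mutable stand-in for the read-only DirEntry."""

    def __init__(self, entry):
        self.name = entry.name
        self.path = entry.path
        self.lstat = None  # filled in by the wrapper or by the error handler


def scandir_with_onerror(path, onerror=None):
    """Yield CustomDirEntry objects with .lstat populated.

    If the stat call fails, onerror(exc, entry) is called; returning None
    skips the entry, anything else is yielded in its place.  Without a
    handler the OSError propagates.
    """
    with os.scandir(path) as it:
        for entry in it:
            wrapped = CustomDirEntry(entry)
            try:
                wrapped.lstat = entry.stat(follow_symlinks=False)
            except OSError as exc:
                if onerror is None:
                    raise
                wrapped = onerror(exc, wrapped)
                if wrapped is None:
                    continue
            yield wrapped
```

Every caller who wants this behavior would need some variation of the above,
which is the duplication that building it into the API would avoid.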

