[Python-Dev] PEP 471 "scandir" accepted

Ben Hoyt benhoyt at gmail.com
Tue Jul 22 04:27:09 CEST 2014


> I privately asked Guido van Rossum if I could be the BDFL-delegate
> for PEP 471, and he agreed. I accept the latest version of the PEP:
>
>     http://legacy.python.org/dev/peps/pep-0471/

Thank you!

> The PEP also explicitly mentions that os.walk() will be modified to
> benefit from the new os.scandir() function.

Yes, it was a good suggestion to include that explicitly -- in fact,
speeding up os.walk() was my main goal initially.
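
For illustration, here is a rough sketch of why os.walk() benefits.
It assumes a Python where os.scandir() is in the standard library and
usable as a context manager (3.6+). The name walk_sketch is
hypothetical, and this is a simplification rather than the actual
os.walk() code: it does not handle errors, the topdown flag, or
symlinked directories the way os.walk() does.

```python
import os

def walk_sketch(top):
    """Simplified top-down walk built on os.scandir() (hypothetical helper)."""
    dirs, files = [], []
    with os.scandir(top) as it:
        for entry in it:
            # is_dir() can usually be answered from data the directory
            # listing already returned, so no extra stat() call per entry.
            if entry.is_dir(follow_symlinks=False):
                dirs.append(entry.name)
            else:
                files.append(entry.name)
    yield top, dirs, files
    for name in dirs:
        yield from walk_sketch(os.path.join(top, name))
```

The saving comes from not calling os.stat() on every entry just to
tell directories from files.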

> The PEP is accepted.

Superb. Could you please update the PEP with the Resolution and
BDFL-Delegate fields?

> It's time to review the implementation ;-) The current code can be found at:
>
>    https://github.com/benhoyt/scandir
>
> (I don't think that Ben already updated his implementation for the
> latest version of the PEP.)

I have actually updated my GitHub repo for the current PEP (I did this
last Saturday). However, there are still a few open issues; the main
one is that my scandir.py module doesn't handle the bytes/str thing
properly.
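
As a small illustration of the bytes/str behaviour in question: like
os.listdir(), scandir is meant to return bytes names when given a
bytes path and str names when given a str path. The sketch below uses
the standard-library os.scandir() (3.6+ for the with statement), not
the scandir.py module; entry_names is a hypothetical helper.

```python
import os

def entry_names(path):
    """Return sorted entry names; their type mirrors the type of `path`."""
    with os.scandir(path) as it:
        return sorted(entry.name for entry in it)

# entry_names(".")   gives str names
# entry_names(b".")  gives bytes names
```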

I intend to work on the CPython implementation over the next few
weeks. However, a couple of thoughts up-front:

I think if I were doing this from scratch I'd reimplement listdir() in
Python as "return [e.name for e in scandir(path)]". However, I'm not
sure this is a good idea, as I don't really want listdir() to suddenly
use more memory and perform slightly *worse* due to the extra DirEntry
object allocations.
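
Spelled out, that from-scratch version would look roughly like the
sketch below. It assumes os.scandir() from the standard library (3.6+
for the with statement); listdir_via_scandir is a hypothetical name,
not a proposed API.

```python
import os

def listdir_via_scandir(path="."):
    """listdir() expressed on top of scandir() (illustrative only)."""
    with os.scandir(path) as it:
        # One DirEntry object is allocated per entry and then discarded
        # once its name has been extracted.
        return [entry.name for entry in it]
```

Those throwaway DirEntry objects are the extra allocation cost
mentioned above.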

So my basic plan is to have an internal helper function in
posixmodule.c that yields either DirEntry objects or strings. Then
listdir() would simply be defined as something like "return
list(_scandir(path, yield_strings=True))", in C or in Python.

My reasoning is that then there'll be much less (if any) code
duplication between scandir() and listdir().
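
A pure-Python sketch of the shape of that plan follows. The helper
name _scandir and the yield_strings flag come from the description
above; everything else is an assumption for illustration. Note that
this version wraps os.scandir() and so still allocates a DirEntry per
entry; the point of doing it in C would be to skip that allocation
when only names are wanted.

```python
import os

def _scandir(path=".", yield_strings=False):
    """Hypothetical shared helper: yield names or DirEntry objects."""
    with os.scandir(path) as it:
        for entry in it:
            # A C implementation would build either a string or a
            # DirEntry here, never both; this sketch cannot avoid the
            # DirEntry because it sits on top of os.scandir().
            yield entry.name if yield_strings else entry

def scandir(path="."):
    # Public iterator of DirEntry objects.
    return _scandir(path)

def listdir(path="."):
    # Public list of names, sharing the same iteration code.
    return list(_scandir(path, yield_strings=True))
```

Both public functions then share one directory-iteration loop, which
is the code-duplication saving described above.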

Does this sound like a reasonable approach?

-Ben

