[Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info

Ben Hoyt benhoyt at gmail.com
Mon May 13 14:25:08 CEST 2013


Okay, I've renamed my "BetterWalk" module to "scandir" and updated it
as per our discussion:

https://github.com/benhoyt/scandir/#readme

It's not production-ready yet, and is basically still in the API and
performance testing stage. For instance, the underlying scandir_helper
functions don't even return iterators yet -- they're just glorified
versions of os.listdir() that return an additional d_ino/d_type
(Linux) or stat_result (Windows).

In any case, I really like the API (thanks mostly to Nick Coghlan),
and performance is great, even with DirEntry being written in Python.
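
To give a rough idea of the shape being discussed, here's a minimal
sketch of a DirEntry-style object that keeps whatever file-type info
came back with the directory listing and only falls back to lstat()
when it has to. This is not the code in the repo: the constructor
arguments and the fake_scandir() helper are made up for illustration,
and fake_scandir() just emulates the listing with os.listdir() plus
os.lstat(), whereas the real helper would get the type info from the
OS without the extra call.

```python
import os
import stat


class DirEntry(object):
    """Illustrative stand-in for a DirEntry object (not the real class)."""

    def __init__(self, dirpath, name, mode=None):
        self.dirpath = dirpath
        self.name = name
        # File-type bits that came back with the directory listing,
        # or None if the OS gave us nothing useful.
        self._mode = mode
        self._lstat = None

    def lstat(self):
        # Only touch the filesystem the first time someone asks.
        if self._lstat is None:
            self._lstat = os.lstat(os.path.join(self.dirpath, self.name))
        return self._lstat

    def isdir(self):
        # Prefer the type info we already have; fall back to lstat().
        mode = self._mode if self._mode is not None else self.lstat().st_mode
        return stat.S_ISDIR(mode)


def fake_scandir(path):
    """Hypothetical helper: emulates the listing with listdir + lstat."""
    for name in os.listdir(path):
        mode = os.lstat(os.path.join(path, name)).st_mode
        yield DirEntry(path, name, mode)
```

The point is that a caller such as walk() can ask entry.isdir() for
every entry without paying for a stat call each time, as long as the
listing already supplied the type.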

PERFORMANCE: On Windows I'm seeing that scandir.walk() on a large test
tree (see benchmark.py) is 8-9 times faster than os.walk(), and on
Linux it's 3-4 times faster. Yes, it is that much faster, and yes,
those numbers are real. :-)
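
If you want to check numbers like these on your own tree, a timing
harness along the following lines is one way to do it. This is only a
sketch and not the benchmark.py from the repo; time_walk() and its
repeat parameter are names invented here. It takes any walk-style
function with the same interface as os.walk() and reports the best of
a few runs.

```python
import os
import time


def time_walk(walk_func, top, repeat=3):
    """Return (best_seconds, entry_count) for walking `top` with walk_func.

    walk_func must yield (dirpath, dirnames, filenames) like os.walk().
    """
    best = None
    count = 0
    for _ in range(repeat):
        start = time.time()
        count = 0
        for dirpath, dirnames, filenames in walk_func(top):
            count += len(dirnames) + len(filenames)
        elapsed = time.time() - start
        # Keep the fastest run to reduce noise from other processes.
        if best is None or elapsed < best:
            best = elapsed
    return best, count
```

Call it once as time_walk(os.walk, top) and once as
time_walk(scandir.walk, top) on the same tree, then divide the two
timings. The first pass over a tree may be dominated by disk I/O, so
compare runs made with the cache in the same state.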

Please critique away. At this stage it'd be most helpful to critique
any API or performance-related issues rather than coding style or
minor bugs, as I expect the code itself will still change quite a
bit.

Todos:

* Make _scandir.scandir_helper functions return real iterators instead of lists
* Move building of DirEntry objects into C module, so basically the
  entire scandir() is in C
* Add tests

-Ben

